Last updated: Sun, 25 Nov 2007


strip_tags

(PHP 4, PHP 5)

strip_tags — Strip HTML and PHP tags from a string

Description

string strip_tags ( string $str [, string $allowable_tags ] )

This function tries to return a string with all HTML and PHP tags stripped from a given str. It uses the same tag stripping state machine as the fgetss() function.

Parameters

str

The input string.

allowable_tags

You can use the optional second parameter to specify tags which should not be stripped.

Note: HTML comments and PHP tags are also stripped. This is hardcoded and cannot be changed with allowable_tags.

Return Values

Returns the stripped string.

Changelog

Version   Description
5.0.0     strip_tags() is now binary safe
4.3.0     HTML comments are now always stripped
4.0.0     The allowable_tags parameter was added

Examples

Example#1 strip_tags() example

<?php
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>

The above example will output:

Test paragraph. Other text
<p>Test paragraph.</p> <a href="#fragment">Other text</a>

Notes

Warning

Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.
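
The effect described in this warning can be seen with a short sketch. The sample strings are made up for illustration; the point is only that a "<" which looks like the start of a tag and is never closed takes the rest of the text with it.

```php
<?php
// A well-formed tag is removed and the surrounding text is kept.
$ok = strip_tags('a <b>bold</b> move');
echo $ok, "\n";

// "<10 only" looks like the start of a tag that never closes,
// so the text after the "<" is lost as well.
$lossy = strip_tags('price <10 only');
echo $lossy, "\n";
```

If the input may contain a literal "<" that is not markup, consider escaping it (for example with htmlspecialchars()) instead of relying on strip_tags().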

Warning

This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes that a mischievous user may abuse when posting text that will be shown to other users.
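
A minimal sketch of what this warning means in practice. The input string is invented for illustration; note how the event handler on the allowed tag survives, and how the text inside the removed <script> element is kept.

```php
<?php
$input = '<p onmouseover="alert(1)">hi</p><script>x()</script>';

// <p> is allowed, so it is passed through untouched, attributes included.
$out = strip_tags($input, '<p>');
echo $out, "\n";
```

Allowing any tag therefore calls for a separate attribute-filtering step before the result is shown to other users.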



User Contributed Notes
matt at lvi dot org
03-Jan-2009 01:56
I have a correction for mdw252's whitelist-based stripAttributes function.

Using that function, if a user sends one of the following strings to the server, you get some undesired output, with varying levels of severity:
1. XSS vulnerabilities:
  <div onload..;,;.."xss.attack();">, <a href="javascript:xss.attack();">, <div onclick=";xss.attack();">
2. some characters break the html:
  <div style="border:1px solid blue;">

I believe that the function below takes care of those issues and is a little more flexible by allowing some parameters.

By default, the function will strip all attributes.  So, you could sanitize a string this way:
$string=stripAttributes($string);

To achieve the same as mdw252's function, which is to allow the id and class attributes as well as href on links, you would:
$allowable = array('class','id');
$exceptions = array('a'=>'href');
$string=stripAttributes($string,$allowable,$exceptions);

Or, if you wanted to
  - allow the "class" and "style" attributes generally,
  - the "align" attribute only on table cells, and
  - specify possible values for the align attribute,
you would:

$allowable = array('class','style');
$exceptions = array('table'=>'width','td'=>'align');
$values = array('align'=>array('left','center','right'));
$string=stripAttributes($string,$allowable,$exceptions,$values);

<?php
function stripAttributes($string, $allowable = NULL, $exceptions = NULL, $values = NULL, $nohrefevents = true, $crs = NULL) {

    $string = str_replace('..;,;..', '=', $string);
    if (!$crs)
        $crs = 'a-zA-Z0-9 \>\<\-:;\(\)\.\,\/=\&';
    $string = preg_replace('/[^'.$crs.'\'"]/i', '', $string);
    $string = preg_replace('/(<.*) (.*=)(['.$crs.']*) (.*>)/', '${1} ${2}"${3}" ${4}', $string);
    if ($nohrefevents)
        $string = preg_replace('/(<a .* )href="(javascript:.*>)/', '${1}onclick=${2}', $string);

    // generally allowed attributes
    if (is_array($allowable)) {
        foreach ($allowable as $allowed)
            $string = preg_replace('/(<.* )'.$allowed.'=(.*>)/', '${1}'.$allowed.'..;,;..${2}', $string);
    }

    // tag by tag exceptions
    if (is_array($exceptions)) {
        foreach ($exceptions as $tag=>$attribute) {
            $string = preg_replace('/(<'.$tag.' ?.* )'.$attribute.'=(.*>)/', '${1}'.$attribute.'..;,;..${2}', $string);
        }
    }

    // specified attribute values
    if (is_array($values)) {
        foreach ($values as $attribute=>$value) {
            if (is_array($value)) {
                foreach ($value as $val)
                    while (preg_match('/(<.*) '.$attribute.'=(\'|")'.$val.'(\'|".*>)/', $string)) $string=preg_replace('/(<.*) '.$attribute.'=(\'|")'.$val.'(\'|".*>)/', '${1} '.$attribute.'..;,;..${2}'.$val.'${3}', $string);
            }
        }
    }

    while (preg_match('/(<.*) .*=(\'|")(['.$crs.']*)(\'|")(.*>)/', $string)) $string=preg_replace('/(<.*) .*=(\'|")(['.$crs.']*)(\'|")(.*>)/', '${1}${5}', $string);
    $string = str_replace('..;,;..', '=', $string);

    return $string;
}
?>
mariusz.tarnaski at wp dot pl
13-Nov-2008 12:05
Hi. I made a function that removes the HTML tags along with their contents:

Function:
<?php
function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}
?>

Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

Result for strip_tags($text):
sample text with tags

Result for strip_tags_content($text):
 text with

Result for strip_tags_content($text, '<b>'):
<b>sample</b> text with

Result for strip_tags_content($text, '<b>', TRUE);
 text with <div>tags</div>

I hope someone finds this useful :) A detailed explanation for Polish PHP programmers is at http://www.tarnaski.eu/blog/rozszerzone-strip_tags/
lucky760 at VideoSift dot com
21-Oct-2008 01:21
It's come to my attention that PHP's strip_tags has been doing something funky to some video embed codes that our members submit. I'm not sure the exact situation, but whenever there is a <param> tag that is very long, strip_tags() will completely remove the tag even though it's specified as an allowable tag.

Here's an example of the existing problem:
<?php
// a single very long <param> tag
$html =<<<EOF
<param name="flashVars" value="skin=http%3A//cdn-i.dmdentertainm
...
[snip]...
vie%20of%20All-Time"/>
EOF;

echo strip_tags($html, '<param>');
// this outputs an empty string
?>

This is the function I built to fix and extend the functionality of strip_tags(). The args are:
- $i_html - the HTML string to be parsed
- $i_allowedtags - an array of allowed tag names
- $i_trimtext - whether or not to strip all text outside of the allowed tags

<?php

function real_strip_tags($i_html, $i_allowedtags = array(), $i_trimtext = FALSE) {
  if (!is_array($i_allowedtags))
    $i_allowedtags = !empty($i_allowedtags) ? array($i_allowedtags) : array();
  $tags = implode('|', $i_allowedtags);

  if (empty($tags))
    $tags = '[a-z]+';

  preg_match_all('@</?\s*(' . $tags . ')(\s+[a-z_]+=(\'[^\']+\'|"[^"]+"))*\s*/?>@i', $i_html, $matches);

  $full_tags = $matches[0];
  $tag_names = $matches[1];

  foreach ($full_tags as $i => $full_tag) {
    if (!in_array($tag_names[$i], $i_allowedtags))
      if ($i_trimtext)
        unset($full_tags[$i]);
      else
        $i_html = str_replace($full_tag, '', $i_html);
  }

  return $i_trimtext ? implode('', $full_tags) : $i_html;
}
?>

And here's an example with a block of full video embed code with <object><embed><param> and some extraneous HTML:

<?php
$html = <<<EOF
<em><div><object type="application/x-shock
...
[snip]...
me.html">Wal-Mart Makes The Worst Movie of All-Time</a> -- powered by whatever</div></em>
EOF;

$good_html = real_strip_tags($html, array('object', 'embed', 'param'), TRUE);

?>

Now $good_html contains only the specified tags and none of the "powered by" type text. I hope someone finds this as useful as I needed it to be. :)
southsentry at yahoo dot com
26-Sep-2008 12:15
I was looking for a simple way to ban html from review posts, and the like. I have seen a few classes to do it. This line, while it doesn't strip the post, effectively blocks people from posting html in review and other forms.

<?php
if (strlen(strip_tags($review)) < strlen($review)) {
    return false;
}
?>

If you want to further get by the tricksters that use & for html links, include this:

<?php
if (strlen(strip_tags($review)) < strlen($review)) {
        return false;
} elseif (strpos($review, "&") !== false) {
        return 5;
}
?>

I hope this helps someone out!
valentin -DOT- moreira -AT- atapear.com
15-Sep-2008 11:43
The following function has a small error:

<?php
function html2txt($document){
$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
               '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags
               '@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
               '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA
);
$text = preg_replace($search, '', $document);
return $text;
}
?>

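It's a great function and works fine, but it doesn't remove inline <style> code: the patterns are applied in order, so the generic tag pattern strips the <style> tags themselves before the style pattern runs, leaving the CSS text behind.

The function only works correctly once the regexp order is changed to this: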

<?php
function html2txt($document){
$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
               '@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
               '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags
               '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA
);
$text = preg_replace($search, '', $document);
return $text;
}
?>
mdw252 at psu dot edu
14-Sep-2008 02:58
I think I may have come up with a pretty simple white-list based approach to attribute management. It's kind of hack-ish, but it's been pretty resilient with everything I've thrown at it. Check it out:

function stripAttributes($string)
{
        $string=preg_replace('/(<a.*)href=(.*>)/', '${1}href..;,;..${2}', $string);
        $string=preg_replace('/(<.*)id=(.*>)/', '${1}id..;,;..${2}', $string);
        $string=preg_replace('/(<.*)class=(.*>)/', '${1}class..;,;..${2}', $string);
        while(preg_match('/(<.*) .*=(\'|"|\w)\w*(\'|"|\w)(.*>)/', $string)) $string=preg_replace('/(<.*) .*=(\'|"|\w)\w*(\'|"|\w)(.*>)/', '${1}${4}', $string);
        $string=str_replace('..;,;..', '=', $string);
        return $string;
}

As you can see, I have it set to only allow href (in <a> tags), id, and class attributes, and everything else will be deleted. It should be pretty self-explanatory to customize it to your own purposes.
Liam Morland
24-Aug-2008 08:58
Here is a suggestion for getting rid of attributes: After you run your HTML through strip_tags(), use the DOM interface to parse the HTML. Recursively walk through the DOM tree and remove any unwanted attributes. Serialize the DOM back to the HTML string.

Don't make the default permit mistake: Make a list of the attributes you want to ALLOW and remove any others, rather than removing a specific list, which may be missing something important.
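
The approach described above could be sketched roughly as follows, using PHP's DOM extension. This is only an illustration under stated assumptions: remove_unlisted_attributes() and its parameter names are hypothetical (not part of PHP), character encoding handling is left out, and attribute values such as javascript: URLs in href are not checked here.

```php
<?php
// Hypothetical helper (not part of PHP): remove every attribute that is
// not on the allow-list from an HTML fragment.
function remove_unlisted_attributes($html, array $allowedAttributes)
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> lets us find and serialize just our fragment.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $wrapper = $doc->getElementsByTagName('div')->item(0);

    foreach ($wrapper->getElementsByTagName('*') as $element) {
        // Copy the names first, then remove, so the attribute map is
        // not modified while it is being iterated.
        $names = array();
        foreach ($element->attributes as $attribute) {
            $names[] = $attribute->nodeName;
        }
        foreach ($names as $name) {
            if (!in_array(strtolower($name), $allowedAttributes, true)) {
                $element->removeAttribute($name);
            }
        }
    }

    // Serialize only the children of the wrapper back to a string.
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

$dirty = '<p class="x" onclick="evil()">Hi <a href="/a" style="color:red">link</a></p>';
$clean = remove_unlisted_attributes(strip_tags($dirty, '<p><a>'), array('class', 'href'));
echo $clean, "\n";
```

Because the list names what is allowed, an attribute nobody thought of is removed by default rather than let through.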
Logic
16-Jun-2008 06:58
Remember sometimes with regex it's easier to list what you want to keep instead of everything you do not want. Not to mention this makes it easier on the server. With html attributes there may only be a select few you would want to keep.

Rather than an array that says
$trash = array('a','b','d','f','g');

You could use
$keep = array('c','e');

Simply remember your ^not operator in your final regex.
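
One way to read this advice, sketched under assumptions: inside a character class "^" negates single characters, while for whole attribute names a negative lookahead does the equivalent job. The helper keep_attributes() below is hypothetical and works on a single tag string; like any regex approach it is a sketch, not a complete HTML parser.

```php
<?php
// Hypothetical helper: remove every attribute from one tag string
// except the names listed in $keep.
function keep_attributes($tag, array $keep)
{
    $quoted = array();
    foreach ($keep as $name) {
        $quoted[] = preg_quote($name, '/');
    }

    // (?!...) rejects a match when the attribute name is on the keep list,
    // so only the attributes that are NOT listed get removed.
    $pattern = '/\s+(?!(?:' . implode('|', $quoted) . ')\s*=)'
             . '[a-z][a-z0-9_:-]*\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)/i';

    return preg_replace($pattern, '', $tag);
}

$kept = keep_attributes('<td class="c" align="left" onclick="x()">', array('class', 'align'));
echo $kept, "\n";
```

The keep list stays short and does not need updating each time a new unwanted attribute is discovered.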
razonklnbd at hotmail dot com
10-Jun-2008 05:45
strip_tags() on its own did not do what I needed for a complete HTML page: it removes the tags but leaves the text from the page header (for example the <title>) in the result. This function works as follows:
1. Check whether the string contains a <body> tag.
2. If it does, keep only the body content and discard everything else (CSS, JS and so on).
3. Run strip_tags() on what is left.

It's a small but handy function, so I'd like to share it.

Function Definition:

function extractBodyText($p_str, $p_allowedtag=NULL){
    $fstr=(preg_match('/<body[^>]*>(.*?)<\/body>/si', $p_str, $regs)?$regs[1]:$p_str);
    $rtrn=(isset($p_allowedtag)?strip_tags($fstr, $p_allowedtag):strip_tags($fstr));
    return $rtrn;
}

Example:

$str01='
     <dd>

      <p class="para">
       You can use the optional second parameter to specify tags which should
       not be stripped.
      </p>
      <blockquote><p><b class="note">Note</b>:
      
        HTML comments and PHP tags are also stripped. This is hardcoded and
        can not be changed with <i><tt class="parameter">allowable_tags</tt></i>.
       <br />
      </p></blockquote>

     </dd>
';
$str='<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body>
'.$str01.'
</body>
</html>
';
echo('<div>'.extractBodyText($str, '<p>').'</div>');
echo('<div>'.extractBodyText($str01).'</div>');
echo('<div>'.strip_tags($str, '<p>').'</div>');

Result:
<div>

    

      <p class="para">
       You can use the optional second parameter to specify tags which should
       not be stripped.
      </p>
      <p>Note:
      
        HTML comments and PHP tags are also stripped. This is hardcoded and
        can not be changed with allowable_tags.
      
      </p>

    

</div><div>
    

     
       You can use the optional second parameter to specify tags which should
       not be stripped.
     
      Note:
      
        HTML comments and PHP tags are also stripped. This is hardcoded and
        can not be changed with allowable_tags.
      
     

    
</div><div>

Untitled Document

    

      <p class="para">
       You can use the optional second parameter to specify tags which should
       not be stripped.
      </p>
      <p>Note:
      
        HTML comments and PHP tags are also stripped. This is hardcoded and
        can not be changed with allowable_tags.
      
      </p>

    

</div>
ZlobnyNigga
22-May-2008 12:14
Attempts to write a strip_tags_attributes function look like an endless loop: find vulnerabilities in the function, patch them, find more vulnerabilities, patch again...
I decided to use the HTML_Safe package from http://pear.php.net/package/HTML_Safe
It works fine but, of course, it is slower than the functions written below. You decide =)
Massoud Abbagash
08-May-2008 06:56
Danno, your script has a flaw.

Try this :

<?php

function strip_tags_keep_links($sSource)
    {
        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/\b((?![hH][rR][eE][fF]\b)\w+)[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource,'<a>'));
    }

$source = "<a href=javascript:alert('doesn\'t&nbsp;work!') title=\"move your mouse here\" href=http://www.a_web_site.org onmouseover\n=\nalert(\"doesn\'t&nbsp;work!\")  onmouseover='alert(\"doesn\'t&nbsp;work!\")' alt=\"move your mouse here\" > test</a>";

$result=strip_tags_keep_links($source);

echo($result);

?>
Massoud Abbagash
07-May-2008 06:26
There is still a flaw in your function.
Look at this, the [onmouseover] sample script below remains. Even after the treatment with the function [strip_tags_attributes].

<?php

function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavaible', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmoveout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }

$source="<big onmouseover=alert('Hello!')>Move your mouse here (this doesn't work with [ strip_tags_attributes ])</big>";
$striped_source=strip_tags_attributes($source,array('<big>'));
echo($striped_source);

?>

Now, this is my correction:

<?php

function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavaible', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmoveout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/\s(' . implode('|', $aDisabledAttributes) . ').*?([\s\>])/', '\\2', preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags))) );
    }

$source="<big onmouseover=alert('Hello!')>Move your mouse here (this work with [ strip_tags_attributes corrected ])</big>";
$striped_source=strip_tags_attributes($source,array('<big>'));
echo($striped_source);

?>
bluej100@gmail
03-May-2008 03:31
Allowing user HTML while preventing XSS is non-trivial. Don't just try to hack together a regexp for it; at very least, check your solution against all of the ha.ckers.org exploit examples:

http://ha.ckers.org/xss.html

Really, though, you should be using a solid library that recognizes tags, attributes, and styles from a whitelist and rebuilds the markup from scratch. HTMLPurifier has a "linkify" option that does what you're looking for.

http://htmlpurifier.org
LK
20-Apr-2008 07:30
Concerning all of the notes about which attributes to include in strip_tags_attributes(), the latest of which is by Kalle Sommer Nielsen:
Correct me if I'm wrong, but isn't it a lot easier to simply reject any attribute that starts with "on"? Thus, the whole array of various javascript attributes could be replaced with "on\w+".
I am not aware of any non-javascript attributes that start with these two letters, and if there were, it would be easier to make an exception for them than for the countless JS attributes.
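
The idea of rejecting every attribute that starts with "on" might look like the sketch below. strip_on_attributes() is a hypothetical name, and the patterns are assumptions chosen for illustration: they handle quoted and unquoted values and whitespace around "=", but a regex like this can still be fooled (for example by a ">" inside an attribute value), so it is not a complete XSS defense.

```php
<?php
// Hypothetical helper: drop every attribute whose name starts with "on".
function strip_on_attributes($html)
{
    return preg_replace_callback(
        '/<[a-z][^>]*>/i',               // each opening tag
        function ($m) {
            // Inside the tag, remove on...=value with a double-quoted,
            // single-quoted or unquoted value.
            return preg_replace(
                '/\s+on\w+\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)/i',
                '',
                $m[0]
            );
        },
        $html
    );
}

echo strip_on_attributes("<big onmouseover=alert('Hello!')>Move your mouse here</big>"), "\n";
echo strip_on_attributes("<p onmouseover = 'alert(1);'>123</p>"), "\n";
```

Both sample inputs are taken from other notes on this page where list-based versions of strip_tags_attributes() let the handler through.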
Danno
09-Apr-2008 04:20
Hi everyone,

I came across this thread looking for a way to strip out all tags but links and leaving only the HREF attribute. I took what you guys have worked on and made it allow only the HREF attribute. This way even if the spec changes you are sure to not let any javascript sneak in, who knows what the future will bring :P . So I think it's pretty tight, take a look at it and modify if you see any holes.

<?php

function strip_tags_keep_links($sSource)
    {
        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/\b((?![hH][rR][eE][fF]\b)\w+)[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource,'<a>'));
    }

?>
Kalle Sommer Nielsen
31-Mar-2008 06:05
This adds a lot of missing JavaScript events to the strip_tags_attributes() function from the entries below.

Props to MSDN for lots of them ;)

<?php
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavaible', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmoveout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>
sych
13-Mar-2008 11:26
brian, this solution is not good, because there are events that you will forget anyway. For example, with this code you are vulnerable to the "onMouseEnter" attribute and tons of others that actually exist in JavaScript specs.
brian at diamondsea dot com
04-Mar-2008 03:47
An update to agolna's update to sbritton's function:

Adds additional javascript events to the aDisabledAttributes array.

<?php
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onabort', 'onblur', 'onchange', 'onclick', 'ondblclick', 'onerror', 'onfocus', 'onkeydown', 'onkeyup', 'onload', 'onmousedown', 'onmousemove', 'onmouseover', 'onmouseup', 'onreset', 'onresize', 'onselect', 'onsubmit', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>
agolna at gmail dot com
29-Feb-2008 11:37
An update to sbritton's function:

If you have whitespace between the = sign and the attribute, it would bypass the regex.  This updates that.

<?php
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onclick', 'ondblclick', 'onkeydown', 'onkeypress', 'onkeyup', 'onload', 'onmousedown', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>
ZlobnyNigga
22-Feb-2008 12:22
sbritton's function is not so good...
<?php
$str = "<p onmouseover = 'alert(1);'>123</p>";
echo strip_tags_attributes($str);
?>
sbritton
05-Feb-2008 02:35
The function below corrects a typo in y5's function to strip tags and attributes - it also adds lithium1330's recommended 's' parameter:

<?php
   
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onclick', 'ondblclick', 'onkeydown', 'onkeypress', 'onkeyup', 'onload', 'onmousedown', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onunload'))
    {
        if (empty($aDisabledAttributes)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")=[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>
lithium1330[(at)]msn.com
25-Jan-2008 08:02
Please note: in the code given by y5, Tony Freeman, tREXX [www.trexx.ch] and maybe others, you need to use the modifier "s" at the end of the preg_replace()'s regex (/ies) in order to strip attributes that have a line break before them, otherwise those attributes won't be stripped.
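
A small sketch of why the "s" modifier matters here. It uses preg_match() on the tag-matching part of the pattern only, leaving out the "e" modifier (which was removed in PHP 7); the sample tag is made up for illustration.

```php
<?php
// A tag whose attribute starts on a new line.
$tag = "<p\nonclick=\"evil()\">";

// Without "s", "." does not match the newline, so the tag is never matched
// and its attributes are never inspected.
$without = preg_match('/<(.*?)>/i', $tag);

// With "s", "." also matches newlines and the whole tag is captured.
$with = preg_match('/<(.*?)>/is', $tag);

var_dump($without, $with);
```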
bstrick at gmail dot com
16-Jan-2008 01:52
This will strip all PHP and HTML out of a file.  Leaves only plain txt.

// Open the search file
$file = fopen($filename, 'r');
               
// Get rid of all PHP code.       
$search = array('/<\?((?!\?>).)*\?>/s');
       
$text = fread($file, filesize($filename));

$new = strip_tags(preg_replace($search, '', $text));

echo $new;

fclose($file);

- Strick
y5
16-Jan-2008 12:59
An improved version of tREXX and Tony Freeman's code, this keeps the code clean while removing unwanted attributes, including the javascript: protocol. Unlike the built-in strip_tags() function, this takes an array for allowed tags, rather than a string. For example: array('<a>', '<object>');

I don't understand why the built-in function uses a string.. oh well =)

<?php
   
function strip_tags_attributes($sSource, $aAllowedTags = array(), $aDisabledAttributes = array('onclick', 'ondblclick', 'onkeydown', 'onkeypress', 'onkeyup', 'onload', 'onmousedown', 'onmousemove', 'onmouseout', 'onmouseover', 'onmouseup', 'onunload'))
    {
        if (empty($aDisabledEvents)) return strip_tags($sSource, implode('', $aAllowedTags));

        return preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . implode('|', $aDisabledAttributes) . ")=[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", strip_tags($sSource, implode('', $aAllowedTags)));
    }
?>
Matthieu Larcher
27-Jun-2007 11:44
I noticed some problems with the strip_selected_tags() function below: sometimes big chunks of content were removed...
Here is a modified version that should run better.

<?php
function strip_selected_tags($text, $tags = array())
{
    $args = func_get_args();
    $text = array_shift($args);
    $tags = func_num_args() > 2 ? array_diff($args,array($text))  : (array)$tags;
    foreach ($tags as $tag){
        while(preg_match('/<'.$tag.'(|\W[^>]*)>(.*)<\/'. $tag .'>/iusU', $text, $found)){
            $text = str_replace($found[0],$found[2],$text);
        }
    }

    return preg_replace('/(<('.join('|',$tags).')(|\W.*)\/>)/iusU', '', $text);
}

?>
birwin at suddensales dot com
23-Jun-2007 03:18
This is an upgrade to the illegal characters script by robt. This script will handle the input even if one or all of the fields include arrays. Of course another loop could be added to handle compound arrays within arrays, but if you are savvy enough to be using compound arrays, you don't need me to rewrite the program.
<?php
function screenForm($ary_check_for_html)
{
    // check array - reject if any content contains HTML.
    foreach ($ary_check_for_html as $field_value)
    {
        if (is_array($field_value))
        {
            foreach ($field_value as $field_array) // if the field value is an array, step through it
            {
                $stripped = strip_tags($field_array);
                if ($field_array != $stripped)
                {
                    // something in the field value was HTML
                    return false;
                }
            }
        } else {
            $stripped = strip_tags($field_value);
            if ($field_value != $stripped)
            {
                // something in the field value was HTML
                return false;
            }
        }
    }
    return true;
}
?>
geersc at hotmail dot com
12-May-2007 06:13
Hi,

I made the following adjustments to the "stripeentag()" function listed here.

Improvements are always welcome.

Regards,

Chris

<?php

function strip_attributes($msg, $tag, $attr, $suffix = "")
{
  $lengthfirst = 0;
  while (strstr(substr($msg, $lengthfirst), "<$tag ") != "")
  {
    $tag_start = $lengthfirst + strpos(substr($msg, $lengthfirst), "<$tag ");

    $partafterwith = substr($msg, $tag_start);

    $img = substr($partafterwith, 0, strpos($partafterwith, ">") + 1);
    $img = str_replace(" =", "=", $img);

    $out = "<$tag";
    for ($i = 0; $i < count($attr); $i++)
    {
      if (empty($attr[$i])) {
        continue;
      }
      $long_val =
        (strpos($img, " ", strpos($img, $attr[$i] . "=")) === FALSE) ?
        strpos($img, ">", strpos($img, $attr[$i] . "=")) - (strpos($img, $attr[$i] . "=") + strlen($attr[$i]) + 1) :
        strpos($img, " ", strpos($img, $attr[$i] . "=")) - (strpos($img, $attr[$i] . "=") + strlen($attr[$i]) + 1);
      $val = substr($img, strpos($img, $attr[$i] . "=" ) + strlen($attr[$i]) + 1, $long_val);
      if (!empty($val)) {
        $out .= " " . $attr[$i] . "=" . $val;
      }
    }
    if (!empty($suffix)) {
      $out .= " " . $suffix;
    }

    $out .= ">";
    $partafter = substr($partafterwith, strpos($partafterwith,">") + 1);
    $msg = substr($msg, 0, $tag_start). $out. $partafter;
    $lengthfirst = $tag_start + 3;
  }
  return $msg;
}

?>
lucky760 at yahoo dot com
23-Feb-2007 12:52
I needed a way to allow user comments to contain only hyperlinks as the only allowed HTML tags. This is easy enough to accomplish, but I also needed a way to convert full URLs into hyperlinks, and this complicated things a bit.

The functions below are not very elegant, but do the job. Function strip_tags_except() works similarly to the strip_selected_tags() function defined a few times on this page, but instead of allowing the user to specify the tags to strip, she can specify the tags to allow and strip all others. The third parameter, $strip, when TRUE removes "<" and ">" from the string and when FALSE converts them to "&lt;" and "&gt;" respectively.

Function url_to_link() simply converts full URLs into an equivalent hyperlink taking into consideration that users may end a URL with a character that's not actually part of the address.

When using both, url_to_link() should be called before strip_tags_except(). Here's an example as we are using it on http://www.VideoSift.com:
<?php
$summary = url_to_link($summary);
$summary = strip_tags_except($summary, array('a'), FALSE);
?>
Here are the function definitions:
<?php
function strip_tags_except($text, $allowed_tags, $strip=TRUE) {
  if (!is_array($allowed_tags))
    return $text;

  if (!count($allowed_tags))
    return $text;

  $open = $strip ? '' : '&lt;';
  $close = $strip ? '' : '&gt;';

  preg_match_all('!<\s*(/)?\s*([a-zA-Z]+)[^>]*>!',
    $text, $all_tags);
  array_shift($all_tags);
  $slashes = $all_tags[0];
  $all_tags = $all_tags[1];
  foreach ($all_tags as $i => $tag) {
    if (in_array($tag, $allowed_tags))
      continue;
    $text =
      preg_replace('!<(\s*' . $slashes[$i] . '\s*' .
        $tag . '[^>]*)>!', $open . '$1' . $close,
        $text);
  }

  return $text;
}

function url_to_link($text) {
  $text =
    preg_replace('!(^|([^\'"]\s*))' .
      '([hf][tps]{2,4}:\/\/[^\s<>"\'()]{4,})!mi',
      '$2<a href="$3">$3</a>', $text);
  $text =
    preg_replace('!<a href="([^"]+)[\.:,\]]">!',
    '<a href="$1">', $text);
  $text = preg_replace('!([\.:,\]])</a>!', '</a>$1',
    $text);
  return $text;
}
?>
rodt
16-Jan-2007 11:46
I have used this function successfully to prevent bots inserting HTML to web forms. Put the fields' contents into an array, then feed array to this function as an argument. Returns false if HTML is included; true if there is no HTML in any of the array's values. Hope it's helpful to someone.

    /*
    Checks that there is no HTML in any of provided fields.
    
     $ary_check_for_html = Array to check for HTML content.
    */
    function screenForm($ary_check_for_html){
        // check array - reject if any content contains HTML.
        foreach($ary_check_for_html as $field_value) {       
            $stripped = strip_tags($field_value);