Date: 2014-05-17

The difference between htmlentities and htmlspecialchars (repost)

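Both functions convert characters into HTML entities, which matters most for URLs and code snippets, so that the browser displays markup characters instead of interpreting them. With English text neither function causes trouble, but with Chinese text htmlentities() produces garbled output unless it is told the string's actual character set.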

The difference: htmlentities converts every character that has a named HTML entity, while htmlspecialchars converts only five characters: &, ", ', < and >.
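A minimal sketch of that difference, assuming the input is UTF-8 and passing the flags and charset explicitly so the result does not depend on the defaults of a particular PHP version. The sample string is made up for illustration.

```php
<?php
// Hypothetical sample (this file is assumed to be saved as UTF-8): markup,
// an ampersand, and non-ASCII characters that have named HTML entities.
$sample = '<b>café & crème</b> ©';

// htmlspecialchars() touches only the markup-significant characters.
$special = htmlspecialchars($sample, ENT_QUOTES, 'UTF-8');

// htmlentities() also replaces every character that has a named entity.
$entities = htmlentities($sample, ENT_QUOTES, 'UTF-8');

// Chinese characters have no named entities, so with the right charset
// htmlentities() leaves them alone.
$chinese = htmlentities('中文', ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // &lt;b&gt;café &amp; crème&lt;/b&gt; ©
echo $entities, "\n";  // &lt;b&gt;caf&eacute; &amp; cr&egrave;me&lt;/b&gt; &copy;
echo $chinese, "\n";   // 中文
```

For plain escaping of user input in a UTF-8 page, htmlspecialchars() is usually enough; htmlentities() is mainly useful when the output has to stay pure ASCII.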


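<?php
// Save this file as GB2312 so that the charset passed below matches the string's bytes.
$str = '<a href="demo.php?m=index&a=index&name=中文">测试页面</a>';

// Charset given explicitly: the Chinese characters pass through intact.
echo 'htmlentities with GB2312 specified: '.htmlentities($str, ENT_COMPAT, "GB2312").'<br/>';

// No charset given: before PHP 5.4 this means ISO-8859-1, which garbles the Chinese.
echo 'htmlentities with no charset specified: '.htmlentities($str).'<br/>';

echo htmlspecialchars($str).'<br/>';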

Rendered result:

htmlentities with GB2312 specified: <a href="demo.php?m=index&a=index&name=中文">测试页面</a>

htmlentities with no charset specified: <a href="demo.php?m=index&a=index&name=ÖÐÎÄ">²âÊÔÒ³Ãæ</a>

<a href="demo.php?m=index&a=index&name=中文">测试页面</a>



Page source:
htmlentities with GB2312 specified: &lt;a href=&quot;demo.php?m=index&amp;a=index&amp;name=中文&quot;&gt;测试页面&lt;/a&gt;<br/>
htmlentities with no charset specified: &lt;a href=&quot;demo.php?m=index&amp;a=index&amp;name=&Ouml;&ETH;&Icirc;&Auml;&quot;&gt;&sup2;&acirc;&Ecirc;&Ocirc;&Ograve;&sup3;&Atilde;&aelig;&lt;/a&gt;<br/>
&lt;a href=&quot;demo.php?m=index&amp;a=index&amp;name=中文&quot;&gt;测试页面&lt;/a&gt;<br/>
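The entity names in the garbled line can be reproduced byte by byte. The sketch below assumes the GB2312 byte values of 中文 (0xD6 0xD0 0xCE 0xC4) and passes the charset explicitly; ISO-8859-1 was the default charset of these functions before PHP 5.4, which is what "no charset specified" meant when this was written.

```php
<?php
// "中文" encoded as GB2312: two bytes per character.
$gb2312Bytes = "\xD6\xD0\xCE\xC4";

// Read as ISO-8859-1, every byte is a separate Latin-1 letter (Ö Ð Î Ä),
// and each of those letters has a named entity.
$asLatin1 = htmlentities($gb2312Bytes, ENT_COMPAT, 'ISO-8859-1');

// Read as GB2312, the byte pairs are recognized as whole characters
// and passed through unchanged.
$asGb2312 = htmlentities($gb2312Bytes, ENT_COMPAT, 'GB2312');

echo $asLatin1, "\n";           // &Ouml;&ETH;&Icirc;&Auml;
echo bin2hex($asGb2312), "\n";  // d6d0cec4
```

The same reasoning applies to any multi-byte encoding: the charset argument has to name the encoding the string is actually stored in.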

Syntax:

string htmlentities ( string string [, int quote_style [, string charset]] )

string Required. Specifies the string to convert.
quote_style Optional. Specifies how to encode single and double quotes.

The available quote styles are:

ENT_COMPAT – Default before PHP 8.1. Encodes only double quotes
ENT_QUOTES – Encodes double and single quotes
ENT_NOQUOTES – Does not encode any quotes
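As a quick illustration of the three quote styles, the sketch below runs the same made-up input through htmlentities() with each flag. The charset is passed as UTF-8 here as an assumption; for this ASCII-only input it makes no difference.

```php
<?php
// Hypothetical input containing both kinds of quotes.
$input = 'say "hi" & it\'s <ok>';

$compat   = htmlentities($input, ENT_COMPAT, 'UTF-8');    // double quotes only
$quotes   = htmlentities($input, ENT_QUOTES, 'UTF-8');    // double and single
$noquotes = htmlentities($input, ENT_NOQUOTES, 'UTF-8');  // neither

echo $compat, "\n";    // say &quot;hi&quot; &amp; it's &lt;ok&gt;
echo $quotes, "\n";    // say &quot;hi&quot; &amp; it&#039;s &lt;ok&gt;
echo $noquotes, "\n";  // say "hi" &amp; it's &lt;ok&gt;
```

ENT_QUOTES is the safer choice whenever the result may end up inside an HTML attribute delimited by single quotes.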

charset Optional. A string that specifies which character set to use.

Allowed values are:

ISO-8859-1 – Default before PHP 5.4. Western European
ISO-8859-15 – Western European (adds the Euro sign + French and Finnish letters missing in ISO-8859-1)
UTF-8 – ASCII compatible multi-byte 8-bit Unicode
cp866 – DOS-specific Cyrillic charset
cp1251 – Windows-specific Cyrillic charset
cp1252 – Windows-specific charset for Western European languages
KOI8-R – Russian
BIG5 – Traditional Chinese, mainly used in Taiwan
GB2312 – Simplified Chinese, national standard character set, mainly used in mainland China
BIG5-HKSCS – Big5 with Hong Kong extensions
Shift_JIS – Japanese
EUC-JP – Japanese

string htmlspecialchars ( string string [, int quote_style [, string charset]] )

The translations performed are:

'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
"'" (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
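The list above can be checked against PHP itself. This sketch uses get_html_translation_table() to fetch the table each function works from; passing ENT_QUOTES and UTF-8 is an assumption made here so that the single quote is included and the keys are UTF-8 characters.

```php
<?php
// Table used by htmlspecialchars() when ENT_QUOTES is set.
$specialTable = get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES, 'UTF-8');

// Table used by htmlentities(): the same five entries plus every
// character that has a named entity.
$entityTable = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');

// Print each character next to its replacement.
foreach ($specialTable as $char => $entity) {
    echo $char, ' => ', $entity, "\n";
}

echo count($specialTable), "\n";  // 5
echo count($entityTable) > count($specialTable) ? "larger\n" : "same\n";  // larger
```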

The quote_style and charset parameters work the same way as for htmlentities above.