Date: 2014-05-17

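PHP reports an error when it reads an XML document that contains an entity reference.
For example: <name>aa&nbsp;bb</name>
When an entity such as this non-breaking space appears inside an element,
the following warning is raised: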
( ! ) Warning: DOMDocument::load() [domdocument.load]: Entity 'nbsp' not defined in file:

This is how I read the XML document:
$xml = new DOMDocument("1.0","UTF-8");
$xml->load('myxml.xml');

------Solution--------------------
Good that you tracked the problem down yourself. Since the entity is not allowed there, how do you plan to fix it?
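
One possible fix, as a minimal sketch: replace the undeclared named entity with its numeric character reference before parsing. The helper name loadXmlWithNbsp is hypothetical, and the approach assumes &nbsp; is the only undeclared entity in the input. For a file on disk, the string could come from file_get_contents('myxml.xml') instead of the literal used here.

```php
<?php
// Hypothetical helper (not part of PHP): swaps the undeclared named entity
// for its numeric character reference, which needs no DTD declaration.
function loadXmlWithNbsp(string $xmlString): DOMDocument
{
    $fixed = str_replace('&nbsp;', '&#160;', $xmlString);

    $doc = new DOMDocument('1.0', 'UTF-8');
    if (!$doc->loadXML($fixed)) {
        throw new RuntimeException('XML could not be parsed');
    }
    return $doc;
}

$doc  = loadXmlWithNbsp('<name>aa&nbsp;bb</name>');
$name = $doc->documentElement->textContent; // "aa", U+00A0, "bb"
```

A plain string replacement like this would also rewrite &nbsp; inside CDATA sections or comments, so it only suits simple documents.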
------Solution--------------------
FROM PHP Manual

When using loadXML() to parse a string that contains entity references (e.g., &nbsp;), be sure that those entity references are properly declared through the use of a DOCTYPE declaration; otherwise, loadXML() will not be able to interpret the string.

Example:
<?php
$str = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<div>This&nbsp;is a non-breaking space.</div>
XML;

$dd1 = new DOMDocument();
$dd1->loadXML($str);

echo $dd1->saveXML();
?>

Given the above code, PHP issues a Warning that the entity 'nbsp' is not defined. The parse fails, so the call to saveXML() returns nothing but a trimmed-down XML declaration; everything else is gone, all because of the undeclared entity.
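
To detect this failure in code instead of relying on the printed Warning, libxml's internal error buffer can be used. This is a minimal sketch; the variable names are illustrative only.

```php
<?php
// Collect libxml errors in a buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// loadXML() returns false here because &nbsp; is not declared.
$ok = $doc->loadXML('<div>This&nbsp;is a non-breaking space.</div>');

$messages = [];
foreach (libxml_get_errors() as $error) {
    // Each entry is a LibXMLError object; message ends with a newline.
    $messages[] = trim($error->message);
}
libxml_clear_errors();
```

Checking $ok before calling saveXML() avoids silently working with an empty document.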

Instead, explicitly declare the entity first:
<?php
$str = <<<XML
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE div [
<!ENTITY nbsp "&#160;">
]>
<div>This&nbsp;is a non-breaking space.</div>
XML;

$dd2 = new DOMDocument();
$dd2->loadXML($str);

echo $dd2->saveXML();
?>

Since the 'nbsp' entity is defined in the DOCTYPE, PHP no longer issues that Warning; the string is now well-formed, and loadXML() understands it perfectly.
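
To check that the declared entity really resolves to a non-breaking space, the document can be parsed with entity substitution turned on. This is a sketch under assumptions: it passes LIBXML_NOENT, which the quoted example does not use, and that flag should be avoided on untrusted input because it also expands external entities.

```php
<?php
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE div [
<!ENTITY nbsp "&#160;">
]>
<div>This&nbsp;is a non-breaking space.</div>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT replaces entity references with their declared value.
$loaded = $doc->loadXML($xml, LIBXML_NOENT);

// The DOM stores text as UTF-8, so the entity becomes U+00A0.
$text = $doc->documentElement->textContent;
```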

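You can also reference an external DTD in the same way (e.g., <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">), which is particularly useful if you need to do this for many different documents with many different possible entities. The DTD has to be an XML DTD, and libxml only reads the external subset when asked to, for example by passing LIBXML_DTDLOAD to loadXML().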

Also, as a sidenote...entity references created by createEntityReference() do not need this kind of explicit declaration.
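
A short sketch of that sidenote: building the same element through the DOM API with createEntityReference(). The element and text are taken from the example above; nothing here declares the entity.

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$div = $doc->createElement('div');
$doc->appendChild($div);

// Text, entity reference, text: the reference is a node of its own.
$div->appendChild($doc->createTextNode('This'));
$div->appendChild($doc->createEntityReference('nbsp'));
$div->appendChild($doc->createTextNode('is a non-breaking space.'));

$output = $doc->saveXML();
```

The serialized output still contains &nbsp; without a declaration, so parsing it again with loadXML() would raise the same Warning.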

------Solution--------------------
Congratulations, OP. ~~
Thanks, ZT_king
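------Solution--------------------
The three characters <, > and & must not appear literally in text content; write them as &lt;, &gt; and &amp;. (Strictly, > only has to be escaped in the sequence ]]>, but escaping it everywhere is customary.)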
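
A minimal sketch of two ways to get that escaping in PHP: htmlspecialchars() with ENT_XML1 when assembling XML as a string, or createTextNode() when building through the DOM, which escapes on serialization. The sample text is made up for illustration.

```php
<?php
$raw = 'a < b & c > d';

// Option 1: escape by hand when concatenating XML strings.
$escaped = htmlspecialchars($raw, ENT_XML1 | ENT_QUOTES, 'UTF-8');

// Option 2: let the DOM escape the text node when it is serialized.
$doc  = new DOMDocument('1.0', 'UTF-8');
$name = $doc->createElement('name');
$name->appendChild($doc->createTextNode($raw));
$doc->appendChild($name);

// Passing a node to saveXML() serializes only that node.
$serialized = $doc->saveXML($name);
```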