Date: 2014-05-20

Encoding problem when parsing XML with dom4j: org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence
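package com.person.test;

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class Test2 {

    private static final String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><PCIC-WD><RESULT-INFO WEBSVRRESULT=\"SUCCESS\" COMMENT=\"可以正常核销!\" KEEP=\"\"/></PCIC-WD>";

    public static void main(String[] args) throws DocumentException {
        // Reproduces the problem: getBytes() with no argument encodes with the platform
        // default charset, which may not be the UTF-8 that the XML declaration promises.
        InputStream inputStream = new ByteArrayInputStream(xml.getBytes());
        SAXReader saxReader = new SAXReader();
        saxReader.read(inputStream).getRootElement();
    }
}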

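The code above contains an XML string. In the real system it is returned by a web service, but for convenience I wrote it directly into the program. Running it throws the exception below. Why does this happen, and how can I fix it? Encoding issues have always been something I don't understand.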

Exception in thread "main" org.dom4j.DocumentException: Error on line 1 of document : Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)
at org.dom4j.io.SAXReader.read(SAXReader.java:343)
at com.person.test.Test2.main(Test2.java:18)
Nested exception: 
org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1810)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:343)
at com.person.test.Test2.main(Test2.java:18)


I look forward to your help, thanks!




------Solution--------------------
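The Chinese text is being parsed as UTF-8 when it is not actually UTF-8. UTF-8 itself handles Chinese fine. The most likely cause is that xml.getBytes() encodes the string with the platform default charset (typically GBK on a Chinese-locale system), while the XML declaration tells the parser to decode those bytes as UTF-8, so the first byte of the Chinese COMMENT value is not a valid UTF-8 lead byte.
After changing the declared encoding to "GB2312" the document parses normally, because the declaration then matches the bytes actually produced: <?xml version="1.0" encoding="GB2312"?>
A more portable fix is to keep the UTF-8 declaration and make the bytes match it with xml.getBytes(StandardCharsets.UTF_8), or to skip the byte conversion entirely and call saxReader.read(new StringReader(xml)).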
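
To make the declaration-versus-bytes mismatch concrete, here is a minimal sketch. It is not the original poster's code: the class EncodingDemo and its method names are hypothetical, and it uses the JDK's built-in DocumentBuilder instead of dom4j so that it runs without extra jars (the stack trace shows dom4j delegating to the same JDK Xerces parser). It assumes the failing machine's default charset was GBK. The Chinese text is written as Unicode escapes so the encoding of the source file itself cannot interfere.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, JDK parser only (an assumption made for portability).
class EncodingDemo {

    // Same document as in the question; COMMENT holds the same Chinese text as escapes.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><PCIC-WD><RESULT-INFO "
        + "WEBSVRRESULT=\"SUCCESS\" "
        + "COMMENT=\"\u53ef\u4ee5\u6b63\u5e38\u6838\u9500!\" KEEP=\"\"/></PCIC-WD>";

    // Parses the source and returns the root element name, or null if parsing fails.
    static String rootName(InputSource source) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(source).getDocumentElement().getNodeName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    // Encodes the string with the given charset, then lets the parser decode the
    // bytes according to the encoding="UTF-8" declaration inside the document.
    static String rootNameFromBytes(Charset charset) {
        byte[] bytes = XML.getBytes(charset);
        return rootName(new InputSource(new ByteArrayInputStream(bytes)));
    }

    // Hands the parser characters instead of bytes, so no byte decoding takes place.
    static String rootNameFromReader() {
        return rootName(new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) {
        // Bytes match the declaration.
        System.out.println(rootNameFromBytes(StandardCharsets.UTF_8));
        // Bytes do not match the declaration (simulates a GBK platform default).
        if (Charset.isSupported("GBK")) {
            System.out.println(rootNameFromBytes(Charset.forName("GBK")));
        }
        // Character input, so the byte encoding never comes into play.
        System.out.println(rootNameFromReader());
    }
}
```

In this sketch only the GBK call is expected to fail, which mirrors the reported exception. With dom4j the corresponding changes to the question's code would be passing StandardCharsets.UTF_8 to getBytes, or handing SAXReader.read a StringReader.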