How to parse HTML - Java Tutorial - 爱易网页


Date: 2014-05-20  Views: 20931

How do I parse HTML?
The content to parse (an HTTP request followed by an HTML document) is as follows:

GET www.baidu.com/pub/WWW/ HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, */*
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: www.baidu.com
Connection: Keep-Alive
Referer: http://www.ntop.org/
Pragma:no-cache
Content-length:244
<html>
<head>
<title> xinxi </title>
<result> </result>
</head>
<body>
<accNo> 12312 </accNO>
</body>
</html>
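
The sample above mixes HTTP header lines with markup, so one possible first step is to separate the two and then pull out the fields of interest. The following is a minimal sketch using only the JDK, not a full HTTP or HTML parser. The class name RawMessageSplitter and its methods are hypothetical. It assumes the markup starts at the first '<' character (the sample has no blank line between headers and body) and that each tag of interest occurs once without nesting.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RawMessageSplitter {

    /** Collects "Name: value" lines found between the request line and the markup. */
    static Map<String, String> headers(String raw) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] lines = raw.split("\\r?\\n");
        // Line 0 is the request line ("GET ... HTTP/1.1"), so start at 1.
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.trim().startsWith("<")) {
                break; // assumption: the markup begins here
            }
            int colon = line.indexOf(':');
            if (colon > 0) { // lines without a colon (e.g. wrapped values) are skipped
                result.put(line.substring(0, colon).trim(),
                           line.substring(colon + 1).trim());
            }
        }
        return result;
    }

    /** Returns everything from the first '<' to the end of the input. */
    static String body(String raw) {
        int start = raw.indexOf('<');
        return start < 0 ? "" : raw.substring(start);
    }

    /** Returns the trimmed text of the first <tag>...</tag> pair, ignoring case, or null. */
    static String textOf(String html, String tag) {
        String name = Pattern.quote(tag);
        Pattern p = Pattern.compile(
                "<" + name + "\\b[^>]*>(.*?)</" + name + "\\s*>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String raw = "GET www.baidu.com/pub/WWW/ HTTP/1.1\n"
                + "Host: www.baidu.com\n"
                + "Pragma:no-cache\n"
                + "<html>\n<head>\n<title> xinxi </title>\n</head>\n"
                + "<body>\n<accNo> 12312 </accNO>\n</body>\n</html>\n";
        String html = body(raw);
        System.out.println(headers(raw).get("Host")); // www.baidu.com
        System.out.println(textOf(html, "title"));    // xinxi
        System.out.println(textOf(html, "accNo"));    // 12312
    }
}
```

Matching tag names case-insensitively lets the sketch cope with the mismatched `<accNo>` / `</accNO>` pair in the sample. Regular expressions are only reasonable here because the input format is small and fixed; arbitrary HTML is better handled by a real parser such as the one used in the answer below.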

------Solution--------------------
This is grammar (syntax) analysis; it sounds like textbook material.
------Solution--------------------
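
To make the grammar-analysis suggestion concrete, the first stage of such an analysis is usually a scanner that cuts the input into tokens. The sketch below is an illustration under simplifying assumptions, not a complete HTML lexer: it ignores comments, scripts, and '>' characters inside attribute values. The class name TinyHtmlTokenizer is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHtmlTokenizer {

    /** Splits markup into tag tokens ("<...>") and non-blank text tokens. */
    static List<String> tokenize(String html) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < html.length()) {
            if (html.charAt(pos) == '<') {
                // Tag token: runs up to and including the next '>'.
                int end = html.indexOf('>', pos);
                if (end < 0) {
                    end = html.length() - 1; // unterminated tag: take the rest
                }
                tokens.add(html.substring(pos, end + 1));
                pos = end + 1;
            } else {
                // Text token: runs up to the next '<' or the end of input.
                int next = html.indexOf('<', pos);
                if (next < 0) {
                    next = html.length();
                }
                String text = html.substring(pos, next).trim();
                if (!text.isEmpty()) {
                    tokens.add(text); // whitespace-only text is dropped
                }
                pos = next;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String html = "<body>\n<accNo> 12312 </accNO>\n</body>";
        // Prints one token per line: <body>, <accNo>, 12312, </accNO>, </body>
        for (String token : tokenize(html)) {
            System.out.println(token);
        }
    }
}
```

A parser would then consume this token list and match opening tags against closing tags according to the grammar.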
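// Uses the Jericho HTML Parser under its older package name, au.id.jericho.lib.html.
import au.id.jericho.lib.html.*;
import java.util.*;
import java.io.*;
import java.net.*;

public class DisplayAllElements {
    public static void main(String[] args) throws Exception {
        String sourceUrlString = "c:\\test.html";
        // A colon at index 1 is a Windows drive letter, not a URL scheme, so such
        // strings (and strings with no colon at all) are treated as local file paths.
        URL sourceUrl = sourceUrlString.indexOf(':') <= 1
                ? new File(sourceUrlString).toURI().toURL()
                : new URL(sourceUrlString);

        // Read the whole document into a string and hand it to the parser.
        String htmlText = Util.getString(new InputStreamReader(sourceUrl.openStream()));
        Source source = new Source(htmlText);
        // Send log messages to stderr.
        source.setLogWriter(new OutputStreamWriter(System.err));

        // Print debug information and the source text of every element found.
        for (Iterator i = source.findAllElements().iterator(); i.hasNext(); ) {
            Element element = (Element) i.next();
            System.out.println(element.getDebugInfo());
            System.out.println(element);
        }
    }
}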
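
The path-to-URL and file-reading steps in the program above can also be done with the JDK alone, which avoids building the "file:" URL by string concatenation. This is a minimal sketch, assuming Java 9 or later (for InputStream.readAllBytes) and a UTF-8 encoded file; the class name HtmlFileReader is hypothetical, and the temporary file stands in for c:\test.html.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HtmlFileReader {

    /** Converts a local path to a file: URL without manual string handling. */
    static URL toUrl(Path path) throws IOException {
        return path.toAbsolutePath().toUri().toURL();
    }

    /** Reads everything behind the URL as text; UTF-8 is an assumption. */
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file so the example runs anywhere.
        Path temp = Files.createTempFile("test", ".html");
        try {
            Files.write(temp, "<title> xinxi </title>".getBytes(StandardCharsets.UTF_8));
            URL url = toUrl(temp);
            System.out.println(url.getProtocol()); // file
            System.out.println(readAll(url));      // <title> xinxi </title>
        } finally {
            Files.delete(temp);
        }
    }
}
```

The resulting string can be passed to the parser's Source constructor in the same way as htmlText above.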
