Date: 2014-05-17
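Documentation: http://htmlunit.sourceforge.net/
Dependency JARs and documentation downloads: http://sourceforge.net/projects/htmlunit/files/
Example: extract an image URL from an HTML page, where the image URL is assigned by JavaScript.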
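import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomNode;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public void getElements() throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page =
            webClient.getPage("http://localhost:8080/jsTest");
    // Look up the element with id "modePhoto" and list its child nodes.
    final HtmlDivision div = page.getHtmlElementById("modePhoto");
    DomNodeList<DomNode> dnl = div.getChildNodes();
    // Print the src attribute of the first child node.
    System.out.println(dnl.get(0).getAttributes().getNamedItem("src").getTextContent());
}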
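One thing to watch in the example above: `dnl.get(0)` assumes that the first child node of the div is the image element. If the markup has whitespace or a line break before the `<img>`, the first child is a text node, the `src` lookup finds nothing, and the chained call fails. HtmlUnit's `DomNode` implements the standard `org.w3c.dom.Node` interface, so the check can be sketched with the JDK's own DOM parser. This is only an illustration of the traversal: it does not execute JavaScript, and the class name `FirstChildImage`, the method `firstChildSrc`, and the sample markup are hypothetical, not taken from the original page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstChildImage {

    // Returns the "src" attribute of the first child ELEMENT of the <div>
    // whose id attribute equals the given id, or null if there is none.
    // Text nodes (for example whitespace before the <img>) are skipped.
    static String firstChildSrc(String xhtml, String id) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        NodeList divs = doc.getElementsByTagName("div");
        for (int i = 0; i < divs.getLength(); i++) {
            Element div = (Element) divs.item(i);
            if (!id.equals(div.getAttribute("id"))) {
                continue;
            }
            NodeList children = div.getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    Element element = (Element) child;
                    return element.hasAttribute("src")
                            ? element.getAttribute("src")
                            : null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical markup: note the line break before <img>.
        String xhtml = "<html><body><div id=\"modePhoto\">\n"
                + "  <img src=\"/images/a.png\"/>\n"
                + "</div></body></html>";
        System.out.println(firstChildSrc(xhtml, "modePhoto"));
    }
}
```

The same node-type check (`getNodeType() == Node.ELEMENT_NODE`) can be applied to the `DomNode` entries in `dnl` before reading `src`.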
The HTML to be scraped:
Using a specific browser version:
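import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public void homePage_INTERNET_EXPLORER() throws Exception {
    // Emulate Internet Explorer 8 rather than the default browser version.
    final WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_8);
    final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
    // assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());
}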
------ To be continued