Date: 2014-05-20

A question about reading HTML with JDOM (it's late at night, please help)
My goal is to read <tr><td>something<td/><td>something<td/><td>something<td/></tr>
This is what I wrote myself:
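import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// JDOM 1.x package names are assumed here; with JDOM 2 use org.jdom2.* instead.
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class HtmlDaoImpl {

    public void readHtml() {
        Map<String, Object> htmls = new HashMap<String, Object>();
        SAXBuilder sb = new SAXBuilder();
        try {
            Document doc = sb.build(this.getClass().getClassLoader().getResourceAsStream("TYPE.html"));
            Element root = doc.getRootElement();
            List list = root.getChildren("tr");
            for (int i = 0; i < list.size(); i++) {
                Element element = (Element) list.get(i);
                String name = element.getChildText("td");
                System.out.println(name);
            }
        } catch (JDOMException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        HtmlDaoImpl hdm = new HtmlDaoImpl();
        hdm.readHtml();
    }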
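}
How do I use JDOM to get the contents of all three td nodes under a tr? I only ever get the first one and can never get the second or third. I rarely use JDOM, so I basically don't understand it... I'm staying up late on this, so could someone please help and write a demo?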

------Solution--------------------
jsoup is more convenient for this. http://jsoup.org/
http://jsoup.org/cookbook/extracting-data/selector-syntax

URL url = 
Elements tds = Jsoup.parse(url,1000).select("td");
for(Element td : tds){ String text = td.text();}
------Solution--------------------
Java code

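First, note that the closing td tags in the OP's markup are wrong: <td/> should be </td>. Otherwise the parse fails with an error before you can even read the root element.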
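To see the tag problem in isolation, here is a minimal sketch that only checks well-formedness. It uses the JDK's built-in XML parser rather than JDOM (my substitution, so it runs without any extra jars), and the class name WellFormedCheck, the method isWellFormed, and the sample strings are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the markup
        } catch (Exception e) {
            // I/O or parser configuration problems are not expected for an in-memory string.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // <td/> is an empty element, so the first <td> is never closed before </tr>.
        String broken = "<tr><td>a<td/><td>b<td/><td>c<td/></tr>";
        String fixed = "<tr><td>a</td><td>b</td><td>c</td></tr>";
        System.out.println("broken: " + isWellFormed(broken));
        System.out.println("fixed:  " + isWellFormed(fixed));
    }
}
```

JDOM's SAXBuilder sits on top of a SAX parser, so it should reject the same malformed input, but that part is not exercised by this sketch.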

Then change the method to the following:

    public static void readXML3() {

        // Map<String, Object> htmls = new HashMap<String, Object>();
        SAXBuilder sb = new SAXBuilder();
        try {
            Document doc = sb.build(new FileInputStream("f:/test.xml")); // replace with the OP's own way of locating the file
            Element root = doc.getRootElement();
            List list = root.getChildren("tr"); // get the tr elements
            for (int i = 0; i < list.size(); i++) {
                Element element = (Element) list.get(i);
                List list_td = element.getChildren(); // get the td elements (all children of this tr)
                for (int k = 0; k < list_td.size(); k++) {
                    Element element_td = (Element) list_td.get(k);
                    String name = element_td.getText(); // get the td's text
                    System.out.println(name);
                }
            }
        } catch (JDOMException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
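
If you want to try the same nested loop without JDOM on the classpath, here is a rough sketch using the JDK's built-in DOM parser instead (my substitution, not the JDOM code from this reply). The class name TdTextDemo, the method cellTexts, and the sample table are hypothetical. It assumes the markup is well-formed, i.e. the td tags are closed with </td>.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TdTextDemo {

    /** Collects the text of every td found under every tr, in document order. */
    public static List<String> cellTexts(String markup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)));
            List<String> texts = new ArrayList<>();
            // getElementsByTagName returns all matching descendants, not only direct children.
            NodeList rows = doc.getElementsByTagName("tr");
            for (int i = 0; i < rows.getLength(); i++) {
                NodeList cells = ((Element) rows.item(i)).getElementsByTagName("td");
                // Visit every cell of the row instead of stopping at the first one.
                for (int k = 0; k < cells.getLength(); k++) {
                    texts.add(cells.item(k).getTextContent());
                }
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        String table = "<table><tr><td>a</td><td>b</td><td>c</td></tr></table>";
        System.out.println(cellTexts(table)); // prints [a, b, c]
    }
}
```

The point is the inner loop: asking a row for a single td's text gives you one value, while walking the list of cells gives you all three.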
