Scraping content from a page
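I have already fetched a page and narrowed its content down to the table below. I have two questions:
1) How do I extract each pair of "英文名称" (English name) and "中文名称" (Chinese name)?
2) How do I extract the page number (page="xx")?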
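One possible approach to both questions, as a minimal sketch in Python using only the standard library. Everything here is an assumption rather than a known-good answer for the real page: the names `extract_pairs`, `extract_page`, and `SAMPLE` are made up for illustration, the `list.asp?page=3` link is a hypothetical stand-in for wherever the page number actually appears, and the sample is written without the stray spaces around quotes and tags, on the guess that those are paste artifacts.

```python
import re
from html import unescape

# Sample modeled on the table in the post. Assumptions: stray spaces around
# quotes and tags are paste artifacts and are omitted; the last line is a
# hypothetical link, since the post does not show where "page" appears.
SAMPLE = """<TABLE WIDTH="100%">
<TR>
<TD WIDTH="96%"> 英文名称: <span style="font-family:Verdana">Acid c<font color=red>ya</font>nine</span>
</TD>
</TR>
<TR>
<TD> </TD>
<TD> 中文名称:酸性花青[染料] </TD>
</TR>
</TABLE>
<A HREF="list.asp?page=3">next</A>
"""

TAG_RE = re.compile(r"<[^>]+>")

# The English cell may contain nested tags, so capture up to its closing </TD>.
# The Chinese cell is plain text, so capture up to the next tag.
# Both the ASCII colon and the full-width colon are accepted.
PAIR_RE = re.compile(
    r"英文名称\s*[::]\s*(?P<en>.*?)</td>"
    r".*?"
    r"中文名称\s*[::]\s*(?P<zh>[^<]*)",
    re.I | re.S,
)

# Tolerates optional quotes and spaces, e.g. page=3, page="3", page= "3 ".
PAGE_RE = re.compile(r"\bpage\s*=\s*[\"']?\s*(\d+)", re.I)


def clean(fragment):
    """Drop tags, decode entities such as &nbsp;, and collapse whitespace."""
    text = unescape(TAG_RE.sub("", fragment))
    return " ".join(text.split())


def extract_pairs(html_text):
    """Return a list of (english_name, chinese_name) tuples."""
    return [
        (clean(m.group("en")), clean(m.group("zh")))
        for m in PAIR_RE.finditer(html_text)
    ]


def extract_page(html_text):
    """Return the first page number found as an int, or None if absent."""
    m = PAGE_RE.search(html_text)
    return int(m.group(1)) if m else None


print(extract_pairs(SAMPLE))
print(extract_page(SAMPLE))
```

Regular expressions are used here because the markup is small and regular. If the real page nests tables differently or the highlighted `<font>` fragment really is surrounded by spaces, the patterns would need adjusting, and an HTML parser may be the more robust choice.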
<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="5">
<TR>
<TD VALIGN="top"> <TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="5">
<TR>
<TD WIDTH="4%"> <IMG SRC="image/retail.gif" WIDTH="18" HEIGHT="14"> </TD>
<TD WIDTH="96%"> 英文名称: <span style="font-family:Verdana, Arial, Helvetica, sans-serif"> Acid c <font color=red> ya </font> nine </span>
</TD>
</TR>
<TR>
<TD> </TD>
<TD> 中文名称:酸性花青[染料] </TD>
&n