Looking for a regular expression to parse HTML
The HTML looks like this:
<tr><td width='20' class='hei14'>·</td><td width='360'><a href=http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm target='_blank' class='hei14'>武夷山风景名胜区门票价格上调</a><span class='sj'>(05-17)</span></td></tr>
I need to extract:
1. http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm
2. 武夷山风景名胜区门票价格上调
3. 05-17
------Solution--------------------
Is the format fixed? I assume you want to extract several entries at once, so try this:
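// Requires: using System.Text.RegularExpressions;
string yourStr = ...........; // the HTML source to search
// Named groups: url = href value, text = link text, time = date inside the parentheses.
// \s* around the parentheses and between the closing tags tolerates stray whitespace.
MatchCollection mc = Regex.Matches(yourStr, @"<tr[^>]*?>[\s\S]*?<a\s+href=([""']?)(?<url>[^""'\s]*)\1?[^>]*?>(?<text>[^<]*?)</a>\s*<span[^>]*?>\s*\((?<time>[^<\)]*?)\)\s*</span>\s*</td>\s*</tr>", RegexOptions.IgnoreCase);
foreach (Match m in mc)
{
    // Append one field per line; Trim() drops whitespace around the captured link text.
    richTextBox2.Text += m.Groups["url"].Value + "\n";
    richTextBox2.Text += m.Groups["text"].Value.Trim() + "\n";
    richTextBox2.Text += m.Groups["time"].Value + "\n";
}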
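To try the approach outside a WinForms form, here is a minimal console sketch. It uses a whitespace-tolerant form of the pattern above and runs it against the sample row from the question. The class name `NewsRowParser` and the method `Parse` are hypothetical names introduced only for this illustration, and the sketch assumes every row follows the same layout as the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class NewsRowParser
{
    // Group 1 captures an optional opening quote; \1? matches the same quote again if present.
    // Named groups: url = href value, text = link text, time = date inside the parentheses.
    private static readonly Regex RowPattern = new Regex(
        @"<tr[^>]*?>[\s\S]*?<a\s+href=([""']?)(?<url>[^""'\s]*)\1?[^>]*?>(?<text>[^<]*?)</a>\s*" +
        @"<span[^>]*?>\s*\((?<time>[^<)]*?)\)\s*</span>\s*</td>\s*</tr>",
        RegexOptions.IgnoreCase);

    // Returns one { url, text, time } array per matching table row.
    public static List<string[]> Parse(string html)
    {
        var rows = new List<string[]>();
        foreach (Match m in RowPattern.Matches(html))
        {
            rows.Add(new[]
            {
                m.Groups["url"].Value,
                m.Groups["text"].Value.Trim(),
                m.Groups["time"].Value.Trim()
            });
        }
        return rows;
    }

    public static void Main()
    {
        // Sample row taken from the question.
        string html =
            "<tr><td width='20' class='hei14'>·</td><td width='360'>" +
            "<a href=http://news.xinhuanet.com/travel/2007-05/17/content_6108964.htm target='_blank' class='hei14'>" +
            "武夷山风景名胜区门票价格上调</a><span class='sj'>(05-17)</span></td></tr>";

        foreach (string[] row in Parse(html))
        {
            Console.WriteLine(row[0]); // url
            Console.WriteLine(row[1]); // link text
            Console.WriteLine(row[2]); // date
        }
    }
}
```

Run against the sample row, this should yield the three requested values: the link URL, the headline, and the date. Returning the fields from `Parse` rather than appending to a text box keeps the extraction separate from the display, so the same method could feed `richTextBox2` or any other output.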