Date: 2014-05-19

Question: how to write this regular expression
The original string is as follows:
<a href="/q?s=00010.SS">00010.SS</a></b></td><td class="yfnc_tabledata1" nowrap align="center">7月16日</td><td class="yfnc_tabledata1" nowrap align="right"><b>5.81</b></td><td class="yfnc_tabledata1" nowrap align="right"><img width="10" height="14" border="0" src="http://cn.yimg.com/i/cn/fi/03rd/down_g.gif" alt="173"> <b style="color:#008800;">0.25</b></td><td class="yfnc_tabledata1" nowrap align="right"><b style="color:#008800;">-4.13%</b></td><td class="yfnc_tabledata1" nowrap align="right">29,215,460</td>


The following values need to be matched:
00010 (or 00010.SS)
7月16日
5.81
http://cn.yimg.com/i/cn/fi/03rd/down_g.gif
0.25
-4.13%
29,215,460

If a single regular expression cannot match all of them, one regex per item is fine too. Thanks!


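------Solution--------------------
Regex re = new Regex(@"(?<=<(\S+)\s*\S*>)[^><]*(?=</(\1)>)", RegexOptions.None);
MatchCollection mc = re.Matches(text); // text holds the HTML fragment from the question
foreach (Match ma in mc)
{
    // ma.Value is the text between a matching pair of tags
}

This matches some of the items: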
00010.SS
5.81
0.25
-4.13%
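
The cells that are missed (7月16日 and 29,215,460) sit in <td> tags with several attributes, and `\s*\S*` appears to allow at most one attribute after the tag name. A possible variant, only a sketch and not from the original reply, drops the lookarounds, lets `[^>]*` skip any number of attributes, and captures the text in a group. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagTextSketch
{
    // Hypothetical helper, not part of the thread.
    // <(\w+)    opening tag name, captured as group 1
    // [^>]*     any attributes, up to the end of the opening tag
    // ([^<>]+)  the text content, captured as group 2
    // </\1>     closing tag with the same name as group 1
    private static readonly Regex TagText =
        new Regex(@"<(\w+)[^>]*>([^<>]+)</\1>", RegexOptions.None);

    public static List<string> Extract(string html)
    {
        var values = new List<string>();
        foreach (Match m in TagText.Matches(html))
        {
            // Trim in case the page pads the cell text with spaces
            values.Add(m.Groups[2].Value.Trim());
        }
        return values;
    }
}
```

Calling `TagTextSketch.Extract(text)` on the fragment from the question should return the text items but not the image URL, since that lives in an attribute rather than between tags.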
------Solution--------------------
(>[^<]+<)
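
A minimal sketch of how this pattern might be used, assuming the capture group is moved inside the angle brackets so that only the text is kept. The image URL is an attribute value, so it is picked up with a second regex, which the question says is acceptable. `CellTextSketch`, `TextNodes` and `ImageSrc` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CellTextSketch
{
    // Text between a '>' and the next '<', skipping whitespace-only runs
    public static List<string> TextNodes(string html)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(html, @">([^<]+)<"))
        {
            string value = m.Groups[1].Value.Trim();
            if (value.Length > 0)
            {
                result.Add(value);
            }
        }
        return result;
    }

    // Value of the src attribute of the first <img> tag, or "" if there is none
    public static string ImageSrc(string html)
    {
        Match m = Regex.Match(html, @"<img[^>]*\bsrc=""([^""]*)""", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

This approach depends on the text never containing a literal `<`, which holds for the sample but is not guaranteed for arbitrary HTML.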
------Solution--------------------
Regex re = new Regex(@"\>([^\<]{1,})\<|(http\:\/\/[\/\w.]*)");
string text = "<a href=\"/q?s=00010.SS\">00010.SS</a></b></td><td class=\"yfnc_tabledata1\" nowrap align=\"center\">7月16日</td><td class=\"yfnc_tabledata1\" nowrap align=\"right\"><b>5.81</b></td><td class=\"yfnc_tabledata1\" nowrap align=\"right\"><img width=\"10\" height=\"14\" border=\"0\" src=\"http://cn.yimg.com/i/cn/fi/03rd/down_g.gif\" alt=\"173\"> <b style=\"color:#008800;\">0.25</b></td><td class=\"yfnc_tabledata1\" nowrap align=\"right\"><b style=\"color:#008800;\">-4.13%</b></td><td class=\"yfnc_tabledata1\" nowrap align=\"right\">29,215,460</td>";
MatchCollection mc = re.Matches(text);
foreach (Match ma in mc)
{
    // Group 1 holds text between tags; group 2 holds the image URL
    string value = ma.Groups[1].Success ? ma.Groups[1].Value : ma.Groups[2].Value;
    Label1.Text += value.Trim() + "<br />";
}
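------Solution--------------------
Is the format fixed? Apart from the parts to be extracted, does anything else vary? With only one sample and no description, a regex written against it may fit only that sample and may not generalize.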

string yourStr = ..............;
Match m = Regex.Match(yourStr, @"<a[^>]*>([^<]*)</a></b></td><td[^>]*>([^<]*)</td><td[^>]*><b>([^>]*)</b></td><td[^>]*><img[^>]*src=""([^""]*)""[^>]*>\s*<b[^>]*>([^<]*)</b></td><td[^>]*><b[^>]*>([^<]*)</b></td><td[^>]*>([^<]*)</td>", RegexOptions.IgnoreCase);
if (m.Success)
{
MessageBox.Show(m.Groups[1].Value);
MessageBox.Show(m.Groups[2].Value);
MessageBox.Show(m.Groups[3].Value);