Date: 2014-05-18

Help needed: my C# regular expression does not return the result I want. Hoping someone can explain.
Page content:
<body>
<font color="#008000">thu63.com/view/1047.htm 2011-8-25</font>
<span class="g">www.seo.com/bbs/ 2011-11-25 </span>
<span class="g"><b>seo</b>.chinaz.com/ 2011-12-7 </span>
<span class="g">www.seowhy.com/bbs/ 2011-11-25 </span>
<span class="g">www.seozac.com/ 2011-12-5 </span>
<span class="g">baiduseoguide.com/ 2011-12-11 </span>
<font color=#008000>jualey.com/seo 2011-12-16 </font>
<span class="g">www.<b>seo</b>bbs.net/ 2011-12-15 </span>
</body>

The code and regular expression I am using:

Console.WriteLine("Enter a URL:");
string myUrl = Console.ReadLine();
Console.WriteLine("Extracting hyperlinks, please wait...");
string strRegex = "(?<=<span class=\"g\">).*?(?=/)"; // the regular expression I am using
MatchCollection mc = Regex.Matches(strCode, strRegex);
foreach (Match m in mc)
{
    sw.Write("{0}\r\n", m.Value);
}



The (wrong) result I get:
www.seo.com
<b>seo<
www.seowhy.com
www.seozac.com
baiduseoguide.com
www.<b>seo<
www.dunsh.org
www.<b>seo<

The result I want is:
www.seo.com
seo.chinaz.com
www.seowhy.com
www.seozac.com
baiduseoguide.com
..
..

How should I rewrite this regular expression, string strRegex = "(?<=<span class=\"g\">).*?(?=/)"; so that the <b> and </b> tags are filtered out?



------Solution--------------------
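Console.WriteLine("Enter a URL:");
string myUrl = Console.ReadLine();
Console.WriteLine("Extracting hyperlinks, please wait...");
// (?<!<)/ only accepts a '/' that is not preceded by '<', so the match no longer stops at the '/' in </b>.
string strRegex = "(?<=<span class=\"g\">).*?(?=(?<!<)/)";
MatchCollection mc = Regex.Matches(strCode, strRegex);
foreach (Match m in mc)
{
    // Strip any <b> or </b> left inside the matched text before writing it out.
    sw.Write("{0}\r\n", Regex.Replace(m.Value, "</?b>", ""));
}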
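
For anyone who wants to try the fix without the download and file-writing parts, here is a minimal self-contained sketch. The class name HostExtractor and the method ExtractHosts are made up for this illustration and do not come from the original program. The HTML is hard-coded from the sample page above instead of being read into strCode, and the results go to the console instead of sw.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original program.
public static class HostExtractor
{
    // Start right after <span class="g"> and lazily match up to the first '/'
    // that is not preceded by '<' (so the '/' inside </b> is skipped).
    private static readonly Regex HostPattern =
        new Regex("(?<=<span class=\"g\">).*?(?=(?<!<)/)");

    // Matches the <b> and </b> tags that remain inside a captured host.
    private static readonly Regex BoldTag = new Regex("</?b>");

    public static List<string> ExtractHosts(string html)
    {
        var hosts = new List<string>();
        foreach (Match m in HostPattern.Matches(html))
        {
            hosts.Add(BoldTag.Replace(m.Value, ""));
        }
        return hosts;
    }

    public static void Main()
    {
        // Sample lines copied from the page content in the question.
        string html = string.Join("\n", new[]
        {
            "<span class=\"g\">www.seo.com/bbs/ 2011-11-25 </span>",
            "<span class=\"g\"><b>seo</b>.chinaz.com/ 2011-12-7 </span>",
            "<span class=\"g\">www.<b>seo</b>bbs.net/ 2011-12-15 </span>"
        });

        foreach (string host in ExtractHosts(html))
        {
            Console.WriteLine(host);
        }
        // Prints:
        // www.seo.com
        // seo.chinaz.com
        // www.seobbs.net
    }
}
```

Note that the pattern only looks at <span class="g"> elements, so the two <font> entries in the sample page are not extracted. It also assumes each entry contains a '/' on the same line, because '.' does not match a newline by default.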