Date: 2014-05-18

How do I extract content with a regular expression?
<a onmousedown="return c({'fm':'as','F':'779717EA','F1':'9D73F1E4','F2':'4CA6DF6A','F3':'54E5243F','T':'1258959999','title':this.innerHTML,'url':this.href,'p1':8,'y':'9BBE5E3E'})"
 href="http://www.baodu.zw78.com/" target="_blank" >
 <font size="3"><font color="#c60a00">八度</font>365导航</font>
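Given the content above, I want to extract the following:
the value of each href and the text inside the corresponding font tags.
For the sample above, the result should be: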
1:http://www.baodu.zw78.com/
2:八度365导航
Any help would be much appreciated.


------Solution--------------------
C# code

// Requires: using System; using System.Text.RegularExpressions;
// url:   the href value of the <a> tag
// name1: the text inside the inner <font>; name2: the text that follows it inside the outer <font>
Regex re = new Regex("(?s)<a.*?href=\"(?<url>[^\"]*)\"(?:.*?<font[^>]*>){2}(?<name1>.*?)</font>(?<name2>.*?)</font>");
string strContent = "<a  onmousedown=\"return c({'fm':'as','F':'779717EA','F1':'9D73F1E4','F2':'4CA6DF6A','F3':'54E5243F','T':'1258959999','title':this.innerHTML,'url':this.href,'p1':8,'y':'9BBE5E3E'})\" " +
    "href=\"http://www.baodu.zw78.com/\"  target=\"_blank\" > " +
    "<font size=\"3\"> <font color=\"#c60a00\">八度 </font>365导航 </font> ";

foreach (Match m in re.Matches(strContent))
{
    string strUrl = m.Groups["url"].Value.Trim();
    // Join the two trimmed fragments so the nested <font> markup is dropped.
    string strName = m.Groups["name1"].Value.Trim() + m.Groups["name2"].Value.Trim();
    Console.WriteLine("1:" + strUrl);
    Console.WriteLine("2:" + strName);
}
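
The pattern above is tied to exactly two nested font tags. If the links on the page do not all share that shape, one possible variation is to capture everything between the opening and closing a tags and then strip whatever markup is inside. The sketch below is only an illustration under a few assumptions: the class name LinkExtractor and the method Extract are made up for this example, href values are assumed to be double-quoted, attribute values are assumed not to contain a ">" character, and a link with no closing </a> (as in the sample in the question) is taken to run to the end of the input. It uses value tuples, so it needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original answer.
public static class LinkExtractor
{
    // "url" is the double-quoted href value; "inner" is the markup after the
    // opening tag, up to the first </a> or, if there is none, the end of input.
    private static readonly Regex AnchorRe = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*""(?<url>[^""]*)""[^>]*>(?<inner>.*?)(?:</a>|\z)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches any single tag such as <font size="3"> or </font>.
    private static readonly Regex TagRe = new Regex("<[^>]+>");

    public static List<(string Url, string Text)> Extract(string html)
    {
        var result = new List<(string Url, string Text)>();
        foreach (Match m in AnchorRe.Matches(html))
        {
            // Remove nested tags, decode entities such as &amp;, then trim.
            string text = TagRe.Replace(m.Groups["inner"].Value, "");
            text = WebUtility.HtmlDecode(text).Trim();
            result.Add((m.Groups["url"].Value.Trim(), text));
        }
        return result;
    }
}
```

Calling LinkExtractor.Extract on the HTML from the question should give one entry per link, with the URL in Url and the tag-free link text in Text. Whitespace between nested tags is kept inside the text, so input with extra spaces around the font tags will keep those spaces. For arbitrary real-world pages an HTML parser is generally more reliable than a regular expression.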