Date: 2014-05-18

Getting HTML source content with a regex
How can I use a regex to extract specific content from HTML source?
For example:
<div class="span-16 white last shipuImg" id="receipts">
<div class="span-4 no-overflow "><a href="/%E6%A9%84%E6%A6%84%E6%B2%B9%E8%92%9C%E6%9C%AB%E6%84%8F%E5%A4%A7%E5%88%A9%E9%9D%A2-74810.htm" rel="ShipuClk"><img src="http://xinshipu.cn/20100521/smallImage2/1274427783697.jpg" alt="橄榄油蒜末意大利面" title="橄榄油蒜末意大利面" tabindex=1/><br/>橄榄油蒜末意大利面</a>
</div>

<div class="span-4 no-overflow "><a href="/%E8%87%AA%E5%88%B6%E6%A9%84%E6%A6%84%E6%B2%B9-146514.htm" rel="ShipuClk"><img src="http://xinshipu.cn/20120319/smallImage2/1332133674898.jpg" alt="自制橄榄油" title="自制橄榄油" tabindex=2/><br/>自制橄榄油</a>
</div>

<div class="span-4 no-overflow "><a href="/%E9%A6%99%E8%92%9C%E6%A9%84%E6%A6%84%E6%B2%B9%E9%9D%A2-39878.htm" rel="ShipuClk"><img src="http://xinshipu.cn/20100415/smallImage2/1271332701109.jpg" alt="香蒜橄榄油面" title="香蒜橄榄油面" tabindex=3/><br/>香蒜橄榄油面</a>
</div>
 
<hr class="space"/></div>
Given just this piece of HTML source, how can I extract the divs whose class contains "span-4"?
I'd like to do it with a regex.

Thanks!

------Solution--------------------
If you just want the divs, why not use jQuery?
var lst = $(".span-4");
It's that simple... matching this with a regex is far more tedious.
------Solution--------------------
(?is)<div\s*class="span-4 no-overflow "[^>]*>.*?</div>

Give this a try.
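
A minimal sketch of trying this pattern outside .NET, here in JavaScript (Node.js). JavaScript does not accept a leading `(?is)`, so the same options are written as the `i` and `s` flags, with `g` added to collect every match. The `html` string is a shortened stand-in for the markup in the question, and `extractSpan4Divs` is a hypothetical helper name.

```javascript
// Shortened stand-in for the HTML in the question (an assumption, not the full page).
const html = `<div class="span-16 white last shipuImg" id="receipts">
<div class="span-4 no-overflow "><a href="/a-74810.htm" rel="ShipuClk"><img src="1.jpg" alt="A"/><br/>A</a>
</div>

<div class="span-4 no-overflow "><a href="/b-146514.htm" rel="ShipuClk"><img src="2.jpg" alt="B"/><br/>B</a>
</div>
<hr class="space"/></div>`;

// Hypothetical helper: returns every div whose class attribute is exactly "span-4 no-overflow ".
function extractSpan4Divs(source) {
  // i = ignore case, s = dot also matches newlines, g = find all matches.
  const pattern = /<div\s*class="span-4 no-overflow "[^>]*>.*?<\/div>/gis;
  return source.match(pattern) || [];
}

const divs = extractSpan4Divs(html);
console.log(divs.length);
divs.forEach((d) => console.log(d));
```

Because the class value is spelled out in full, including the trailing space, this only works while the page keeps exactly that attribute text.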
------Solution--------------------
C# code

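            // Requires: using System.IO; using System.Text.RegularExpressions; using System.Windows.Forms;
            // Read the whole HTML file; the using block closes the reader afterwards.
            string source;
            using (StreamReader reader = new StreamReader("c:\\1.txt"))
            {
                source = reader.ReadToEnd();
            }
            // [^>]* keeps the search for "span-4" inside the opening <div ...> tag,
            // so the outer span-16 div is not pulled into the first match.
            Regex reg = new Regex(@"(?is)<div[^>]*?span-4[^>]*>.*?</div>");
            MatchCollection mc = reg.Matches(source);
            foreach (Match m in mc)
            {
                MessageBox.Show(m.Value);
            }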
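
A small sketch of why the piece right after `<div` matters in patterns of this shape. With dot-matches-newline on, `<div[^>].*?span-4` can start at an outer div and run across tag boundaries until it reaches the first `span-4`, whereas `<div[^>]*?span-4` has to find `span-4` inside the same opening tag. The example is JavaScript (flags `is` rather than `(?is)`); the `html` string and the variable names are illustrative assumptions.

```javascript
// Made-up, shortened HTML with one outer container and two inner span-4 divs.
const html = `<div class="span-16 white" id="receipts">
<div class="span-4 no-overflow "><a href="/a.htm">A</a>
</div>
<div class="span-4 no-overflow "><a href="/b.htm">B</a>
</div>
</div>`;

// Loose: ".*?" may cross ">", so a match can begin at the outer div.
const loose = /<div[^>].*?span-4[^>].*?<\/div>/gis;
// Tight: "[^>]*" cannot leave the opening tag it started in.
const tight = /<div[^>]*?span-4[^>]*>.*?<\/div>/gis;

const looseMatches = html.match(loose) || [];
const tightMatches = html.match(tight) || [];

console.log(looseMatches[0].startsWith('<div class="span-16')); // true
console.log(tightMatches[0].startsWith('<div class="span-4'));  // true
```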

------Solution--------------------
C# code
        // s holds the HTML source to search.
        MatchCollection matches = Regex.Matches(s, @"(?is)<div class=""span-4[^>]+>.*?</div>");
        foreach (Match match in matches)
            Console.WriteLine(match.Value); // output match.Value

------Solution--------------------
(?is)<div\b[^>]*?class=(['"]?)span-4[^'"]*?\1[^>]*?>.*?</div>
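
To make the quoting logic concrete, here is a hedged JavaScript sketch of the same pattern: `(['"]?)` captures the opening quote (or nothing at all), and `\1` then requires the identical closing quote. The sample tags are made up for illustration.

```javascript
// The pattern above, with JavaScript flags (i, s) in place of the inline (?is).
const pattern = /<div\b[^>]*?class=(['"]?)span-4[^'"]*?\1[^>]*?>.*?<\/div>/is;

// Made-up samples: double-quoted, single-quoted, unquoted, and a non-matching class.
const samples = [
  '<div class="span-4 no-overflow ">A</div>',
  "<div id='x' class='span-4'>B</div>",
  '<div class=span-4>C</div>',
  '<div class="span-16 white">D</div>',
];

const results = samples.map((tag) => pattern.test(tag));
console.log(results); // [ true, true, true, false ]
```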
------Solution--------------------
(?is)<div\b[^>]*?class=(['"]?)span-\d+[^'"]*?\1[^>]*?>.*?</div>
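
A quick sketch of what the broader `span-\d+` version picks up. Because it accepts any span width, it also matches the outer `span-16` container, and since `.*?</div>` stops at the first closing tag, that match takes the first inner div with it and ends there rather than covering the whole container. The JavaScript below uses a shortened, made-up version of the question's HTML.

```javascript
// Made-up, shortened HTML: one span-16 container holding two span-4 divs.
const html = `<div class="span-16 white" id="receipts">
<div class="span-4 no-overflow ">first
</div>
<div class="span-4 no-overflow ">second
</div>
<hr class="space"/></div>`;

// Same pattern with JavaScript flags: g = all matches, i = ignore case, s = dot matches newlines.
const pattern = /<div\b[^>]*?class=(['"]?)span-\d+[^'"]*?\1[^>]*?>.*?<\/div>/gis;
const matches = html.match(pattern) || [];

console.log(matches.length);
// The first match begins at the outer container and runs to the first </div>.
console.log(matches[0]);
```

If only the inner divs are wanted, the `span-4` version is the safer choice for this markup.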