Date: 2014-05-17  Views: 20548

Looking for a regex that extracts the content of the H5 tags under a specific DIV.
The markup looks like this:
HTML code

        <div class="dailyEcont" id="dailyEcont">
                        <img class="fl dailyEimg" alt="" src="attachment/2012-05-23.jpg">
            <div class="fl dailyEtext">
                <h5 class="e">this test.                                <a class="laba" href="javascript:return false;" title="123">123</a>
                                </h5>
                <h5>测试 </h5>
              ....
              ....
            </div>
        </div>




I want to get "this test." and "测试" from inside the element whose id is dailyEcont.

------Solution--------------------
C# code

            string str = @"<div class=""dailyEcont"" id=""dailyEcont"">
                        <img class=""fl dailyEimg"" alt="""" src=""attachment/2012-05-23.jpg"">
            <div class=""fl dailyEtext"">
                <h5 class=""e"">this test.                                <a class=""laba"" href=""javascript:return false;"" title=""123"">123</a>
                                </h5>
                <h5>测试 </h5>
              ....
              ....
            </div>
        </div>";
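            // Needs: using System; using System.Text.RegularExpressions;
            // (?is) = ignore case + let '.' match newlines. Group 1 sits inside a repeated
            // group, so every <h5> it passes adds one entry to Groups[1].Captures.
            Regex reg = new Regex(@"(?is)<div[^>]*?id=""dailyEcont""[^>]*?>(?:.*?<h5[^>]*?>([^<>]+).*?</h5>)*.*?</div>");
            foreach (Capture c in reg.Match(str).Groups[1].Captures)
                Console.WriteLine(c.Value.Trim()); // Trim drops the padding around each text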
/*
this test.
测试

*/
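
Both answers in this thread rely on a .NET-specific behavior: when a capturing group sits inside a quantifier, every iteration is kept in `Group.Captures`, while `Group.Value` holds only the last one. Here is a minimal sketch of that behavior in isolation. The class `CaptureDemo`, its two methods, and the comma-separated input are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaptureDemo
{
    // Hypothetical helper: returns what group 1 captured on every
    // iteration of the repeated group, in match order.
    public static string[] AllCaptures(string input)
    {
        Match m = Regex.Match(input, @"^(?:(\w+),?)+$");
        CaptureCollection caps = m.Groups[1].Captures;
        string[] values = new string[caps.Count];
        for (int i = 0; i < caps.Count; i++)
            values[i] = caps[i].Value;
        return values;
    }

    // Hypothetical helper: returns only the final iteration's capture,
    // which is all that Groups[1].Value exposes.
    public static string LastCapture(string input)
    {
        return Regex.Match(input, @"^(?:(\w+),?)+$").Groups[1].Value;
    }
}
```

For the input `"a,b,c"`, `CaptureDemo.AllCaptures` returns `a`, `b`, `c`, whereas `CaptureDemo.LastCapture` returns just `c`. That is why the answers loop over `Captures` instead of reading `Value`.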

------Solution--------------------
C# code

void Main()
{
var html = @"  <div class=""dailyEcont"" id=""dailyEcont"">
                        <img class=""fl dailyEimg"" alt="""" src=""attachment/2012-05-23.jpg"">
            <div class=""fl dailyEtext"">
                <h5 class=""e"">this test.                                <a class=""laba"" href=""javascript:return false;"" title=""123"">123</a>
                                </h5>
                <h5>测试 </h5>
              ....
              ....
            </div>
        </div>


";
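    // Group 3 is inside the repeated group 2, so each <h5> adds one entry to Groups[3].Captures.
    // [^>]* after the id lets the tag carry further attributes behind it.
    foreach (Match m in Regex.Matches(html, @"(?is)<div\b[^>]*?id=(['""]?)dailyEcont\1[^>]*>(.*?<h5\b[^>]*?>([^<>]+).*?</h5>.*?)+"))
    {
        foreach (Capture c in m.Groups[3].Captures)
        {
            Console.WriteLine("{0}\t", c.Value.Trim());
        }
    }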
      
      /*
    this test.    
   测试
      */
}
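
Neither pattern above stops at the closing tag of the target div, so on a full page an `<h5>` further down could also be picked up. One possible alternative, sketched here under assumptions, splits the work into three steps: find the opening tag, count nested `<div>`/`</div>` tags to locate the matching close, then collect the first text run of each `<h5>` in that slice. The class `H5Extractor` and the method `ExtractH5Texts` are hypothetical names, and the sketch assumes reasonably well-formed markup. For arbitrary HTML a real parser is the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class H5Extractor
{
    // Hypothetical helper: returns the leading text of every <h5> found
    // inside the <div> whose id attribute equals the given id.
    public static List<string> ExtractH5Texts(string html, string id)
    {
        var result = new List<string>();

        // Step 1: locate the opening tag of the target div
        // (id may be quoted with ", with ', or unquoted).
        Match open = Regex.Match(
            html,
            @"<div\b[^>]*\bid\s*=\s*([""']?)" + Regex.Escape(id) + @"\1[^>]*>",
            RegexOptions.IgnoreCase);
        if (!open.Success)
            return result;

        // Step 2: walk the following div tags and track nesting depth
        // until the close that matches the target div is reached.
        int start = open.Index + open.Length;
        int end = html.Length; // fallback if the div is never closed
        int depth = 1;
        var divTag = new Regex(@"<(/?)div\b[^>]*>", RegexOptions.IgnoreCase);
        for (Match t = divTag.Match(html, start); t.Success; t = t.NextMatch())
        {
            depth += t.Groups[1].Length == 0 ? 1 : -1;
            if (depth == 0)
            {
                end = t.Index;
                break;
            }
        }
        string body = html.Substring(start, end - start);

        // Step 3: take the text that directly follows each <h5 ...> tag,
        // up to the next tag, and trim the surrounding whitespace.
        foreach (Match m in Regex.Matches(body, @"<h5\b[^>]*>([^<>]+)", RegexOptions.IgnoreCase))
        {
            string text = m.Groups[1].Value.Trim();
            if (text.Length > 0)
                result.Add(text);
        }
        return result;
    }
}
```

Called as `H5Extractor.ExtractH5Texts(html, "dailyEcont")` on the markup from the question, this should return the two strings `this test.` and `测试`. An `<h5>` placed after the div's closing tag is left out.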