Date: 2014-05-19

How to get the Chinese string in a web page using WebBrowser
The page source is as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>

</title></head>
<body>
    <form name="form1" method="post" action="gb_reg.aspx?xlh=111111&amp;zcm=222222" id="form1">
<div>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJNTQ4NjgwNjk5D2QWAgIDD2QWBAIBDw8WAh4EVGV4dAUM5rOo5YaM5oiQ5YqfZGQCAw8PZA8QFgJmAgEWAhYCHg5QYXJhbWV0ZXJWYWx1ZQUGMTExMTExFgIfAQUGMjIyMjIyFgICBAIEZGRk4k7rGkjlLQXnKEgpmBgwRWtZTZo=" />
</div>

    <div>
        &nbsp;
        <span id="Label1"> 注册成功 </span>

    </div>

    </form>
</body>
</html>

------------------------------
I only want the few Chinese characters in it.

------Solution--------------------
If Chinese text appears in only one place, do this:

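// Requires: using System.Text.RegularExpressions;
string yourStr = ............;
// Match the first run of consecutive Chinese characters (U+4E00 to U+9FA5).
Match m = Regex.Match(yourStr, @"[\u4e00-\u9fa5]+");
if (m.Success)
{
    string resultStr = m.Value;
}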
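
For anyone who wants to try the single-match case outside a form, here is a minimal console sketch. The class name `FirstChineseRun`, the method `Find`, and the hard-coded `html` string (copied from the `Label1` span in the question) are just assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class FirstChineseRun
{
    // Returns the first run of characters in U+4E00..U+9FA5, or null if there is none.
    public static string Find(string input)
    {
        Match m = Regex.Match(input, @"[\u4e00-\u9fa5]+");
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Hypothetical input: the Label1 span from the page in the question.
        string html = "<span id=\"Label1\"> 注册成功 </span>";
        string result = Find(html);
        Console.WriteLine(result ?? "(no Chinese text found)");
    }
}
```

With this input, `Find` returns the four characters inside the span; the surrounding spaces and tags are skipped because they fall outside the character range.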

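If Chinese text appears in several places, do this:
// Collect every run of Chinese characters and append each one on its own line.
MatchCollection mc = Regex.Matches(yourStr, @"[\u4e00-\u9fa5]+");
foreach (Match m in mc)
{
    richTextBox2.Text += m.Value + "\n";
}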
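
The multi-match case can also be sketched as a small helper that returns a list, so the result can be joined once instead of appending to a control in a loop. The names `ChineseRuns` and `FindAll` and the sample `html` string (with a filled-in title, unlike the page in the question) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ChineseRuns
{
    // Same character range as the answer above: U+4E00 to U+9FA5.
    static readonly Regex Cjk = new Regex(@"[\u4e00-\u9fa5]+");

    // Collects every run of Chinese characters in the order they appear.
    public static List<string> FindAll(string input)
    {
        var runs = new List<string>();
        foreach (Match m in Cjk.Matches(input))
        {
            runs.Add(m.Value);
        }
        return runs;
    }

    static void Main()
    {
        // Hypothetical input with two separate runs of Chinese text.
        string html = "<title>注册</title><span id=\"Label1\"> 注册成功 </span>";
        // Join once rather than concatenating inside the loop.
        Console.WriteLine(string.Join("\n", FindAll(html)));
    }
}
```

In a WinForms app the input would probably come from the control itself, for example `webBrowser1.DocumentText`, or `webBrowser1.Document.GetElementById("Label1").InnerText` if only the label is needed, read after the `DocumentCompleted` event has fired. That assumes a control named `webBrowser1`. Also note that this range covers the common Chinese ideographs only, so full-width punctuation such as "," will split a run.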