Project error? (C#)
using System;
using System.Net;
using System.IO;
using System.Xml;
using System.Data;
using System.Text;
using System.Data.SqlClient;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace get_xiaofeishuma_brand
{
    class Program
    {
        static void Main(string[] args)
        {
            WebClient mywebclient = new WebClient();
            mywebclient.Credentials = CredentialCache.DefaultCredentials;
            byte[] mybyte = mywebclient.DownloadData("http://detail.zol.com.cn/category/15.html");
            string yourstr = Encoding.Default.GetString(mybyte);
            MatchCollection mc = Regex.Matches(yourstr, @"<div\sclass=""manu_elem"">\s.*?<div\sclass=""manu_photo"">[\s\S].*?<a\s.*?href=""(?<i_url>[a-zA-Z0-9_:/.]{1,})[a-zA-Z0-9_:/.].*?>[\s].*?</div>");
            foreach (Match m in mc)
            {
                Console.WriteLine(m.Groups["i_url"].Value.ToString());
            }
            Console.ReadLine();
        }
    }
}
When I run it, nothing is printed.
Is the regular expression wrong?
But when I ran the same pattern in MTracer, it captured everything.
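------Solution--------------------The regex is written incorrectly; here is a revised version:
MatchCollection mc = Regex.Matches(yourstr, @"<div\sclass=""manu_elem"">\s*<div\sclass=""manu_photo"">[\s\S]*?<a\s*href=""(?<i_url>[^""]*)""[^>]*>[\s\S]*?</div>");
MTracer and VS.NET do not use exactly the same regex engine. Sometimes MTracer finds a match but the same pattern does not give the same result in VS.NET, especially where greedy versus non-greedy matching is involved. Prefer patterns that leave no room for ambiguity.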
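As a rough check of the revised pattern above, the sketch below runs it against a small hand-written fragment. The markup, the example.com URL, and the names BrandLinkDemo and ExtractUrls are all made up for illustration; the real page's structure may differ, so treat this as a sketch under those assumptions rather than a confirmed fix for the live site.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BrandLinkDemo
{
    // The revised pattern from the answer, unchanged.
    const string Pattern =
        @"<div\sclass=""manu_elem"">\s*<div\sclass=""manu_photo"">[\s\S]*?<a\s*href=""(?<i_url>[^""]*)""[^>]*>[\s\S]*?</div>";

    // Hypothetical markup (assumption): one brand entry spread over several lines.
    public const string SampleHtml =
        "<div class=\"manu_elem\">\n" +
        "  <div class=\"manu_photo\">\n" +
        "    <a href=\"http://example.com/brand_1.html\" target=\"_blank\">\n" +
        "      <img src=\"logo.gif\">\n" +
        "    </a>\n" +
        "  </div>\n" +
        "</div>";

    // Collects the value of the named group i_url from every match.
    public static List<string> ExtractUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern))
        {
            urls.Add(m.Groups["i_url"].Value);
        }
        return urls;
    }

    static void Main()
    {
        foreach (string url in ExtractUrls(SampleHtml))
        {
            Console.WriteLine(url); // prints http://example.com/brand_1.html
        }
    }
}
```

Capturing the attribute value with [^""]* stops at the closing quote regardless of which characters the URL contains, which is why it is less fragile than enumerating the allowed characters.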
\s.*?
[\s\S].*?
[a-zA-Z0-9_:/.]{1,})[a-zA-Z0-9_:/.].*?
[\s].*?
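Watch out for each of the fragments above; do not write them this way.
Take [\s\S].*? as an example: [\s\S] matches any character, but only one of them, while .*? matches any number of characters other than "\n". Written this way, apart from the single "\n" that [\s\S] may consume, the whole fragment still matches at most one line.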
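The difference can be seen in a minimal sketch. The sample string and the names DotVsAnyDemo, OneCharThenDot, and AnyCharLazy are hypothetical, chosen only to show that the closing tag cannot be reached once more than one line break lies in between.

```csharp
using System;
using System.Text.RegularExpressions;

static class DotVsAnyDemo
{
    // Hypothetical sample (assumption): the closing tag is several line breaks away.
    public const string Sample = "<div>\nline one\nline two\n</div>";

    // [\s\S] consumes exactly one character (here the first '\n');
    // .*? then cannot cross the next line break, so the match fails.
    public static bool OneCharThenDot()
    {
        return Regex.IsMatch(Sample, @"<div>[\s\S].*?</div>");
    }

    // [\s\S]*? repeats the "any character" class, so it can span as many lines as needed.
    public static bool AnyCharLazy()
    {
        return Regex.IsMatch(Sample, @"<div>[\s\S]*?</div>");
    }

    static void Main()
    {
        Console.WriteLine(OneCharThenDot()); // prints False
        Console.WriteLine(AnyCharLazy());    // prints True
    }
}
```

Passing RegexOptions.Singleline so that . also matches "\n" would be another way to get the second behaviour; the answer's [\s\S]*? form has the advantage of not depending on an option.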
------Solution--------------------Bump!