How to read the content of the URLs listed in a txt file
The txt file contains several thousand URLs.
int counter = 0;
string line;
// Read the file and display it line by line.
// The using block closes the reader even if an exception is thrown.
using (System.IO.StreamReader file =
    new System.IO.StreamReader(@"C:\Users\test\Desktop\contenturllist.txt"))
{
    while ((line = file.ReadLine()) != null)
    {
        System.Console.WriteLine(line);
        counter++;
    }
}
System.Console.WriteLine("There were {0} lines.", counter);
// Suspend the screen.
System.Console.ReadLine();
The code above reads every URL from the text file.
byte[] getdate = obj.getdate("http://build.0551fangchan.com/2011-12-19/71299-1.html");
singlenewssave temp = new singlenewssave();
temp.init(getdate);
Console.WriteLine(temp.title);
Console.Read();
The code above reads the title of a single URL.
The problem now is how to read the titles of all the URLs. How should I do that?
byte[] getdate = obj.getdate("http://build.0551fangchan.com/2011-12-19/71299-1.html");
How should this line be changed?
------Solution--------------------
A general-purpose regular expression for matching links:
^((https?|ftp|news):\/\/)?([a-z]([a-z0-9\-]*[\.。])+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel)|(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-z][a-z0-9_]*)?$
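As for the original question, one possible approach is to move the single-URL code inside the file-reading loop and pass the current line in place of the hard-coded address. The following is only a minimal sketch: TitleReader, ReadUrls, ReadTitles and fetchTitle are hypothetical names introduced for illustration, and obj.getdate and singlenewssave are assumed to behave exactly as in the question (they are not defined here).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TitleReader
{
    // Yields each non-empty line of the reader, with surrounding whitespace trimmed.
    public static IEnumerable<string> ReadUrls(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length > 0)
                yield return line;
        }
    }

    // Calls fetchTitle once per URL. A failure on one URL is reported
    // to stderr and skipped, so one bad address does not stop the run.
    public static List<string> ReadTitles(IEnumerable<string> urls,
                                          Func<string, string> fetchTitle)
    {
        var titles = new List<string>();
        foreach (string url in urls)
        {
            try
            {
                titles.Add(fetchTitle(url));
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Failed: {0} ({1})", url, ex.Message);
            }
        }
        return titles;
    }
}

// Usage with the types from the question (assumed, not defined in this sketch):
//
// using (var reader = new StreamReader(@"C:\Users\test\Desktop\contenturllist.txt"))
// {
//     var titles = TitleReader.ReadTitles(TitleReader.ReadUrls(reader), url =>
//     {
//         singlenewssave temp = new singlenewssave();
//         temp.init(obj.getdate(url));   // the current line replaces the literal URL
//         return temp.title;
//     });
//     foreach (string title in titles)
//         Console.WriteLine(title);
// }
```

Passing the fetch step in as a delegate keeps the loop independent of how a title is actually obtained. With several thousand URLs, wrapping each fetch in try/catch is probably worthwhile, since some addresses are likely to fail.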