Date: 2014-05-19

Beginner question: in C#, how do I open a web page and get its text, its href links, and the src of its images?
Any help is appreciated!

------Solution--------------------
using System;
using System.IO;
using System.Net;
using System.Text;

public class Test
{
    // Pass the URL to request as the first command-line argument.
    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: Test <url>");
            return;
        }

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(args[0]);

        // Set some reasonable limits on resources used by this request.
        request.MaximumAutomaticRedirections = 4;
        request.MaximumResponseHeadersLength = 4; // measured in kilobytes
        // Set credentials to use for this request.
        request.Credentials = CredentialCache.DefaultCredentials;

        // The using blocks release the response and the reader even if an exception is thrown.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine("Content length is {0}", response.ContentLength);
            Console.WriteLine("Content type is {0}", response.ContentType);

            // Get the stream associated with the response and wrap it in a
            // higher-level reader. This assumes the page is encoded as UTF-8.
            using (Stream receiveStream = response.GetResponseStream())
            using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
            {
                Console.WriteLine("Response stream received.");
                Console.WriteLine(readStream.ReadToEnd());
            }
        }
    }
}

/*
The output from this example will vary depending on the value passed into Main
but will be similar to the following:

Content length is 1542
Content type is text/html; charset=utf-8
Response stream received.
<html>
...
</html>

*/
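
The listing above always decodes the body as UTF-8, which garbles pages served in another encoding such as GBK. One possible adjustment, sketched here under the assumption that the server reports its charset in the Content-Type header, is to pick the encoding from response.CharacterSet. CharsetHelper and Pick are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Hypothetical helper: map a charset name (for example the value of
    // HttpWebResponse.CharacterSet) to an Encoding, falling back to UTF-8
    // when the name is missing or not recognized on this platform.
    public static Encoding Pick(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return Encoding.UTF8;

        try
        {
            // Some servers wrap the charset name in quotes; strip them first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported charset name.
            return Encoding.UTF8;
        }
    }
}
```

With this in place, the reader could be created as new StreamReader(receiveStream, CharsetHelper.Pick(response.CharacterSet)). Pages that declare their charset only in a meta tag would still need separate handling.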

Once you have the content, parse it, or filter it with regular expressions.
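
As a rough illustration of the regular-expression route, the sketch below pulls href values from a tags and src values from img tags. LinkExtractor and its methods are hypothetical names, and the pattern only handles quoted attribute values; unquoted attributes and malformed markup are not covered, so treat it as a starting point rather than a complete HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Collects the value of `attr` from every <tag ...> element in the HTML.
    // Only attr="value" and attr='value' forms are matched.
    private static List<string> Extract(string html, string tag, string attr)
    {
        string pattern = "<" + tag + @"\b[^>]*?\s" + attr
            + @"\s*=\s*(?:""(?<v>[^""]*)""|'(?<v>[^']*)')";

        var values = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            values.Add(m.Groups["v"].Value);
        }
        return values;
    }

    // href values of <a> tags, in document order.
    public static List<string> Hrefs(string html)
    {
        return Extract(html, "a", "href");
    }

    // src values of <img> tags, in document order.
    public static List<string> ImageSources(string html)
    {
        return Extract(html, "img", "src");
    }
}
```

Pass the string returned by readStream.ReadToEnd() to LinkExtractor.Hrefs and LinkExtractor.ImageSources. The values come back exactly as written in the page, so relative ones can be resolved against the page address with new Uri(baseUri, value).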
------Solution--------------------
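To do what you want, you should use JS. Load the page you need with an ajax call and assign the result to the innerHTML property of some element, for example one whose Id is Test1. Then call Test1.getElementsByTagName("a") or Test1.getElementsByTagName("img"). Note that in a browser this only works for pages on the same origin, or for pages that allow cross-origin requests. The text is harder to pick out.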
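
For the text part, which the reply above leaves open, one rough approximation that stays in C# is to drop script and style blocks, strip the remaining tags, and decode HTML entities. TextExtractor and ToPlainText are hypothetical names; this is a sketch that assumes reasonably well-formed markup, not a faithful rendering of what a browser would display.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class TextExtractor
{
    // Approximates the visible text of an HTML document.
    public static string ToPlainText(string html)
    {
        // Remove script and style elements together with their contents.
        string s = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1\s*>", " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Remove HTML comments.
        s = Regex.Replace(s, @"<!--.*?-->", " ", RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words stay separated.
        s = Regex.Replace(s, @"<[^>]+>", " ");

        // Turn entities such as &amp; back into characters.
        s = WebUtility.HtmlDecode(s);

        // Collapse runs of whitespace into single spaces.
        return Regex.Replace(s, @"\s+", " ").Trim();
    }
}
```

Calling TextExtractor.ToPlainText on the downloaded HTML string gives a single line of text that can then be searched or split further.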