How to fetch an HTML page in C# - C# Tutorials - 爱易网

Date: 2014-05-17    Views: 21060

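Fetching an HTML page in C#
Could anyone tell me how to use C# to fetch a page's content and styles from a given URL and save the result as a static page? For example, I want to grab this page, http://www.baidu.com/s?wd=%E6%B5%B7%E6%B4%8B&rsv_spt=1&issp=1&rsv_bp=0&ie=utf-8&tn=baiduhome_pg&rsv_n=2&rsv_sug3=1&rsv_sug=0&rsv_sug1=1&rsv_sug4=62 , exactly as it is and save it as a static page. Sorry that I can only offer a few points.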

C# asp.net

------Solution--------------------
Try reading the response as a stream. I'm not sure whether that is the cause of your problem, though.

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main()
    {
        WebRequest request = WebRequest.Create("http://www.contoso.com/default.html");
        // If required by the server, set the credentials.
        request.Credentials = CredentialCache.DefaultCredentials;
        // Get the response. The using blocks close the reader, the stream
        // and the response even if an exception is thrown.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream dataStream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(dataStream))
        {
            // Display the status.
            Console.WriteLine(response.StatusDescription);
            // Read the content.
            string responseFromServer = reader.ReadToEnd();
            // Display the content.
            Console.WriteLine(responseFromServer);
        }
    }
}
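
Since the question asks for a saved static page rather than console output, here is a minimal sketch of writing the response to disk with the same WebRequest approach. The names PageSaver, ResolveEncoding and SavePage, and the "Mozilla/5.0" user agent, are my own assumptions and not part of the answer above. Note that the charset the server reports may be missing or just a runtime default, so treat the decoding step as best effort.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class PageSaver
{
    // Hypothetical helper: maps the charset reported by the server to an
    // Encoding, falling back to UTF-8 when it is missing or unknown.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset)) return Encoding.UTF8;
        try { return Encoding.GetEncoding(charset.Trim().Trim('"')); }
        catch (ArgumentException) { return Encoding.UTF8; }
        catch (NotSupportedException) { return Encoding.UTF8; }
    }

    // Hypothetical helper: downloads the HTML at url and saves it to outputPath.
    public static void SavePage(string url, string outputPath)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Assumption: some sites serve a different page to clients without a user agent.
        request.UserAgent = "Mozilla/5.0";

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, ResolveEncoding(response.CharacterSet)))
        {
            // Encoding.UTF8 writes a byte order mark, so browsers open the
            // saved file as UTF-8 regardless of the page's original charset.
            File.WriteAllText(outputPath, reader.ReadToEnd(), Encoding.UTF8);
        }
    }
}
```

It could be called as PageSaver.SavePage("http://www.contoso.com/default.html", "page.html"). This only saves the HTML; stylesheets and images referenced by the page are not downloaded.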
------Solution--------------------
With the WebClient and WebRequest classes it is easy to get the source code of a given URL.

See http://www.cnblogs.com/ganmk/articles/1213315.html

If you need to parse the source, you can use the HTMLParser component.
See http://www.cnblogs.com/loveyakamoz/archive/2011/07/27/2118937.html
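
For the WebClient route, a rough sketch follows. One way to keep the styles working in the saved copy without downloading every stylesheet is to insert a <base href> tag, so that relative CSS and image links still resolve against the original site. That trick is my own suggestion, not something from the linked articles, and the names StaticPageWriter, InsertBaseTag and Save are hypothetical. It also assumes the page is UTF-8, which matches the ie=utf-8 parameter in the question's URL.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

static class StaticPageWriter
{
    // Matches an opening <head> tag with optional attributes, but not <header>.
    static readonly Regex HeadOpenTag =
        new Regex(@"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);

    // Hypothetical helper: inserts <base href="..."> right after <head>,
    // or at the very start of the document when there is no <head>.
    public static string InsertBaseTag(string html, string baseUrl)
    {
        string baseTag = "<base href=\"" + WebUtility.HtmlEncode(baseUrl) + "\">";
        Match m = HeadOpenTag.Match(html);
        if (!m.Success) return baseTag + html;
        return html.Insert(m.Index + m.Length, baseTag);
    }

    // Hypothetical helper: downloads url with WebClient and saves it to outputPath.
    public static void Save(string url, string outputPath)
    {
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8; // assumption: the page is UTF-8
            string html = client.DownloadString(url);
            File.WriteAllText(outputPath, InsertBaseTag(html, url), Encoding.UTF8);
        }
    }
}
```

The saved file still depends on the original site being reachable when it is opened. A fully offline copy would need each linked resource downloaded and its links rewritten.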
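------Solution--------------------
WebClient and WebRequest only fetch the static page as the server returns it; anything rendered by JavaScript is not included.

WebBrowser can capture content that is loaded via ajax.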
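
A minimal sketch of the WebBrowser idea, assuming a Windows project that references System.Windows.Forms. The class and method names are hypothetical. The control needs an STA thread with a message loop, so the sketch runs it on its own thread. Also note that DocumentCompleted can fire before every ajax call has finished, so a real version may need an extra delay or a check for a specific element, plus a timeout in case the page never finishes loading.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class RenderedPageFetcher
{
    // Hypothetical helper: returns the DOM as HTML after scripts have run.
    public static string GetRenderedHtml(string url)
    {
        string html = null;

        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser())
            {
                browser.ScriptErrorsSuppressed = true;
                browser.DocumentCompleted += (sender, e) =>
                {
                    // The event fires once per frame; wait for the whole page.
                    if (browser.ReadyState != WebBrowserReadyState.Complete) return;

                    // OuterHtml reflects the current DOM, including nodes
                    // added by scripts; DocumentText would be the raw source.
                    HtmlElementCollection roots = browser.Document.GetElementsByTagName("html");
                    html = roots.Count > 0 ? roots[0].OuterHtml : browser.DocumentText;
                    Application.ExitThread();
                };
                browser.Navigate(url);
                Application.Run(); // pump messages until ExitThread is called
            }
        });

        thread.SetApartmentState(ApartmentState.STA); // required by WebBrowser
        thread.Start();
        thread.Join();
        return html;
    }
}
```

The returned string can then be written out with File.WriteAllText to get the static page.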
