Date: 2014-05-18  Views: 21117

Why does fetching the page source fail for some websites?
The code I use to fetch the source:

C# code
WebClient webclient = new WebClient();
webclient.Proxy = null;
byte[] buf = webclient.DownloadData(url);

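return Encoding.Default.GetString(buf);

Most websites work fine, but a few do not, for example:
Jinjiang Literature City (晋江文学网) http://www.jjwxc.net/onebook.php?novelid=443389
U17 (有妖气) http://u17.com
When I display the result with MessageBox.Show, it is just a few garbled characters.

How can I solve this?

I have also tried many other approaches, but the result I get is always “”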

------Solution--------------------
This is an encoding problem. Change your code as follows:
C# code

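            WebClient webclient = new WebClient();
            webclient.Proxy = null;
            byte[] buf = webclient.DownloadData("http://u17.com");
            // Encoding.Default is the system ANSI code page (GB2312/GBK on a
            // Simplified Chinese Windows install). Some sites serve UTF-8, and
            // decoding those bytes with the default encoding garbles the Chinese text.
            string result = Encoding.UTF8.GetString(buf);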
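
Rather than hard-coding UTF-8, the encoding can be taken from the charset the server declares. The following is only a sketch: the class and method names (CharsetHelper, FromContentType) are hypothetical, and it assumes the server sends a correct Content-Type header, which not every site does.

```csharp
using System;
using System.Net.Mime;
using System.Text;

static class CharsetHelper
{
    // Picks the encoding named in a Content-Type header value,
    // falling back to UTF-8 when the charset is missing or unknown.
    public static Encoding FromContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return Encoding.UTF8;
        try
        {
            string charset = new ContentType(contentType).CharSet;
            return string.IsNullOrEmpty(charset)
                ? Encoding.UTF8
                : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return Encoding.UTF8; }   // malformed header
        catch (ArgumentException) { return Encoding.UTF8; } // unknown charset name
    }
}
```

After DownloadData returns, the header value is available as webclient.ResponseHeaders["Content-Type"], so the decode step could become CharsetHelper.FromContentType(header).GetString(buf). Pages that declare their charset only in an HTML meta tag would still need separate handling.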

------Solution--------------------
// This assumes the response body is gzip-compressed, so it is decompressed before decoding.
WebClient webclient = new WebClient();
webclient.Proxy = null;
byte[] buf = webclient.DownloadData("http://u17.com/");
using (System.IO.MemoryStream mem = new System.IO.MemoryStream(buf))
using (System.IO.Compression.GZipStream gzip = new System.IO.Compression.GZipStream(mem, System.IO.Compression.CompressionMode.Decompress))
using (System.IO.StreamReader reader = new System.IO.StreamReader(gzip, Encoding.UTF8))
{
    return reader.ReadToEnd();
}
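
Since only some sites return a compressed body, a single helper could check for gzip first and decompress only when needed. This is a rough sketch, not a definitive implementation: PageDecoder, LooksGzipped and Decode are hypothetical names, and the check relies on the two gzip magic bytes (0x1F 0x8B) at the start of the data.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class PageDecoder
{
    // Gzip streams start with the two magic bytes 0x1F 0x8B.
    public static bool LooksGzipped(byte[] data)
    {
        return data != null && data.Length >= 2
            && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Decompresses the body only when it looks gzipped, then decodes it
    // with the encoding supplied by the caller.
    public static string Decode(byte[] body, Encoding encoding)
    {
        if (!LooksGzipped(body)) return encoding.GetString(body);

        using (var mem = new MemoryStream(body))
        using (var gzip = new GZipStream(mem, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this in place, the same call, PageDecoder.Decode(buf, Encoding.UTF8), would work for both the compressed and the uncompressed sites mentioned in the question, as long as the encoding passed in matches the page.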