Date: 2014-05-18

Downloading a web page by URL produces garbled characters.
C# code

WebClient client = new WebClient();
client.Encoding = Encoding.GetEncoding("gb2312");

string url = string.Empty;
url = "http://www.whjyj.gov.cn/html/news.asp?ClassId=22&ClassName=教育新闻";

Response.Write(client.DownloadString(url));


In the downloaded page output, only the text 教育新闻, which comes from the URL parameter, is garbled. Everything else displays correctly.

------Solution--------------------
Try UTF-8 encoding and see whether that fixes it.
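------Solution--------------------
I'm not sure how to fix it. Was the file perhaps named in Chinese?
I tried several encodings, but 教育新闻 never comes through correctly. Those characters are being sent in UTF-8 escaped form!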
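
To illustrate the point about encodings, here is a minimal sketch comparing how the same text is percent-encoded under UTF-8 and under GB2312. It assumes the .NET Framework with a reference to System.Web.dll, as in the thread's ASP.NET setting, and it assumes the server decodes the query string as GB2312, which the thread suggests but does not confirm. The class name EncodingDemo is hypothetical.

```csharp
using System;
using System.Text;
using System.Web; // HttpUtility is in System.Web.dll

static class EncodingDemo
{
    // Percent-encode a value using the named encoding
    // (for example "utf-8" or "gb2312").
    public static string Encode(string value, string encodingName)
    {
        return HttpUtility.UrlEncode(value, Encoding.GetEncoding(encodingName));
    }
}
```

EncodingDemo.Encode("教育新闻", "utf-8") yields three escaped bytes per character, while EncodingDemo.Encode("教育新闻", "gb2312") yields two. If the server expects the two-byte form and receives the three-byte form, it cannot decode the parameter, which would explain why only that text is garbled.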
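------Solution--------------------
You should UrlEncode it. Encode only the parameter value rather than the whole URL, otherwise characters such as : / ? and & get escaped too. UrlEncode is an instance method of HttpServerUtility, so inside a page it is called through the Server property:

// Encode only the query-string value
url = "http://www.whjyj.gov.cn/html/news.asp?ClassId=22&ClassName=" + Server.UrlEncode("教育新闻");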

Note the required reference:
Namespace: System.Web
Assembly: System.Web (in System.Web.dll)

Reference:
http://msdn.microsoft.com/en-us/library/zttxte6w.aspx
------Solution--------------------
// Percent-encode only the Chinese parameter value, using GB2312 to match the site.
Encoding gb2312 = Encoding.GetEncoding("gb2312");
string formUrl = "http://www.whjyj.gov.cn/html/news.asp?ClassId=22&ClassName=" + HttpUtility.UrlEncode("教育新闻", gb2312);

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(formUrl);
request.AllowAutoRedirect = true;
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)";
string srcString;
// Read the response body as GB2312; the using blocks dispose the response and reader.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (System.IO.StreamReader reader = new System.IO.StreamReader(response.GetResponseStream(), gb2312))
{
    srcString = reader.ReadToEnd();
}
Response.Write(srcString);
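
The same idea can probably be applied to the original WebClient code by percent-encoding only the parameter value before downloading. The following is a sketch, not a confirmed fix: it assumes the .NET Framework with System.Web.dll referenced and a server that expects GB2312 in the query string. The names NewsDownloader, BuildNewsUrl and Download are hypothetical.

```csharp
using System;
using System.Net;
using System.Text;
using System.Web; // HttpUtility is in System.Web.dll

static class NewsDownloader
{
    // Assumption: the site uses GB2312 for both the query string and the page body.
    static readonly Encoding Gb2312 = Encoding.GetEncoding("gb2312");

    // Build the URL with only the ClassName value percent-encoded as GB2312.
    public static string BuildNewsUrl(int classId, string className)
    {
        return "http://www.whjyj.gov.cn/html/news.asp?ClassId=" + classId
            + "&ClassName=" + HttpUtility.UrlEncode(className, Gb2312);
    }

    // Download the page and decode the response body as GB2312.
    public static string Download(int classId, string className)
    {
        using (WebClient client = new WebClient())
        {
            client.Encoding = Gb2312;
            return client.DownloadString(BuildNewsUrl(classId, className));
        }
    }
}
```

In a page this would be used as Response.Write(NewsDownloader.Download(22, "教育新闻")). Because the URL handed to WebClient then contains only ASCII characters, the client has nothing left to re-encode as UTF-8.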