Date: 2014-05-17

How do I decode chunked-encoded content?
C# code

            string url="http://www.google.com.hk/search?q=c";
            string html="";
            WebClient web = new WebClient();
            web.Headers.Add(HttpRequestHeader.Referer, url);
            html = web.DownloadString(url);  // The response data comes back garbled, apparently because of Transfer-Encoding: chunked. How should it be decoded?
            web.Dispose();
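
For background on the chunked framing itself: `WebClient` and `HttpWebRequest` strip the chunk framing before the body reaches the caller, so manual decoding is normally only needed when reading raw bytes from a socket. The sketch below shows what that decoding could look like. `ChunkedBody.Decode` is a hypothetical helper, not a framework API. It assumes the whole body is already in memory and it ignores trailer headers.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper (not from the thread): decodes a raw
// "Transfer-Encoding: chunked" body that has NOT already been
// processed by an HTTP client library.
public static class ChunkedBody
{
    public static byte[] Decode(byte[] raw)
    {
        using (var output = new MemoryStream())
        {
            int pos = 0;
            while (true)
            {
                // Each chunk starts with "<hex size>[;extensions]\r\n".
                int lineEnd = IndexOfCrLf(raw, pos);
                if (lineEnd < 0) throw new InvalidDataException("Missing chunk-size line.");

                string sizeLine = Encoding.ASCII.GetString(raw, pos, lineEnd - pos);
                int semicolon = sizeLine.IndexOf(';');
                if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);

                int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
                pos = lineEnd + 2;

                // A zero-length chunk marks the end; trailers (if any) are ignored.
                if (size == 0) break;

                if (pos + size > raw.Length) throw new InvalidDataException("Truncated chunk.");
                output.Write(raw, pos, size);
                pos += size + 2; // skip the chunk data and its trailing CRLF
            }
            return output.ToArray();
        }
    }

    // Returns the index of the next "\r\n" at or after start, or -1 if none.
    private static int IndexOfCrLf(byte[] data, int start)
    {
        for (int i = start; i + 1 < data.Length; i++)
            if (data[i] == (byte)'\r' && data[i + 1] == (byte)'\n') return i;
        return -1;
    }
}
```

If the text is still garbled after going through `WebClient`, the cause is more likely compression or a character-set mismatch than the chunk framing.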



------Solution--------------------
Is the response gzip-compressed? Try this:
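// Requires: using System; using System.IO; using System.IO.Compression; using System.Net; using System.Text;
// "url" is the address from the question.
Encoding encoding = Encoding.UTF8;
string result = "";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 30000; // connection timeout, in milliseconds
request.Method = "GET";
request.UserAgent = "Googlebot/2.1 (+http://www.google.com/bot.html)";
// Advertise only gzip, because the code below can only decompress gzip.
request.Headers.Add("Accept-Encoding", "gzip");

using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream streamReceive = response.GetResponseStream())
{
    // Wrap the stream in GZipStream only if the server actually compressed the body;
    // decompressing a plain response would throw.
    bool isGzip = string.Equals(response.ContentEncoding, "gzip", StringComparison.OrdinalIgnoreCase);
    using (Stream body = isGzip ? new GZipStream(streamReceive, CompressionMode.Decompress) : streamReceive)
    using (StreamReader sr = new StreamReader(body, encoding))
        result = sr.ReadToEnd();
}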
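
As an alternative to wrapping the stream by hand, `HttpWebRequest` can decompress the body itself through its `AutomaticDecompression` property. The following is a minimal sketch under that assumption. `PageFetcher`, `CreateRequest` and `Fetch` are hypothetical names, and UTF-8 is assumed for the page. On recent .NET versions `WebRequest` is marked obsolete in favour of `HttpClient`, so this mainly applies to .NET Framework code like the snippet above.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper names; only the HttpWebRequest members are framework APIs.
public static class PageFetcher
{
    // Builds a GET request that lets the framework handle gzip and deflate.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Timeout = 30000; // milliseconds
        // Adds the Accept-Encoding header and decompresses the response stream.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request and reads the (already decompressed) body as UTF-8 text.
    public static string Fetch(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (var reader = new StreamReader(body, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this approach there is no need to add `Accept-Encoding` manually or to check `ContentEncoding`.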
------Solution--------------------
This is the page content I got back:

The beginning:

<!doctype html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><title>c - Google 搜尋</title><style>#gog{background:#fff}#gbar,#guser{font-size:13px;padding-top:1px !important}#gbar{float:left;height:22px}#guser{padding-bottom:7px !important;text-align:right}.gbh,.gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh


The end:

type="text" name="q" maxlength="2048" value="c" title="搜尋"></div></td><td><div class="ds"><div class="lsbb"><input type="submit" name="btnG" class="lsb" value="搜尋"></div></div></td></tr></table><input type="hidden" name="hl" value="zh-TW"></form></div><p id="bfl" class="flc" style="margin:6px 0 0;text-align:center"><a href="/intl/zh-TW/help.html">搜尋說明</a> <a href="/quality_form?q=c&amp;hl=zh-TW&amp;newwindow=1&amp;prmd=ivnsb">請提供您寶貴的意見</a></p></div><div id="fll" class="flc" style="margin:19px auto 19px auto;text-align:center"><a href="/"> Google 首頁 </a> <a href="
/intl/zh-TW/ads/">廣告計劃</a> <a href="/intl/zh-TW/privacy.html">隱私權政策</a> <a href="/intl/zh-TW/about.html">Google 完全手冊</a></div></div></td><td valign="top"></td></tr></table><script src="/extern_js/f/CgV6aC1UVxICaGsrMFo4ACwrMA44ACwrMAo4AEACLCswGDgALIACQpACOQ/9wB50KX8WSo.js"></script><script type="text/javascript">
var form=document.gs;
google.ac.i(form,form.q,'','c','',{l:1,sw:1,d:1,he:'sbhost',vc:-3});</script></body></html>

For reference, this is how I read it:

Read the network resource and return it as a byte array.

The page is encoded in UTF-8.
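
The code for that step is not shown here, so the following is only a rough sketch of the idea: copy the response stream into a byte array, then decode the bytes as UTF-8. `ResponseBytes`, `ReadAll` and `DecodeUtf8` are hypothetical names, and `Stream.CopyTo` assumes .NET 4 or later.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper names; not the method used in the thread.
public static class ResponseBytes
{
    // Reads everything remaining in the stream into a byte array.
    public static byte[] ReadAll(Stream source)
    {
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Decodes the bytes as UTF-8, the encoding the page declares.
    public static string DecodeUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }
}
```

In practice `source` would be the stream returned by `response.GetResponseStream()`. Decoding the whole byte array at once avoids splitting a multi-byte UTF-8 character across two reads.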
------Solution--------------------
Discussion
Question: getBytes(string url,Cooki