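======Encoding problem, I'm desperate, offering points for help!======
I've run into several problems. Please help me solve one of them first; I'll work through the rest myself.
1. The web page I need to fetch is encoded in UTF-8. I want to download it and generate a GB2312-encoded page from it. My Java program runs on a Linux server whose default character set is latin1.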
------Solution--------------------
Assume the variable is (an InputStream) is the input stream obtained by fetching the page.
Do it like this:
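// Requires: import java.io.*;
// Decode the fetched bytes as UTF-8 explicitly, so the server's latin1 default is never used.
// Note: the charset names must not contain trailing spaces ("utf-8 " is rejected).
BufferedReader r = new BufferedReader(new InputStreamReader(is, "UTF-8"));
OutputStream os = new FileOutputStream("gb.html");
// Encode the output as GB2312.
BufferedWriter w = new BufferedWriter(new OutputStreamWriter(os, "GB2312"));
String line;
while ((line = r.readLine()) != null) {
    w.write(line);
    w.newLine();
}
// Closing the outer reader/writer flushes them and also closes the wrapped streams (is, os).
r.close();
w.close();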
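
For reference, the same idea can be written as a self-contained class. This is only a sketch under a few assumptions: the class name Utf8ToGb2312 and the method convert are made up for illustration, the JVM is assumed to ship the GB2312 charset (standard JDK builds normally do), and try-with-resources requires Java 7 or later. It copies characters in blocks rather than line by line, so the original line endings are preserved, and both charsets are passed explicitly so the latin1 platform default never comes into play.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8ToGb2312 {
    // Hypothetical helper: reads UTF-8 bytes from 'in' and writes them to 'out' as GB2312.
    public static void convert(InputStream in, OutputStream out) throws IOException {
        // Assumption: the running JVM provides the GB2312 charset.
        Charset gb2312 = Charset.forName("GB2312");
        // try-with-resources closes the reader and writer (and the wrapped streams)
        // even if an exception is thrown while copying.
        try (Reader r = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
             Writer w = new BufferedWriter(new OutputStreamWriter(out, gb2312))) {
            char[] buf = new char[8192];
            int n;
            // Copy decoded characters in blocks; the writer re-encodes them as GB2312.
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
        }
    }
}
```

It could be called as Utf8ToGb2312.convert(is, new FileOutputStream("gb.html")). Two caveats: characters that GB2312 cannot represent are replaced by the encoder (typically with '?'), and if the page declares its charset in a meta tag, that declaration will still say utf-8 and would presumably need to be updated separately.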