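Problem storing Chinese text as UTF-8: garbled characters
DOM4J produces garbled characters when writing a file, and re-encoding the string does not help. I wrote a test; the code is below.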
import java.io.*;
import org.dom4j.*;
import org.dom4j.io.*;

public class Test {
    public static void main(String[] args) {
        try {
            SAXReader reader = new SAXReader();
            Document document = reader.read("c:/Demo.txt");
            Element root = document.getRootElement();
            String a = "研发部";
            // Attempted re-encoding of the attribute value (this is the line in question)
            Element newElement = root.addElement("Department")
                    .addAttribute("value", new String(a.getBytes(), "UTF-8"));
            // OutputFormat format = new OutputFormat(" ", true, "GBK");
            // Using this format fixes the problem, but the XML is required to be UTF-8
            XMLWriter writer = new XMLWriter(
                    new FileOutputStream(new File("c:/Demo.txt")));
                    // new FileWriter("c:/Demo.txt"));
            // Using FileWriter is not correct: DOM4J does not transcode, and the second write fails with an error
            writer.write(document);
            writer.close();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}
Demo.txt
<?xml version="1.0" encoding="UTF-8"?>
<Company>
</Company>
------Solution--------------------up
------Solution--------------------What is written above is effectively the same as writing nothing.
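I just want to point out that new String(a.getBytes(), "UTF-8") is what causes the garbled text.
a.getBytes() turns the string's Unicode characters into bytes using the platform default charset (for example GBK on a Chinese-locale Windows machine); you then treat those bytes as UTF-8 and convert them back to Unicode. That cannot work unless the default charset happens to be UTF-8. Pass a to addAttribute directly instead.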
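The effect described above can be reproduced without DOM4J. The following is a minimal sketch, not the original poster's code: the class name EncodingRoundTrip and the helper recode are made up for illustration, and GBK is only assumed as the platform default charset (it would be on a typical Chinese-locale Windows machine). The string is written with Unicode escapes so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Hypothetical helper: encode text with one charset, then decode the
    // resulting bytes with another. new String(a.getBytes(), "UTF-8") does
    // the same thing, with the platform default charset as the first one.
    public static String recode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String a = "\u7814\u53d1\u90e8"; // 研发部
        Charset gbk = Charset.forName("GBK"); // assumed platform default

        // GBK bytes decoded as UTF-8: the original text is not recovered.
        String broken = recode(a, gbk, StandardCharsets.UTF_8);
        System.out.println("GBK -> UTF-8 matches original:   " + a.equals(broken));

        // Same charset on both sides: the round trip is lossless but pointless.
        String same = recode(a, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("UTF-8 -> UTF-8 matches original: " + a.equals(same));
    }
}
```

A Java String is already Unicode, so it needs no re-encoding before it is handed to addAttribute; the charset only matters at the point where the document is turned into bytes.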
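Also, <?xml version="1.0" encoding="UTF-8"?> does not itself determine how the XML file is encoded;
it tells other software which encoding to use when reading the file.
So when you work with XML files locally you do not need to worry about it much, as long as the bytes actually written match the declared encoding.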
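To illustrate how a reader uses the declaration, here is a rough sketch based on the JDK's built-in XML parser rather than DOM4J. The class DeclaredEncodingDemo and its methods are hypothetical names, and GBK again stands in for a non-UTF-8 platform charset. It builds the same kind of document as Demo.txt in memory, once with bytes that match the declaration and once with bytes that do not.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DeclaredEncodingDemo {
    // Build XML whose declaration names `declared` but whose bytes are
    // actually produced with the charset `actual`.
    public static byte[] xmlBytes(String declared, Charset actual, String value) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>"
                + "<Company><Department value=\"" + value + "\"/></Company>";
        return xml.getBytes(actual);
    }

    // Parse the bytes and report whether the attribute comes back unchanged.
    // The parser decodes the bytes according to the declaration; a parse
    // failure is reported as false.
    public static boolean roundTrips(String declared, Charset actual, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes(declared, actual, value)));
            Element dept = (Element) doc.getDocumentElement()
                    .getElementsByTagName("Department").item(0);
            return value.equals(dept.getAttribute("value"));
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String a = "\u7814\u53d1\u90e8"; // 研发部
        System.out.println("declared UTF-8, written as UTF-8: "
                + roundTrips("UTF-8", StandardCharsets.UTF_8, a));
        System.out.println("declared UTF-8, written as GBK:   "
                + roundTrips("UTF-8", Charset.forName("GBK"), a));
    }
}
```

This is consistent with the FileWriter remark in the question: FileWriter encodes with the platform default charset, so if that default is not UTF-8 the file no longer matches its own declaration and the next read can fail.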