Date: 2014-05-18  Views: 20989

[Help] Do the bytes have to be swapped before converting a Unicode escape like "\uabcd" to a Chinese character?
C# code

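// Requires: using System.Text; using System.Text.RegularExpressions;
public string Unicode2String(string SourceString)
        {
            if (SourceString.Contains("\\u"))
            {
                string r = "";
                try
                {
                    string[] arr = SourceString.Split('\\');
                    foreach (string one in arr)
                    {
                        // Only a segment that starts with 'u' plus four hex digits is an escape.
                        if (Regex.IsMatch(one, @"^u[0-9a-fA-F]{4}"))
                        {
                            byte[] b = new byte[2];
                            // Hex digits 1-2 are the high byte, digits 3-4 the low byte.
                            b[0] = (byte)int.Parse(one.Substring(1, 2), System.Globalization.NumberStyles.HexNumber);
                            b[1] = (byte)int.Parse(one.Substring(3, 2), System.Globalization.NumberStyles.HexNumber);
                            //MessageBox.Show(b[0].ToString("x2") + " " + b[1].ToString("x2"));
                            //byte[] b2 = new byte[2];
                            //b2 = Encoding.Unicode.GetBytes("消");
                            //MessageBox.Show(b2[0].ToString("x2")+" "+b2[1].ToString("x2"));
                            byte b1 = b[0];
                            byte b2 = b[1];
                            b[0] = b2;
                            b[1] = b1; // The two bytes have to be swapped here, otherwise the result is garbled...
                            r += Encoding.Unicode.GetString(b);
                            r += one.Substring(5); // keep any plain text that follows the escape
                        }
                        else
                        {
                            r += "\\" + one;
                        }
                    }
                    r = r.Substring(1); // drop the extra leading "\"
                    return r;
                }
                catch
                {
                    return "";
                }
            }
            return SourceString; // no "\u" escape in the input, nothing to convert
        }
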
------Solution--------------------
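Most PCs are little-endian, and Encoding.Unicode in .NET is UTF-16 little-endian, so it expects the low byte first. A "\uabcd" escape writes the high byte first (big-endian order), so you have to swap the high and low bytes, or decode with Encoding.BigEndianUnicode instead. (UTF-8 is not involved here, and it has no byte order of its own.)
See these two articles: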
http://www.52rd.com/Blog/Detail_RD.Blog_imjacob_14837.html
http://www.cnblogs.com/TsuiLei/archive/2008/10/29/1322504.html
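
To make the byte-order point concrete, here is a minimal sketch rather than a definitive implementation. The class name `UnicodeEscapeDemo` and the helper `Decode` are made up for illustration and are not from the original post. It prints the bytes that `Encoding.Unicode` and `Encoding.BigEndianUnicode` produce for "消" (U+6D88), then decodes the escapes by casting the parsed hex value straight to `char`, which avoids the byte array (and the swap) altogether.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class UnicodeEscapeDemo
{
    // Hypothetical helper: replaces every \uXXXX escape with the UTF-16 code
    // unit its four hex digits spell out. No byte array is built, so byte
    // order never comes into play.
    public static string Decode(string source)
    {
        return Regex.Replace(
            source,
            @"\\u([0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        // Encoding.Unicode is UTF-16 little-endian: low byte first.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("消")));          // 88-6D
        // Encoding.BigEndianUnicode stores the high byte first, like the escape text.
        Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes("消"))); // 6D-88

        // Bytes in the order "\u6d88" spells them decode correctly without a swap
        // when the big-endian encoding is used.
        byte[] highFirst = { 0x6d, 0x88 };
        Console.WriteLine(Encoding.BigEndianUnicode.GetString(highFirst) == "消");          // True

        // The cast-based helper needs no swap either.
        Console.WriteLine(Decode(@"\u6d88\u606f") == "消息");                               // True
    }
}
```

Because the regex only matches a backslash followed by `u` and four hex digits, other text (including sequences such as `\n`) passes through unchanged, so the split on `\` and the final `Substring(1)` from the question are not needed in this variant.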