A question about the String class's replaceAll method
I want to convert a text file into an XML file.
For example, the text file contains
title:aaa
item:
date:bbb
item:
date:ccc
and I want to turn it into
<title> aaa </title>
<item>
<date> bbb </date>
</item>
<item>
<date> ccc </date>
</item>
How can I do this with replaceAll?
------Solution--------------------Here is mine as well:
One ugly piece of code...
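import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class TestDy {
    public static void main(String[] args) {
        TestDy tf = new TestDy();
        String str = tf.read("E:/test.txt");
        // Wrap the title value; (.+) stops at the end of the line.
        str = str.replaceAll("title:(.+)", "<title> $1 </title>");
        // Wrap each date value and close the enclosing item right after it.
        str = str.replaceAll("date:(.+)", "<date> $1 </date>\n</item>");
        // Turn every "item:" marker into an opening tag.
        str = str.replaceAll("item:", "<item>");
        System.out.println(str);
    }

    // Read the whole file into a String
    public String read(String path) {
        char[] ch = new char[1024];
        StringBuilder cb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (Reader in = new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8)) {
            int len;
            while ((len = in.read(ch)) != -1) {
                cb.append(ch, 0, len);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return cb.toString();
    }
}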
//---------------- Output:
<title> aaa </title>
<item>
<date> bbb </date>
</item>
<item>
<date> ccc </date>
</item>
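One thing worth noting about the code above: because the date replacement also appends </item>, it appears to rely on every item having exactly one date. A minimal sketch of that limitation follows, running the same three replacements on an in-memory string with two dates in a single item (the class and method names here are made up for illustration):

```java
class OneDatePerItemDemo {
    // The same three replacements as above, applied to a string instead of a file.
    // (convert is a hypothetical helper name, not part of the original answer.)
    static String convert(String text) {
        return text
                .replaceAll("title:(.+)", "<title> $1 </title>")
                .replaceAll("date:(.+)", "<date> $1 </date>\n</item>")
                .replaceAll("item:", "<item>");
    }

    public static void main(String[] args) {
        // One item with two dates: each date line gets its own closing </item>.
        System.out.println(convert("item:\ndate:bbb\ndate:ddd"));
    }
}
```

With this input the result contains two </item> closing tags for one <item>, so items with several dates need a different pattern.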
------Solution--------------------Inspired by sharelist()~
public class Test60 {
    public static void main(String[] args) {
        String s = "title:aaa\n" +
                "item:\n" +
                "date:bbb\n" +
                "date:ddd\n" +
                "item:\n" +
                "date:ccc\n" +
                "date:ddd\n" +
                "item:\n" +
                "date:ccc";
        String s1 = s.replaceAll("title:(.+)", "<title> $1 </title>");
        s1 = s1.replaceAll("date:(.+)", "<date> $1 </date>");
        // Lazily take everything after "item:" up to the character before the
        // next "item" (or up to the end of the input) and wrap it in <item> tags.
        s1 = s1.replaceAll("(item):((?s).+?)((?=[^.]\\1)|\\z)", "<$1>$2\n</$1>");
        System.out.println(s1);
    }
}
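Two details of that last pattern are easy to miss: (?s) switches on DOTALL so that '.' can cross line breaks, and inside a character class '.' is a literal dot, so [^.] happily matches a newline. A small sketch to check both (the class name is made up for illustration):

```java
class DotAllDemo {
    public static void main(String[] args) {
        String twoLines = "item:\ndate:bbb";
        // By default '.' does not match a line terminator, so .+ cannot span both lines.
        System.out.println(twoLines.matches("item:.+"));
        // (?s) enables DOTALL from that point on, so '.' also matches '\n'.
        System.out.println(twoLines.matches("item:(?s).+"));
        // Inside [...] the dot is literal: [^.] means "any character except '.'".
        System.out.println("\n".matches("[^.]"));
    }
}
```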
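------Solution--------------------
// The class name is arbitrary; the original post gave only the statements in main.
public class ReplaceAllDemo {
    public static void main(String[] args) {
        String s = "title:aaa\n" +
                "item:\n" +
                "date:aaa\n" +
                "item:\n" +
                "date:bbb\n" +
                "date:bbbBB\n" +
                "date:bbbBBBBB\n" +
                "item:\n" +
                "date:ccc\n" +
                "date:cccCCC\n";
        System.out.println(
                s.replaceAll(
                        // Pass 1: wrap each "name:\n ... " block in <name> ... </name>
                        "(.+):\n((.|\n)+?)((?=(\\1:\n))|\\z)",
                        "<$1>\n$2</$1>\n")
                 .replaceAll(
                        // Pass 2: wrap each "name:value" line (value may be empty)
                        "(.+):((.+)|())\n",
                        "<$1> $2 </$1>\n"));
    }
}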
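------Solution--------------------
"(.+):\n((.|\n)+?)((?=(\\1:\n))|\\z)"
(.+):\n matches an "xxx:" that is immediately followed by a newline, i.e. item:\n; the (.+) part captures the name as group 1.
((.|\n)+?) matches the text that follows, using a non-greedy quantifier (it matches as little as possible).
((?=(\\1:\n))|\\z)
Here \\z matches the end of the input string.
\\1 is a backreference to the first group of this pattern, i.e. the string that (.+) matched, so \\1:\n matches item:\n again.
In the first alternative, ?= is a lookahead: it checks what comes next without including it in the match. In other words, it finds the position of the next item:\n, but that item:\n is not part of the matched text (a bit of a mouthful, I'm not great at explaining this).
<$1>\n$2</$1>\n is the replacement string: group 2 (the body of the block) wrapped in an opening and a closing tag named after group 1.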
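To see what the groups actually capture, here is a rough sketch that runs the same pattern through Pattern and Matcher and collects group 1 and group 2 for every match. The class name, the blocks helper, and the sample input are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BlockMatchDemo {
    // The block pattern explained above, unchanged.
    static final Pattern BLOCK = Pattern.compile("(.+):\n((.|\n)+?)((?=(\\1:\n))|\\z)");

    // Returns "name=body" for every block the pattern finds (hypothetical helper).
    static List<String> blocks(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            // group(1) is the block name, group(2) is the lazily matched body
            found.add(m.group(1) + "=" + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "title:aaa\nitem:\ndate:bbb\nitem:\ndate:ccc\ndate:ddd\n";
        for (String block : blocks(text)) {
            // Show newlines as \n so each block prints on one line
            System.out.println(block.replace("\n", "\\n"));
        }
    }
}
```

The title:aaa line is skipped because its colon is not directly followed by a newline. The first item body stops right before the next item:\n thanks to the lookahead, and the last one runs to the end of the string via \z.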