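Extracting a substring from a string in Java
Could someone give me some pointers? I have converted a JSP page to text. How can I get the URLs I want out of this text?
I need to do it in Java code.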
<ul class="linkpanel panel_15" id="group1">
<li><a href='http://v.youku.com/v_show/id_XNjY0MTk3ODY4.html' _log_title='单恋双城' site='youku' _log_type='2' _log_ct='1'
_log_pos=1 _log_directpos='4' _log_sid="272862" _log_cate="97" target='_blank'>1</a></li>
<li><a href='http://v.youku.com/v_show/id_XNjczOTQxMDYw.html' _log_title='单恋双城' site='youku' _log_type='2' _log_ct='1'
_log_pos=1 _log_directpos='4' _log_sid="272862" _log_cate="97" target='_blank'>22</a></li>
</ul>
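------Solution--------------------Here is an example
// WebPageUtil and WeatherUtil are helper classes that are not shown in this thread; they are not part of the JDK.
String url = "text";
WebPageUtil webPageUtil = new WebPageUtil().processUrl(url);
String content = webPageUtil.getWebContent();
// Drop everything before the marker; indexOf returns -1 when the marker is missing.
int start = content.indexOf("class=\"weatherTopright\"");
if (start >= 0) {
    content = content.substring(start);
}
// getAllMathers presumably returns every substring that matches the regex.
List<String> reportStrList = WeatherUtil.getAllMathers(content, ">[^<>]*</strong>");
for (int i = 0; i < reportStrList.size(); i++) {
    // Strip the surrounding markup so that only the inner text remains.
    String str = reportStrList.get(i).replace("</strong>", "").replace(">", "");
    reportStrList.set(i, str);
}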
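Since `WeatherUtil.getAllMathers` is not shown, here is a minimal sketch of what such a helper might look like, assuming it simply returns every match of the regex. The class name `RegexUtil`, the method name `getAllMatches`, and the sample HTML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the WeatherUtil.getAllMathers helper used above;
// the real helper is not shown in this thread.
final class RegexUtil {
    private RegexUtil() {}

    /** Returns every substring of text that matches regex, in order of appearance. */
    static List<String> getAllMatches(String text, String regex) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample input, only to exercise the helper.
        String html = "<p><strong class=\"a\">Sunny</strong> and <strong>25C</strong></p>";
        for (String m : getAllMatches(html, ">[^<>]*</strong>")) {
            // Strip the surrounding markup, as the loop above does.
            System.out.println(m.replace("</strong>", "").replace(">", ""));
        }
    }
}
```

The helper returns the whole match, including the leading `>` and the trailing `</strong>`, which is why the calling loop has to strip them afterwards.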
------Solution--------------------Use a regular expression. Example:
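// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String regex = "(</thead><tbody><tr>)(.*?)(</tbody>)";
// DOTALL lets '.' match line breaks, so the match can span several lines.
Pattern pa = Pattern.compile(regex, Pattern.DOTALL);
Matcher ma = pa.matcher(htcontent); // htcontent holds the HTML text to search
String ss = null;
while (ma.find()) {
    // Group 2 is the text between the two markers; ss ends up holding the last match.
    ss = ma.group(2);
    //System.out.println(ss);
}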
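This example extracts the content between </thead><tbody><tr> and </tbody> from htcontent. Extracting the URLs works the same way; the details are easy to find with a quick web search.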
------Solution--------------------
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Group 1 captures the text between href=' and the next single quote that is followed by whitespace.
String x = "href='(.*?)'\\s";
Pattern pattern = Pattern.compile(x);
// Each fragment ends with a space so the attributes do not run together when concatenated.
String s = "<ul class=\"linkpanel panel_15\" id=\"group1\"><li><a href='http://v.youku.com/v_show/id_XNjY0MTk3ODY4.html' _log_title='单恋双城' site='youku' _log_type='2' _log_ct='1' ";
s += "_log_pos=1 _log_directpos='4' _log_sid=\"272862\" _log_cate=\"97\" target='_blank'>1</a></li><li><a href='http://v.youku.com/v_show/id_XNjczOTQxMDYw.html' _log_title='单恋双城' site='youku' _log_type='2' _log_ct='1' ";
s += "_log_pos=1 _log_directpos='4' _log_sid=\"272862\" _log_cate=\"97\" target='_blank'>22</a></li></ul>";
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    // Prints one URL per line.
    System.out.println(matcher.group(1));
}
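The pattern above expects single-quoted href values followed by whitespace. As a slightly more tolerant sketch, the version below accepts either quote style and collects the URLs into a list. The class name `HrefExtractor` and the method name `extractHrefs` are made up for illustration, and this is still plain regex matching, not a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper (hypothetical name): pulls href values out of HTML text.
final class HrefExtractor {
    // Group 1 is the opening quote (' or "); \1 requires the same quote to close.
    // Group 2 is the attribute value between the quotes.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*(['\"])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    private HrefExtractor() {}

    /** Returns every quoted href value in html, in order of appearance. */
    static List<String> extractHrefs(String html) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            urls.add(matcher.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<li><a href='http://v.youku.com/v_show/id_XNjY0MTk3ODY4.html' site='youku'>1</a></li>"
                + "<li><a href=\"http://v.youku.com/v_show/id_XNjczOTQxMDYw.html\" target='_blank'>22</a></li>";
        for (String url : extractHrefs(html)) {
            System.out.println(url);
        }
    }
}
```

Unquoted href values are not handled by this pattern. For arbitrary or malformed pages an HTML parser is usually the safer choice, but for a fixed snippet like the one in the question a regex is enough.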