PHP scraper: handling fetched pages that use different encodings
Date: 2012-05-15
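<?php
// Return the text between the first occurrence of $start and the next occurrence of $end.
function get_sub_content($str, $start, $end) {
    if ($start == '' || $end == '') {
        return "Page elements have changed!";
    }
    $parts = explode($start, $str, 2);
    if (count($parts) < 2) {
        // $start was not found in $str, so there is nothing to extract.
        return '';
    }
    $parts = explode($end, $parts[1], 2);
    return $parts[0];
}

// Detect the encoding of $data from a list of candidates and convert it to $to.
function my_encoding($data, $to) {
    $encode_arr = array('UTF-8','ASCII','GBK','GB2312','BIG5','JIS','eucjp-win','sjis-win','EUC-JP');
    $encoded = mb_detect_encoding($data, $encode_arr);
    if ($encoded === false) {
        // No candidate matched; return the data unchanged.
        return $data;
    }
    $data = mb_convert_encoding($data, $to, $encoded);
    return $data;
}

// Fetch the search results page.
$doc = file_get_contents("http://video.baidu.com/v?ct=0&word=周杰伦%20site%3Awww%2Etudou%2Ecom&db=0&ty=0&rn=20&pn=0&fbl=1024");
if ($doc === false) {
    exit("Failed to fetch the page.");
}
// Convert the page to UTF-8, whatever encoding it was served in.
$doc = my_encoding($doc, "utf-8");
// Keep only the result list, then split it into one chunk per video.
$doc = get_sub_content($doc, "<div id=\"result\">", "<br clear=");
$str_replace = explode("<div class=x>", $doc);
// Emit the results as XML. Index 0 holds the text before the first entry, so start at 1.
echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
echo "<data>";
for ($i = 1; $i < count($str_replace); $i++) {
    echo "<video>";
    echo "<name>";
    echo "<![CDATA[".get_sub_content($str_replace[$i], "title=\"", "\"")."]]>";
    echo "</name>";
    echo "<pageurl>";
    echo "<![CDATA[".get_sub_content($str_replace[$i], "<a href=\"", "\" onmousedown=")."]]>";
    echo "</pageurl>";
    echo "</video>";
}
echo "</data>";
?>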
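
The encoding step in the script above can be tried without any network request. The following is a minimal sketch, not the author's code: `to_utf8` is a hypothetical helper name, the candidate list is deliberately narrowed to `UTF-8` and `GBK` (an assumption made so the detection result is unambiguous), and the "GBK page" is simulated by converting a UTF-8 literal. It passes `true` as the third argument of `mb_detect_encoding` (strict mode), which the original script does not do.

```php
<?php
// Hypothetical helper for illustration; it is not part of the original script.
// Detects the encoding of $data among $candidates and converts it to UTF-8.
function to_utf8(string $data, array $candidates = ['UTF-8', 'GBK']): string
{
    // Strict mode rejects any candidate for which $data contains
    // byte sequences that are invalid in that encoding.
    $detected = mb_detect_encoding($data, $candidates, true);
    if ($detected === false) {
        // No candidate matched; leave the bytes untouched.
        return $data;
    }
    return mb_convert_encoding($data, 'UTF-8', $detected);
}

// Simulate a page served as GBK by converting a UTF-8 literal first.
$utf8 = '周杰伦';
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

// GBK input should come back as the original UTF-8 text,
// and input that is already UTF-8 should pass through unchanged.
var_dump(to_utf8($gbk) === $utf8);
var_dump(to_utf8($utf8) === $utf8);
```

The order of the candidate list matters because short byte sequences can be valid in more than one encoding, so longer candidate lists such as the one in `my_encoding` may need tuning for the pages being fetched.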