Looking for PHP code to scrape titles and hyperlinks - PHP Tutorials - 爱易网页


Date: 2014-05-17  Views: 20510

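I'm looking for PHP code that scrapes titles and hyperlinks.
For example, from http://xcb.nuist.edu.cn/e/wap/list.php?classid=6&style=0&bclassid=1
I want the "title" + "time" + "hyperlink" of each news item on the page.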

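I'd be very grateful. I'm hoping for something that works as-is: I couldn't get the examples I found online to work, and I have no PHP background, so please bear with me.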

If anyone wants to do this with a regular expression,
<li><a[^>].+>(.+)<span>(.+)</span></a></li>
hopefully this is of some help.

Tags: PHP, regular expressions, data scraping

------Solution--------------------

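// Fetch the list page; file_get_contents() returns false on failure.
$s = file_get_contents('http://xcb.nuist.edu.cn/e/wap/list.php?classid=6&style=0&bclassid=1');
if ($s === false) die('Could not fetch the page');

// Capture groups: 1 = href, 2 = title, 3 = text inside <span> (presumably the time).
// Modifiers: i = case-insensitive, s = dot also matches newlines, U = ungreedy quantifiers.
preg_match_all('/<li><a\s+href="(.+)"[^>]*>(.+)<span>(.+)<\/span><\/a><\/li>/isU', $s, $m);

print_r($m);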
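
To see what the pattern above captures without hitting the network, here is a minimal sketch that runs the same expression against a hard-coded sample. The sample markup, the `$rows` array, and its keys (`link`, `title`, `time`) are assumptions made for illustration; the real page's HTML may differ, in which case the pattern needs adjusting.

```php
<?php
// Hypothetical sample markup, modeled on the <li><a ...>title<span>time</span></a></li>
// shape given in the question. It is NOT copied from the real page.
$html = '<ul>'
      . '<li><a href="show.php?id=1">First headline<span>2014-05-16</span></a></li>'
      . '<li><a href="show.php?id=2">Second headline<span>2014-05-17</span></a></li>'
      . '</ul>';

// PREG_SET_ORDER makes each element of $matches one complete match:
// [0] = whole match, [1] = href, [2] = title, [3] = span text.
preg_match_all(
    '/<li><a\s+href="(.+)"[^>]*>(.+)<span>(.+)<\/span><\/a><\/li>/isU',
    $html,
    $matches,
    PREG_SET_ORDER
);

// Reshape the captures into one associative array per news item.
$rows = array();
foreach ($matches as $m) {
    $rows[] = array(
        'link'  => $m[1],
        'title' => trim(strip_tags($m[2])),
        'time'  => trim($m[3]),
    );
}

print_r($rows);
```

With `PREG_SET_ORDER` each news item arrives as one row, which is usually easier to loop over than the default layout, where `$m[1]`, `$m[2]` and `$m[3]` are three parallel arrays.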

------Solution--------------------
// Scans a page and collects every new "http://" link with its <a> element and title.
// Note: func_ReadPage(), func_ToUtf8() and func_WebFillup() are the poster's own
// helpers and are not shown in this thread.
function func_globalscanlink($strUrl, &$arrAhef, &$arrLink, &$arrTitle, &$strLinkAll)
{
    $strText = func_ToUtf8(func_ReadPage($strUrl));
    $strText = func_WebFillup($strUrl, $strText);
    // Groups: 1 = whole <a> element, 2 = href value, 3 = inner HTML of the link.
    if (!preg_match_all("/(<a[^<>]*href[ ]*=[ ]*\"([^<>]*?)\"[^<>]*>(.*?)<\/a>)/si", $strText, $arr2A_mat))
        return 0;

    $strLinkAllTem = "";
    for ($i = 0; $i < count($arr2A_mat[0]); $i++)
    {
        $strLinkTem = $arr2A_mat[2][$i];
        if (strlen($strLinkTem) < 10)
            continue;
        // Keep only links not seen before that contain "http://".
        if (strpos($strLinkAllTem, $strLinkTem) === false && strpos($strLinkTem, "http://") !== false)
        {
            // Strip tags from the link text and skip very short titles.
            $strTitleTem = $arr2A_mat[3][$i];
            $strTitleTem = preg_replace("/<.*?.>/si", "", $strTitleTem);
            if (strlen($strTitleTem) > 6)
            {
                $arrAhef[] = $arr2A_mat[1][$i];
                $arrLink[] = $strLinkTem;

                // Prefer a TITLE="..." attribute found in the inner HTML, if any.
                $strTitle = $arr2A_mat[3][$i];
                if (preg_match("/TITLE=\"(.*?)\"/si", $strTitle, $arrTitle_mat))
                    $strTitle = $arrTitle_mat[1];
                $arrTitle[] = $strTitle;

                $strLinkAll = $strLinkAll.$strLinkTem."\r\n";
                $strLinkAllTem = $strLinkAllTem.$arr2A_mat[2][$i]."\r\n";
            }
        }
    }

    return $strText;
}

// Initialize the output variables before passing them by reference.
$arrAhef = $arrLink = $arrTitle = array();
$strLinkAll = "";
func_globalscanlink("http://www.baidu.com/", $arrAhef, $arrLink, $arrTitle, $strLinkAll);

This function scans out all the links and titles on a page.
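
For comparison, a similar link scan can be sketched with PHP's built-in DOM extension instead of a regular expression. This is a different technique from the function above, not the poster's method. The sample page and the `$links` array are assumptions for illustration, and unlike the function above this sketch only accepts hrefs that start with `http://` rather than merely containing it.

```php
<?php
// Hypothetical sample page; not taken from the thread.
$html = '<html><body>'
      . '<a href="http://example.com/news/1.html">First sample headline</a>'
      . '<a href="/relative/path.html">Relative link, skipped</a>'
      . '<a href="http://example.com/news/1.html">Duplicate, skipped</a>'
      . '</body></html>';

// Real-world HTML is often malformed; collect parser warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Map each absolute link to its visible text, keeping the first occurrence.
$links = array();
foreach ($doc->getElementsByTagName('a') as $a) {
    $href = $a->getAttribute('href');
    if (strpos($href, 'http://') !== 0 || isset($links[$href])) {
        continue; // not an absolute http link, or already collected
    }
    $links[$href] = trim($a->textContent);
}

print_r($links);
```

A DOM parser tends to cope better with attribute order, single quotes, and nested tags than a hand-written pattern, at the cost of requiring the DOM extension to be enabled.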
