Date: 2014-05-18

Question: how to scrape data from a page
The URL of the page I want to scrape is:
http://zoldata.finet.cn/h_stock_data/h_stock.php?code=3838&stock_name=中国淀粉&stock_en_name=CHINA%20STARCH&new_code=03838

If you would rather not open the URL, here is the page source:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Cache-Control" content="no-store"/>
<meta http-equiv="Pragma" content="no-cache"/>
<meta http-equiv="Expires" content="0"/>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title></title>

<link href="/css/stock.css" rel="stylesheet" type="text/css" />
<script language="javascript" src="/js/stock_info.js"></script>
<script language="javascript" type="text/javascript">
var stock_code="3838";
var stock_type="hk";
var stock_name="中国淀粉";
var stock_type = '';
$(document).ready(function() {
refresh_s('3838');
window.setInterval("refresh_s('3838')",30000);
});
//show_user_search();
</script>
<script language="javascript" src="/js/stock_function.js"></script>
<script language="javascript" src="/js/commajax.js"></script>
</head>

<div class="Bm_1_1">
<div class="f20" id="chinese_name">中国淀粉&nbsp;</div>
<div id="english_name">03838.hk (CHINA STARCH)</div>
<div class="r" id="stock_time"></div> 
</div>
<div class="Bm_1_2 lv12"><b class="f20" id="stock_last">读取中...</b><br />
<span id="stock_change">0.000</span> (<span id="stock_changerate">0.000</span>)</div>
<ul class="Bm_1_3">
<li>昨收盘:<span id="previous_close">0.000</span></li>
<li>今开盘:<span id="today_open" class="lv12">0.000</span></li>
<li>最高价:<span id="stock_high">0.000</span></li>
<li>最低价:<span id="stock_low">0.000</span></li>
<li>成交额:<span id="stock_turnover">0</span></li>
<li>成交量:<span id="stock_volume">0</span></li>
<li>买入价:<span id="stock_bid">0.000</span></li>
<li>卖出价:<span id="stock_ask" class="ho12">0.000</span> </li>
<li>市盈率:<span id="stock_pe">0.000</span></li>
<li>收益率:<span id="stock_yield">0.000</span></li>
<li>52周最高:<span id="high_52">0.000</span></li>
<li>52周最低:<span id="low_52">0.000</span></li>
</ul>

</body>
</html>

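I want to scrape the information inside the <li></li> elements, from 昨收盘 (previous close) through 52周最低 (52-week low), and show that data in a module on my own site. How should I write the code?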
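
For reference, the extraction step on its own can be sketched as below. This is only a minimal sketch, assuming the <li> items keep exactly the shape shown in the pasted source; the names QuoteParser and Parse are made up for illustration. Note that the values in the pasted source (0.000, 读取中...) look like placeholders that refresh_s('3838') fills in after the page loads. That script is not shown here, so parsing the downloaded HTML may return only those placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
static class QuoteParser
{
    // Matches one <li>label<span id="..." ...>value</span></li> item.
    // Assumes the markup looks exactly like the pasted page source.
    static readonly Regex ItemPattern = new Regex(
        @"<li>[^<]*<span\s+id=""(?<id>[^""]+)""[^>]*>(?<value>[^<]*)</span>\s*</li>",
        RegexOptions.IgnoreCase);

    // Returns a map from span id to its text, e.g. "previous_close" -> "0.000".
    public static Dictionary<string, string> Parse(string html)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in ItemPattern.Matches(html))
        {
            result[m.Groups["id"].Value] = m.Groups["value"].Value.Trim();
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        // Three items copied from the pasted source, used as sample input.
        string html =
            "<li>昨收盘:<span id=\"previous_close\">0.000</span></li>\n" +
            "<li>今开盘:<span id=\"today_open\" class=\"lv12\">0.000</span></li>\n" +
            "<li>卖出价:<span id=\"stock_ask\" class=\"ho12\">0.000</span> </li>";

        foreach (var pair in QuoteParser.Parse(html))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Keying the result by span id rather than by the Chinese label keeps the lookup independent of the page's gb2312 encoding, although the HTML still has to be decoded correctly before it is passed to Parse.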

------Solution--------------------
C# code

        //using System.Text.RegularExpressions;
        //using System.Net;
        string url = "http://zoldata.finet.cn/h_stock_data/h_stock.php?code=3838&stock_name=中国淀粉&stock_en_name=CHINA%20STARCH&new_code=03838";
        WebClient webClient = new WebClient();
        byte[] b =