Date: 2014-05-17

How do I strip the JavaScript out of scraped news data?
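Function NoHtml(str)
    Dim re
    str = (str)
    Set re = New RegExp
    re.IgnoreCase = True
    re.Global = True
    ' Remove anything that looks like a tag: "<", then everything up to
    ' the last ">" that comes before the next "<"
    re.Pattern = "(\<.[^\<]*\>)"
    str = re.Replace(str, "")
    ' Remove any closing tags that are still left
    re.Pattern = "(\<\/[^\<]*\>)"
    str = re.Replace(str, "")
    NoHtml = str
    Set re = Nothing
End Function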

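This only strips the HTML tags themselves.

Input:
<script> alert("I want to remove this"); </script>
Result:
alert("I want to remove this"); is still there.

There is also an encoding problem: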
Function getHTTPPage(url)
    Dim Http
    Set Http = Server.CreateObject("MSXML2.XMLHTTP")
    Http.open "GET", url, False
    Http.send()
    If Http.readyState <> 4 Then Exit Function
    ' The charset is hard-coded, so only gb2312 pages decode correctly
    getHTTPPage = bytesToBSTR(Http.responseBody, "gb2312")
    Set Http = Nothing
    If Err.Number <> 0 Then Err.Clear
End Function
This only works for pages encoded as gb2312; UTF-8 pages come out garbled.
How can I fix this?
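
One possible direction for the encoding problem, sketched in JavaScript rather than VBScript: read the charset from the response instead of hard-coding "gb2312". The names detectCharset and decodePage are made up for this illustration, and the fallback to "gb2312" is an assumption taken from the code above. In the ASP version the detected label would presumably be passed to bytesToBSTR in place of the literal. Which labels TextDecoder accepts depends on the runtime.

```javascript
// Hypothetical helper: pick a charset label for a fetched page.
function detectCharset(contentType, bytes) {
  // 1. charset parameter of the Content-Type response header, if present
  const fromHeader = /charset\s*=\s*["']?([\w-]+)/i.exec(contentType || "");
  if (fromHeader) return fromHeader[1].toLowerCase();

  // 2. <meta ... charset=...> near the top of the page. Each byte is mapped
  //    to one character, which is enough to read the ASCII markup.
  const head = String.fromCharCode(...bytes.subarray(0, 1024));
  const fromMeta = /<meta[^>]*charset\s*=\s*["']?([\w-]+)/i.exec(head);
  if (fromMeta) return fromMeta[1].toLowerCase();

  // 3. Assumed default, matching the hard-coded value in the question
  return "gb2312";
}

// Hypothetical helper: decode the raw bytes with the detected charset.
function decodePage(bytes, contentType) {
  return new TextDecoder(detectCharset(contentType, bytes)).decode(bytes);
}
```

The header is checked first because it is available before any of the body has to be inspected; the meta tag is only a fallback for servers that send no charset.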

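------Solution--------------------
VBScript should have some string functions you can use for this (I'm not entirely sure):

Treat the whole HTML text as one string. Use the string functions to find the position of the first "<script>" substring (after converting to lower or upper case), then find the position of "</script>", and delete the substring between those two positions. Then save the cleaned string as an HTML file or whatever file format you want (fileObject can probably do that; I'm not sure about that either).

^_^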
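
A minimal sketch of the position-search idea above, written in JavaScript since that is what the next reply uses. The function names are invented for this example. It is not an HTML parser: it simply looks for "<script" and "</script>" case-insensitively and cuts out everything in between, repeating until no more blocks are found.

```javascript
// Hypothetical helper: case-insensitive indexOf that never changes string length.
function indexOfIgnoreCase(text, needle, from) {
  const target = needle.toLowerCase();
  for (let i = from; i + needle.length <= text.length; i++) {
    if (text.slice(i, i + needle.length).toLowerCase() === target) return i;
  }
  return -1;
}

// Hypothetical helper: drop every <script ...>...</script> block, tags included.
function stripScriptBlocks(html) {
  const openTag = "<script";
  const closeTag = "</script>";
  let result = "";
  let pos = 0; // start of the text that has not been copied yet
  while (true) {
    const start = indexOfIgnoreCase(html, openTag, pos);
    if (start === -1) break;
    const end = indexOfIgnoreCase(html, closeTag, start);
    if (end === -1) break; // unterminated block: leave the rest untouched
    result += html.slice(pos, start); // keep the text before the block
    pos = end + closeTag.length;      // continue after the closing tag
  }
  return result + html.slice(pos);
}
```

Searching for "<script" without the closing ">" lets the sketch also catch tags that carry attributes, at the cost of matching any other tag name that happens to start with "script".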
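------Solution--------------------
<textarea id=textarea1>
wwwww <script> alert("I want to remove this");alert("I want to remove this");alert("I want to remove this"); </script>

This one can't be removed either..
</textarea>

<script>
var str = document.getElementById("textarea1").value;
// Non-greedy match of each script block; case-insensitive, attributes allowed
var re = /<script\b[^>]*>[\s\S]*?<\/script\s*>/gi;
str = str.replace(re, "");
alert(str);
</script>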
------Solution--------------------
Function RemoveHTML(strText)
Dim TAGLIST
TAGLIST = ";!--;!DOCTYPE;A;ACRONYM;ADDRESS;APPLET;AREA;B;BASE;BASEFONT;" & _
"BGSOUND;BIG;BLOCKQUOTE;BODY;BR;BUTTON;CAPTION;CENTER;CITE;CODE;" & _
"COL;COLGROUP;COMMENT;DD;DEL;DFN;DIR;DIV;DL;DT;EM;EMBED;FIELDSET;" & _
"FONT;FORM;FRAME;FRAMESET;HEAD;H1;H2;H3;H4;H5;H6;HR;HTML;I;IFRAME;IMG;" & _
"INPUT;INS;ISINDEX;KBD;LABEL;LAYER;LEGEND;LI;LINK;LISTING;MAP;MARQUEE;" & _
"MENU;META;NOBR;NOFRAMES;NOSCRIPT;OBJECT;OL;OPTION;P;PARAM;PLAINTEXT;" & _
"PRE;Q;S;SAMP;SCRIPT;SELECT;SMALL;SPAN;STRIKE;STRONG;STYLE;SUB;SUP;" & _
"TABLE;TBODY;TD;TEXTAREA;TFOOT;TH;THEAD;TITLE;TR;TT;U;UL;VAR;WBR;XMP;"
Const BLOCKTAGLIST = ";APPLET;EMBED;FRAMESET;HEAD;NOFRAMES;NOSCRIPT;OBJECT;SCRIPT;STYLE;"
Dim nPos1
Dim nPos2
Dim nPos3
Dim strResult
Dim strTagName
Dim bRemove
Dim bSearchForBlock
nPos1 = InStr(strText, "<")
Do While nPos1 > 0
nPos2 = InStr(nPos1 + 1, strText, ">")
If nPos2 > 0 Then
strTagName = Mid(strText, nPos1 + 1, nPos2 - nPos1 - 1)
strTagName = Replace(Replace(strTagName, vbCr, " "), vbLf, " ")
nPos3 = InStr(strTagName, " ")
If nPos3 > 0 The