Dirty-Word Dictionary Filtering: Using Regular Expressions to Filter Dirty Data - ASP.NET Tutorials - 爱易网页

Date: 2010-11-16

Method 1: Using a regular expression

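using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

public class ValidDirty
{
        // Name of the file that holds the dirty-word dictionary
        private const string FILE_NAME = "zang.txt";
        // Dirty-word dictionary as a regex alternation, e.g. word1|word2|word3
        public static string dirtyStr = "";

        public ValidDirty()
        {
            GetRegex();
        }

        // Returns the cached regex and rebuilds it whenever the cache entry is missing or expired
        private Regex GetRegex()
        {
            Regex validateReg = (Regex)HttpRuntime.Cache["Regex"];
            if (validateReg == null)
            {
                dirtyStr = ReadDic();
                // Regex used to check against the dictionary: it matches only when no entry starts
                // at any position of the input. A lookahead at every position is sufficient, because
                // every occurrence of an entry has to start somewhere. Singleline lets '.' match
                // line breaks. An empty dictionary falls back to a pattern that accepts any input.
                string pattern = dirtyStr.Length == 0 ? "^" : "^((?!" + dirtyStr + ").)*$";
                validateReg = new Regex(pattern, RegexOptions.Compiled | RegexOptions.ExplicitCapture | RegexOptions.Singleline);
                HttpRuntime.Cache.Insert("Regex", validateReg, null, DateTime.Now.AddMinutes(20), TimeSpan.Zero);
            }
            return validateReg;
        }

        private string ReadDic()
        {
            // The full path lives in a local variable, so repeated calls resolve the same file
            string path = Path.Combine(Environment.CurrentDirectory, FILE_NAME);

            if (!File.Exists(path))
            {
                Console.WriteLine("{0} does not exist.", path);
                return "";
            }
            // The file is expected to hold the entries already separated by '|';
            // its lines are concatenated as they are
            StringBuilder input = new StringBuilder();
            using (StreamReader sr = File.OpenText(path))
            {
                while (sr.Peek() > -1)
                {
                    input.Append(sr.ReadLine());
                }
            }
            return input.ToString();
        }

        // Returns true when str contains no dirty word
        public bool ValidByReg(string str)
        {
            return GetRegex().IsMatch(str);
        }
}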
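
Because dirtyStr is concatenated straight into the pattern, an entry that contains a regex metacharacter such as "." or "(" would change what the pattern means. Below is a minimal sketch of building the alternation from a word list with Regex.Escape. The helper class DirtyPattern and its Build method are hypothetical names introduced here for illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original listing.
public static class DirtyPattern
{
    // Builds a regex that matches only text containing none of the given words.
    public static Regex Build(IEnumerable<string> words)
    {
        // Escape each entry so characters like '.' or '(' are matched literally,
        // and skip blank entries, which would otherwise match at every position.
        string[] escaped = words
            .Where(w => !string.IsNullOrWhiteSpace(w))
            .Select(w => Regex.Escape(w.Trim()))
            .ToArray();

        // With no entries there is nothing to reject, so accept any input.
        if (escaped.Length == 0)
        {
            return new Regex("^", RegexOptions.Singleline);
        }

        string alternation = string.Join("|", escaped);
        return new Regex("^((?!" + alternation + ").)*$",
                         RegexOptions.Singleline | RegexOptions.ExplicitCapture);
    }
}
```

One possible use would be DirtyPattern.Build(dirtyStr.Split('|')), assuming the dictionary file stores plain words rather than hand-written regex fragments.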

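This approach does not seem very efficient. In a quick test on a 1,000-character article against a dictionary of more than 800 keywords,
it took 1.238 seconds. If anyone knows a better method, please share it!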
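
For anyone who wants to take a comparable measurement, here is a rough sketch based on System.Diagnostics.Stopwatch. The class name TimingSketch, the 800 synthetic entries, and the 1,000-character sample text are all made-up placeholders rather than the author's test data, and timings will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical timing harness for illustration only.
public static class TimingSketch
{
    // Times a single IsMatch call and returns the elapsed milliseconds.
    public static long TimeMatch(Regex regex, string text, out bool clean)
    {
        Stopwatch sw = Stopwatch.StartNew();
        clean = regex.IsMatch(text);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Run()
    {
        // Placeholder dictionary of 800 synthetic entries: word0|word1|...|word799
        string dirty = string.Join("|", Enumerable.Range(0, 800).Select(i => "word" + i));
        Regex regex = new Regex("^((?!" + dirty + ").)*$",
                                RegexOptions.Compiled | RegexOptions.Singleline);

        // Placeholder article of 1,000 characters that contains no entry
        string text = new string('x', 1000);

        bool clean;
        long ms = TimeMatch(regex, text, out clean);
        Console.WriteLine("clean={0}, elapsed={1} ms", clean, ms);
    }
}
```

The first call to a regex created with RegexOptions.Compiled may include one-time setup cost, so timing a second call separately probably gives a fairer picture of the steady-state cost.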

Method 2: Plain loop search

public bool ValidGeneral(string str)
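
Only the signature of ValidGeneral is available here. As a hedged sketch of what a plain loop search could look like, the version below checks each dictionary entry with String.IndexOf. The class name LoopFilterSketch and the extra dirtyWords parameter are assumptions made for illustration, so this is not necessarily how the author wrote the method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the author's actual method body is not shown in the article.
public static class LoopFilterSketch
{
    // Returns true when str contains none of the dirty words.
    public static bool ValidGeneral(string str, IEnumerable<string> dirtyWords)
    {
        foreach (string word in dirtyWords)
        {
            // Skip empty entries: IndexOf("") returns 0 and would flag every input.
            if (string.IsNullOrEmpty(word))
            {
                continue;
            }
            // Ordinal comparison does a plain character-by-character search.
            if (str.IndexOf(word, StringComparison.Ordinal) >= 0)
            {
                return false;
            }
        }
        return true;
    }
}
```

With the dictionary format used in Method 1, a call might look like LoopFilterSketch.ValidGeneral(text, dirtyStr.Split('|')).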
