日期:2014-05-18  浏览次数:20572 次

分享 SQL SERVER 中运用完整的正则表达式——RegexWorkbench2000 之一 test(or match)
很久以前,在vb里用正则regexp组件,处理数据很爽快,后来在VBA里也是,vbscript regular expression 5.5
后来我就想知道数据库里是不是也支持,于是我找到了下面的正则工具,是一群强大的外国sql server 程序员开发的
其强大之处,用过正则的人会明白,支持test,replace,特别是支持execute!!! 在程序中,execute是返回一个对象,他有各种属性,方法,子成员。而在数据库中,这些是放在了相互关联的表里,太有才了!!

不过正如作者所说,CLR会更快一些,而且说实在的,正则更趋于对行数据进行处理,所以我在用的时候觉得挺慢的,虽然其功能很绚丽。sql server 里没有vi来辅助工作,用用正则算是一个弥补吧。

以下代码在sql server 2000中已测试,其中数据类型有需要修改的 2000中不支持nvarchar(max),把这个一改就可以运行了

在此向国外的同行学习,致敬!

字数限制,按照实现的功能分开贴了,分别是:test方法,replace方法,和最强大的execute方法
SQL code

/*
This Workbench is about using Regular expressions with SQL Server via TSQL. It doesn't even attempt to teach how regular expressions work or how to pull them together. The aim is to demonstrate a few possibilities and try to persuade you to experiment with them if you don't already use Regex with SQL Server.

We suggest that, if you are an ordinary mortal without special powers like Phil and I, you should  use an application such as http://www.regexbuddy.com/ RegexBuddy to form, edit and interpret Regular expressions, as it makes learning them a lot easier. In order that people with access only to SQL Server 2000 can use the workbench, we'll use OLE in the examples, but they are readily adapted to CLR 

Contents
---------

Introduction
The OLE Functions
    The OLE Regex  Match function
    The OLE Regex  Replace function
    The OLE Regex Find (Execute) function
Combining two Regexs
Regex Performance


Regular Expressions can be very useful to the Database programmer, particularly for data validation, data feeds and data transformations.

Regular Expressions are not regular in the sense that there is any common dialect of expression that is understood by all Regex engines. On the contrary, regular expresssions aren't always portable and there are many common, similar but incompatible, dialects in use, such as Perl 5.8, Java.util.regex, .NET, PHP, Python, Ruby, ECMA Javascript, PCRE, Apache, vi, Shell tools TCL ARE, POSIX BRE, Funduc and JGsoft. 

Regular Expressions were never developed to be easy to understand. They are a condensed shorthand that, on preliminary inspection, looks as if someone has repeatedly sat on the keyboard. Even when interpreted, the logic isn't always easy to follow. If you don't agree, then explain this one! http://aspn.activestate.com/ASPN/Cookbook/Rx/Recipe/59864

See http://www.simple-talk.com/dotnet/.net-framework/implementing-real-world-data-input-validation-using-regular-expressions/ for an introduction to regular expressions

A great deal can be done using commandline applications that work with regular expressions such as GREP and AWK. However, there are times where it is handy to use Regex directly from TSQL. There are two Regex engines available to SQL Server. These are 
the .NET Regex which is in the system.text.regularexpression module
The ECMA Regex from VBScript.RegExp which is distributed with the IE browser and is used by Javascript and JScript.

Both of these are excellent standard implementations. Both work well in TSQL. 

The .NET Regex requires the creation of CLR functions to provide regular expressions, and works only with SQL Server 2005, (and 2007) http://www.sqlservercentral.com/articles/Development/clrintegration/1967/
The ECMA Regex can be used via VBScript.RegExp, which are available to SQL Server 2000 as well. The regex is compatible with Javascript.

The advantage of using CLR is that the regular expressions of the NET framework are very good, and performance is excellent. However, the techniques are well-known, whereas some of the more powerful uses of VBScript.RegExp have hardly ever been published, so this workbench will concentrate on the latter

The OLE functions
------------------
There are various properties to consider in these functi