How to use regular expressions in SQL Server?

抹茶落季 2020-12-10 07:03

Is it possible to write efficient queries that use a regular expression feature set? I have data in my table that is not in the correct format, e.g. in the Title column: Cable 180â

2 answers
  • 2020-12-10 07:45

    You need to make use of the following functions, usually in some combination of the three:

    1. patindex
    2. charindex
    3. substring

    In response to your comment above, patindex should not return 0 when the pattern is found. patindex returns the start position of the specified pattern, so if it finds a match, it returns an integer > 0.
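
    As a rough sketch of how the three functions combine, the following pulls the first run of digits out of a string. The sample value and variable names are made up for illustration, and the patterns are LIKE-style wildcards, not true regular expressions:

    ```sql
    DECLARE @title nvarchar(100);
    SET @title = N'Cable 180+e10 to 120+e3';   -- illustrative sample value

    -- PATINDEX returns the 1-based position of the first match, or 0 if there is none.
    SELECT PATINDEX('%[0-9]%', @title)   AS first_digit_pos;  -- 7
    SELECT PATINDEX('%[0-9]%', N'Cable') AS no_digit;         -- 0

    -- CHARINDEX looks for a literal substring rather than a wildcard pattern.
    SELECT CHARINDEX('+e', @title) AS plus_e_pos;             -- 10

    -- Cut the string at the first digit, then keep everything up to the first non-digit.
    -- Appending N'x' guarantees a non-digit exists even when the digits run to the end.
    DECLARE @rest nvarchar(100);
    SET @rest = SUBSTRING(@title, PATINDEX('%[0-9]%', @title), LEN(@title));
    SELECT SUBSTRING(@rest, 1, PATINDEX('%[^0-9]%', @rest + N'x') - 1) AS first_number;  -- 180
    ```

    This sketch assumes the string contains at least one digit; check that patindex returned a value > 0 before using it as a start position.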

    EDIT:

    Also, len(string) and reverse(string) come in handy on specific occasions.
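
    One case where reverse and len help is finding the last occurrence of a character, since charindex only searches from the left. A minimal sketch, again with a made-up sample value:

    ```sql
    DECLARE @title nvarchar(100);
    SET @title = N'Cable 180+e10 to 120+e3';   -- illustrative sample value

    -- The first space in the reversed string corresponds to the last space in the original.
    SELECT LEN(@title) - CHARINDEX(' ', REVERSE(@title)) + 1 AS last_space_pos;  -- 17

    -- Everything after the last space, i.e. the final token.
    SELECT RIGHT(@title, CHARINDEX(' ', REVERSE(@title)) - 1) AS last_token;     -- 120+e3
    ```

    Note that len ignores trailing spaces, so trim or account for them if the data may be padded.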

  • 2020-12-10 07:46

    With a CLR .NET project published to SQL Server, it is EXTREMELY efficient. Over the past 2 years of using a VB.Net CLR project with our SQL Server 2005, I have found that every T-SQL scalar function I replaced with a .NET version showed dramatically improved performance. I have used it for advanced date manipulation, formatting and parsing; string formatting and parsing; MD5 hash generation; vector lengths; a string JOIN aggregate function; a split table-valued function; and even bulk loading from serialized DataTables via a shared folder (which is amazingly fast).

    As for RegEx, since it is not already present in SQL Server, I can only assume it is as efficient as a compiled EXE running the same regex, which is to say extremely fast.

    I will share a code file from my VB.Net CLR project that allows some RegEx functionality. This code would be part of a .NET CLR DLL that is published to your server.

    Function Summary

    Regex_IsMatch(data,pattern,options) as tinyint (0/1 result)

    Eg. SELECT dbo.Regex_IsMatch('Darren','[trwq]en$',NULL) -- returns 1 / true

    Regex_Group(data,pattern,groupname,options) as nvarchar(max) (capture group value returned)

    Eg. SELECT dbo.Regex_Group('Cable 180+e10 to 120+e3',' (?<n>[0-9]+)\+e[0-9]+','n',NULL) -- returns '180'

    Regex_Replace(data,pattern,replacement,options) as nvarchar(max) (returns modified string)

    Eg. SELECT dbo.Regex_Replace('Cable 180+e10 to 120+e3',' (?<n>[0-9]+)\+e(?<e>[0-9]+)',' ${e}:${n}',NULL) -- returns 'Cable 10:180 to 3:120'
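
    To connect these functions back to the question, here is a sketch of how they might be used against a table once they are registered. The table dbo.Products and its columns are hypothetical, and the literal plus sign is escaped as \+ so that it is not read as a quantifier:

    ```sql
    -- Hypothetical table: dbo.Products(Id int, Title nvarchar(200))

    -- Find rows whose Title contains a value in the "180+e10" exponent-like format.
    SELECT Id, Title
    FROM dbo.Products
    WHERE dbo.Regex_IsMatch(Title, N'[0-9]+\+e[0-9]+', NULL) = 1;

    -- Preview a cleanup that keeps only the leading number of each such value,
    -- e.g. 'Cable 180+e10 to 120+e3' becomes 'Cable 180 to 120'.
    SELECT Id,
           dbo.Regex_Replace(Title, N' (?<n>[0-9]+)\+e(?<e>[0-9]+)', N' ${n}', NULL) AS CleanTitle
    FROM dbo.Products;
    ```

    A scalar function in the WHERE clause is evaluated for every row and cannot use an index on Title, so on large tables it may be worth narrowing the rows with an ordinary predicate first.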

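    Imports System
    Imports System.Data.SqlTypes
    Imports System.Text.RegularExpressions
    Imports Microsoft.SqlServer.Server

    Partial Public Class UserDefinedFunctions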
    
        ''' <summary>
        ''' Returns 1 (true) or 0 (false) if a pattern passed is matched in the data passed.
        ''' Returns NULL if Data is NULL.
        ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
        ''' </summary>
        ''' <param name="data"></param>
        ''' <param name="pattern"></param>
        ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        <Microsoft.SqlServer.Server.SqlFunction()> _
        Public Shared Function Regex_IsMatch(data As SqlChars, pattern As SqlChars, options As SqlString) As SqlByte
            If pattern.IsNull Then
                Throw New Exception("Pattern Parameter in ""RegEx_IsMatch"" cannot be NULL")
            End If
            If data.IsNull Then
                Return SqlByte.Null
            Else
                Return CByte(If(Regex.IsMatch(data.Value, pattern.Value, Regex_Options(options)), 1, 0))
            End If
        End Function
    
        ''' <summary>
        ''' Returns the Value of a RegularExpression Pattern Group by Name or Number.
        ''' Group needs to be captured explicitly. Example Pattern "[a-z](?&lt;m&gt;[0-9][0-9][0-9][0-9])" to capture the numeric portion of an engineering number by the group called "m".
        ''' Returns NULL if the capture was not successful.
        ''' Returns NULL if Data is NULL.
        ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
        ''' </summary>
        ''' <param name="data"></param>
        ''' <param name="pattern"></param>
        ''' <param name="groupName">Name used in the explicit capture group</param>
        ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
        <Microsoft.SqlServer.Server.SqlFunction()> _
        Public Shared Function Regex_Group(data As SqlChars, pattern As SqlChars, groupName As SqlString, options As SqlString) As SqlChars
            If pattern.IsNull Then
                Throw New Exception("Pattern Parameter in ""Regex_Group"" cannot be NULL")
            End If
            If groupName.IsNull Then
                Throw New Exception("GroupName Parameter in ""Regex_Group"" cannot be NULL")
            End If
            If data.IsNull Then
                Return SqlChars.Null
            Else
                Dim m As Match = Regex.Match(data.Value, pattern.Value, Regex_Options(options))
                If m.Success Then
                    Dim g As Group
                    If IsNumeric(groupName.Value) Then
                        g = m.Groups(CInt(groupName.Value))
                    Else
                        g = m.Groups(groupName.Value)
                    End If
                    If g.Success Then
                        Return New SqlChars(g.Value)
                    Else ' group did not return or was not found.
                        Return SqlChars.Null
                    End If
                Else 'match failed.
                    Return SqlChars.Null
                End If
            End If
        End Function
    
        ''' <summary>
        ''' Does the equivalent of Regex.Replace in .NET.
        ''' Replacement String Replacement Markers are done in this format "${test}" = Replaces the capturing group (?&lt;test&gt;...)
        ''' If the replacement pattern is $1 or $2 then it replaces the first or second captured group by position.
        ''' Returns NULL if Data is NULL.
        ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
        ''' </summary>
        ''' <param name="data"></param>
        ''' <param name="pattern"></param>
        ''' <param name="replacement">Replacement String Replacement Markers are done in this format "${test}" = Replaces the capturing group (?&lt;test&gt;...). If the replacement pattern is $1 or $2 then it replaces the first or second captured group by position.</param>
        ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        <SqlFunction()> _
        Public Shared Function Regex_Replace(data As SqlChars, pattern As SqlChars, replacement As SqlChars, options As SqlString) As SqlChars
            If pattern.IsNull Then
                Throw New Exception("Pattern Parameter in ""Regex_Replace"" cannot be NULL")
            End If
            If replacement.IsNull Then
                Throw New Exception("Replacement Parameter in ""Regex_Replace"" cannot be NULL")
            End If
            If data.IsNull Then
                Return SqlChars.Null
            Else
                Return New SqlChars(Regex.Replace(data.Value, pattern.Value, replacement.Value, Regex_Options(options)))
            End If
        End Function
    
        ''' <summary>
        ''' Buffered list of options by name for speed.
        ''' </summary>
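        ' StrComp is a string function, not an equality comparer, so a StringComparer is passed instead.
        Private Shared ReadOnly m_Regex_Buffered_Options As New System.Collections.Generic.Dictionary(Of String, RegexOptions)(System.StringComparer.Ordinal)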
        ''' <summary>
        ''' Default regex options used when options value is NULL or an Empty String
        ''' </summary>
        Private Shared ReadOnly m_Regex_DefaultOptions As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.ExplicitCapture Or RegexOptions.Multiline
    
        ''' <summary>
        ''' Get the regular expressions options to use by a passed string of data.
        ''' Formatted like command line arguments.
        ''' </summary>
        ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
        Private Shared Function Regex_Options(options As SqlString) As RegexOptions
            Return Regex_Options(If(options.IsNull, "", options.Value))
        End Function
    
        ''' <summary>
        ''' Get the regular expressions options to use by a passed string of data.
        ''' Formatted like command line arguments.
        ''' </summary>
        ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
        Private Shared Function Regex_Options(options As String) As RegexOptions
            'empty options string is considered default options.
            If options Is Nothing OrElse options = "" Then
                Return m_Regex_DefaultOptions
            Else
                Dim out As RegexOptions
                If m_Regex_Buffered_Options.TryGetValue(options, out) Then
                    Return out
                Else
                    'must build options and store them
                    If options Like "*[/\-]n*" Then
                        out = RegexOptions.None
                    End If
                    If options Like "*[/\-]s*" Then
                        out = out Or RegexOptions.Singleline
                    End If
                    If options Like "*[/\-]m*" Then
                        out = out Or RegexOptions.Multiline
                    End If
                    If options Like "*[/\-]co*" Then
                        out = out Or RegexOptions.Compiled
                    End If
                    If options Like "*[/\-]c[ui]*" Then
                        out = out Or RegexOptions.CultureInvariant
                    End If
                    If options Like "*[/\-]ecma*" Then
                        out = out Or RegexOptions.ECMAScript
                    End If
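                    ' "ecma" also begins with "ec"; exclude it here because ECMAScript cannot be combined with ExplicitCapture.
                    If options Like "*[/\-]e[xc]*" AndAlso Not options Like "*[/\-]ecma*" Then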
                        out = out Or RegexOptions.ExplicitCapture
                    End If
                    If options Like "*[/\-]i[c]*" OrElse options Like "*[/\-]ignorec*" Then
                        out = out Or RegexOptions.IgnoreCase
                    End If
                    If options Like "*[/\-]i[pw]*" OrElse options Like "*[/\-]ignore[pw]*" Then
                        out = out Or RegexOptions.IgnorePatternWhitespace
                    End If
                    If options Like "*[/\-]r[tl]*" Then
                        out = out Or RegexOptions.RightToLeft
                    End If
                    'store the options for next call (for speed)
                    m_Regex_Buffered_Options(options) = out
                    Return out
                End If
            End If
        End Function
    
    End Class
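
    For completeness, here is a sketch of how a compiled DLL containing this class might be registered so that the calls in the function summary resolve. The assembly name RegexClr, the file path, and the permission set are all assumptions; adjust them to your own build and server policy:

    ```sql
    -- Enable CLR integration on the instance (requires sufficient server permissions).
    EXEC sp_configure 'clr enabled', 1;
    RECONFIGURE;
    GO

    -- Hypothetical assembly name and path; the permission set your assembly needs may differ.
    CREATE ASSEMBLY RegexClr
    FROM 'C:\clr\RegexClr.dll'
    WITH PERMISSION_SET = SAFE;
    GO

    -- SqlChars maps to nvarchar(max), SqlString to nvarchar(4000), SqlByte to tinyint.
    -- If the VB project has a root namespace, the class part becomes [RootNamespace.UserDefinedFunctions].
    CREATE FUNCTION dbo.Regex_IsMatch
        (@data nvarchar(max), @pattern nvarchar(max), @options nvarchar(4000))
    RETURNS tinyint
    AS EXTERNAL NAME RegexClr.UserDefinedFunctions.Regex_IsMatch;
    GO
    ```

    Regex_Group and Regex_Replace would be registered the same way, with RETURNS nvarchar(max) and their own parameter lists.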
    