T-SQL strip all non-alpha and non-numeric characters

予麋鹿 2020-12-03 12:48

Is there a smarter way to remove all special characters rather than having a series of about 15 nested replace statements?

The following works, but only handles thr

5 answers
  •  小蘑菇 (original poster)
    2020-12-03 13:26

    I faced this problem several years ago, so I wrote a SQL function to do the trick. The original article used it to scrape text out of HTML. I have since updated the function as follows; note that it replaces each non-alphanumeric character with a space rather than deleting it:

    IF (object_id('dbo.fn_CleanString') IS NOT NULL)
    BEGIN
      PRINT 'Dropping: dbo.fn_CleanString'
      DROP function dbo.fn_CleanString
    END
    GO
    PRINT 'Creating: dbo.fn_CleanString'
    GO
    CREATE FUNCTION dbo.fn_CleanString 
    (
      @string varchar(8000)
    ) 
    returns varchar(8000)
    AS
    BEGIN
    ---------------------------------------------------------------------------------------------------
    -- Title:        CleanString
    -- Date Created: March 26, 2011
    -- Author:       William McEvoy
    --               
    -- Description:  This function replaces control characters (backspace, tab, line feed,
    --               carriage return) and all other non-alphanumeric characters with a space.
    ---------------------------------------------------------------------------------------------------
    
    
    declare @char        char(1),
            @len         int,
            @count       int,
            @newstring   varchar(8000),
            @replacement char(1)
    
    select  @count       = 1,
            @len         = 0,
            @newstring   = '',
            @replacement = ' '
    
    
    
    ---------------------------------------------------------------------------------------------------
    -- M A I N   P R O C E S S I N G
    ---------------------------------------------------------------------------------------------------
    
    
    -- Replace backspace characters with the replacement character (a space)
    select @string = replace(@string,char(8),@replacement)
    
    -- Replace tabs
    select @string = replace(@string,char(9),@replacement)
    
    -- Replace line feeds
    select @string = replace(@string,char(10),@replacement)
    
    -- Replace carriage returns
    select @string = replace(@string,char(13),@replacement)
    
    
    -- Condense multiple spaces into a single space
    -- This works by changing all double spaces to OX, where O = a space and X = a special character (char(7)),
    -- then all occurrences of XO are removed,
    -- then any remaining X is removed, leaving just the O, which is a single space
    select @string = replace(replace(replace(ltrim(rtrim(@string)),'  ', ' ' + char(7)),char(7)+' ',''),char(7),'')
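    -- Worked example of the statement above (illustrative only; X stands for char(7)):
    --   'a    b'  ->  'a X Xb'   each pair of spaces becomes a space followed by X
    --   'a X Xb'  ->  'a Xb'     each X followed by a space is removed
    --   'a Xb'    ->  'a b'      any remaining X is removed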
    
    
    --  Parse each character, remove non alpha-numeric
    
    select @len = len(@string)
    
    WHILE (@count <= @len)
    BEGIN
    
      -- Examine the character
      select @char = substring(@string,@count,1)
    
    
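      -- Note: LIKE ranges follow the collation's sort order, so under many non-binary
      -- collations accented letters can also match [a-z] and would be kept
      IF (@char like '[a-z]') or (@char like '[A-Z]') or (@char like '[0-9]')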
        select @newstring = @newstring + @char
      ELSE
        select @newstring = @newstring + @replacement
    
      select @count = @count + 1
    
    END
    
    
    return @newstring
    END
    
    GO
    IF (object_id('dbo.fn_CleanString') IS NOT NULL)
      PRINT 'Function created.'
    ELSE
      PRINT 'Function NOT created.'
    GO
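
    As a rough usage sketch (assuming the function above has been created in the current database), the batch below calls it once and then shows an alternative loop that deletes unwanted characters instead of replacing them with spaces. The loop is not part of the original function: the variable names @s and @pos and the Latin1_General_BIN collation are illustrative assumptions, chosen so that the ranges a-z, A-Z and 0-9 match plain ASCII characters only.

    ```sql
    -- The function replaces rather than deletes: the comma, the tab and the
    -- exclamation mark each end up as a space.
    SELECT dbo.fn_CleanString('Hello,' + char(9) + 'World!') AS cleaned;   -- 'Hello  World '

    -- Alternative sketch: delete every non-alphanumeric character.
    DECLARE @s   varchar(8000) = 'Hello, World! #123',
            @pos int;

    -- PATINDEX returns the position of the first character outside the set, or 0 if none.
    SET @pos = PATINDEX('%[^a-zA-Z0-9]%', @s COLLATE Latin1_General_BIN);

    WHILE @pos > 0
    BEGIN
        SET @s   = STUFF(@s, @pos, 1, '');   -- remove that one character
        SET @pos = PATINDEX('%[^a-zA-Z0-9]%', @s COLLATE Latin1_General_BIN);
    END

    SELECT @s AS stripped;   -- 'HelloWorld123'
    ```

    Both versions still walk the string one character at a time, so they may be slow when applied to every row of a large table.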
    
