How to strip all non-alphabetic characters from string in SQL Server?

情深已故 2020-11-21 23:49

How could you remove all characters that are not alphabetic from a string?

What about non-alphanumeric?

Does this have to be a custom function or are there

18 answers
  •  深忆病人
    2020-11-22 00:26

    Here's a solution that doesn't require creating a function or listing every character to replace. It uses a recursive WITH statement (a recursive common table expression) together with PATINDEX to find unwanted characters. Each recursion removes all occurrences of one unwanted character, so with default settings it handles up to 100 distinct bad characters in any given string (e.g. "ABC123DEF234" contains 4 distinct bad characters: 1, 2, 3 and 4). The limit of 100 is SQL Server's default maximum recursion depth for a WITH statement, and it can be changed with the MAXRECURSION query hint. It does not limit the number of rows processed, which is bounded only by the available memory.
    If you don't want DISTINCT results, you can remove the two DISTINCT keywords from the code.

    -- Create some test data:
    SELECT * INTO #testData 
    FROM (VALUES ('ABC DEF,K.l(p)'),('123H,J,234'),('ABCD EFG')) as t(TXT)
    
    -- Actual query:
    -- Remove non-alpha chars: '%[^A-Z]%'
    -- Remove non-alphanumeric chars: '%[^A-Z0-9]%'
    DECLARE @BadCharacterPattern VARCHAR(250) = '%[^A-Z]%';
    
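    WITH recurMain as (
        -- Anchor: each distinct string and the position of its first unwanted character (0 if none)
        SELECT DISTINCT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
        FROM #testData
        UNION ALL
        -- Recursive step: re-scan each string after one unwanted character has been stripped
        SELECT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
        FROM (
            -- Remove every occurrence of the character found at BadCharIndex
            SELECT 
                CASE WHEN BadCharIndex > 0 
                    THEN REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), '')
                    ELSE TXT 
                END AS TXT
            FROM recurMain
            WHERE BadCharIndex > 0
        ) badCharFinder
    )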
    SELECT DISTINCT TXT
    FROM recurMain
    WHERE BadCharIndex = 0;
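
    If a string might hold more distinct unwanted characters than the default recursion depth allows, the cap can be set explicitly with a query hint. The following is a minimal sketch of the same query with that hint added, not a definitive implementation. The table variable @Samples is a hypothetical stand-in for #testData, and the cap of 500 is an arbitrary illustrative value. Note also that whether '%[^A-Z]%' treats lowercase letters as unwanted depends on the collation in effect: under a case-insensitive collation, lowercase letters are kept.

    ```sql
    -- Illustrative input; @Samples is a hypothetical stand-in for #testData.
    DECLARE @Samples TABLE (TXT VARCHAR(250));
    INSERT INTO @Samples (TXT)
    VALUES ('ABC DEF,K.l(p)'), ('123H,J,234'), ('ABCD EFG');

    DECLARE @BadCharacterPattern VARCHAR(250) = '%[^A-Z]%';

    WITH recurMain AS (
        -- Anchor: each input string plus the position of its first unwanted character.
        SELECT CAST(TXT AS VARCHAR(250)) AS TXT,
               PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
        FROM @Samples
        UNION ALL
        -- Recursive step: strip every occurrence of that character, then look again.
        SELECT CAST(TXT AS VARCHAR(250)) AS TXT,
               PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
        FROM (
            SELECT REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), '') AS TXT
            FROM recurMain
            WHERE BadCharIndex > 0
        ) AS stripped
    )
    SELECT TXT
    FROM recurMain
    WHERE BadCharIndex = 0
    -- Assumed cap of 500 recursions; valid values are 0 to 32767, where 0 means no limit.
    OPTION (MAXRECURSION 500);
    ```

    A finite cap is probably the safer choice: with MAXRECURSION 0, a character that PATINDEX finds but REPLACE fails to remove would make the query recurse without end.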
    
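    For a single value, a plain loop is a possible alternative to the recursive query. This is a different technique from the one in the answer above, shown only as a hedged sketch: it deletes one unwanted character per iteration with STUFF until PATINDEX finds nothing more. The variable names @Text and @Pattern are hypothetical.

    ```sql
    -- Hypothetical input value and pattern, for illustration only.
    DECLARE @Text VARCHAR(250) = 'ABC DEF,K.l(p)';
    DECLARE @Pattern VARCHAR(250) = '%[^A-Z]%';  -- use '%[^A-Z0-9]%' to keep digits as well

    -- Delete the first unwanted character until none remain.
    WHILE PATINDEX(@Pattern, @Text) > 0
        SET @Text = STUFF(@Text, PATINDEX(@Pattern, @Text), 1, '');

    SELECT @Text AS Cleaned;
    ```

    The loop has no recursion limit to worry about, but it works on one value at a time, so applying it to a column would mean wrapping it in a scalar function, which is what the recursive query avoids.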
