T-SQL Regex for social security number (SQL Server 2008 R2)

微笑、不失礼 提交于 2019-12-24 05:37:08

问题


I need to find invalid social security numbers in a varchar field in a SQL Server 2008 database table. (Valid SSNs are being defined by being in the format ###-##-#### - doesn't matter what the numbers are, as long as they are in that "3-digit dash 2-digit dash 4-digit" pattern.

I do have a working regex:

SELECT * 
FROM mytable
WHERE ssn NOT LIKE '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]'

That does find the invalid SSNs in the column, but I know (okay - I'm pretty sure) that there is a way to shorten that to indicate that the previous pattern can have x iterations.

I thought this would work:

'[0-9]{3}-[0-9]{2}-[0-9]{4}'

But it doesn't.

Is there a shorter regex than the one above in the select, or not? Or perhaps there is, but T-SQL/SQL Server 2008 doesn't support it!?


回答1:


If you plan to get a shorter variant of your LIKE expression, then the answer is no.

In T-SQL, you can only use the following wildcards in the pattern:

%
- Any string of zero or more characters. WHERE title LIKE '%computer%' finds all book titles with the word computer anywhere in the book title.

_ (underscore)
Any single character. WHERE au_fname LIKE '_ean' finds all four-letter first names that end with ean (Dean, Sean, and so on).
[ ]
Any single character within the specified range ([a-f]) or set ([abcdef]). WHERE au_lname LIKE '[C-P]arsen' finds author last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, and so on. In range searches, the characters included in the range may vary depending on the sorting rules of the collation.
[^]
Any single character not within the specified range ([^a-f]) or set ([^abcdef]).

So, your LIKE statement is already the shortest possible expression. No limiting quantifiers can be used (those like {min,max}), not shorthand classes like \d.

If you were using MySQL, you could use a richer set of regex utilities, but it is not the case.




回答2:


I suggest you to use another solution like this:

-- Use `REPLICATE` if you really want to use a number to repeat
Declare @rgx nvarchar(max) = REPLICATE('#', 3) + '-' +
                             REPLICATE('#', 2) + '-' +
                             REPLICATE('#', 4);

-- or use your simple format string
Declare @rgx nvarchar(max) = '###-##-####';

-- then use this to get your final `LIKE` string.
Set @rgx = REPLACE(@rgx, '#', '[0-9]');

And you can also use something like '_' for characters then replace it with [A-Z] and so on.



来源:https://stackoverflow.com/questions/31788137/t-sql-regex-for-social-security-number-sql-server-2008-r2

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!