SQL Server 2008 query to find rows containing non-alphanumeric characters in a column

微笑、不失礼 提交于 2019-11-27 13:41:21

问题


I was actually asked this myself a few weeks ago, whereas I know exactly how to do this with a SP or UDF but I was wondering if there was a quick and easy way of doing this without these methods. I'm assuming that there is and I just can't find it.

A point I need to make is that although we know what characters are allowed (a-z, A-Z, 0-9) we don't want to specify what is not allowed (#@!$ etc...). Also, we want to pull the rows which have the illegal characters so that it can be listed to the user to fix (as we have no control over the input process we can't do anything at that point).

I have looked through SO and Google previously, but was unable to find anything that did what I wanted. I have seen many examples which can tell you if it contains alphanumeric characters, or doesn't, but something that is able to pull out an apostrophe in a sentence I have not found in query form.

Please note also that values can be null or '' (empty) in this varchar column.


回答1:


Won't this do it?

SELECT * FROM TABLE
WHERE COLUMN_NAME LIKE '%[^a-zA-Z0-9]%'

Setup

use tempdb
create table mytable ( mycol varchar(40) NULL)

insert into mytable VALUES ('abcd')
insert into mytable VALUES ('ABCD')
insert into mytable VALUES ('1234')
insert into mytable VALUES ('efg%^&hji')
insert into mytable VALUES (NULL)
insert into mytable VALUES ('')
insert into mytable VALUES ('apostrophe '' in a sentence') 

SELECT * FROM mytable
WHERE mycol LIKE '%[^a-zA-Z0-9]%'

drop table mytable 

Results

mycol
----------------------------------------
efg%^&hji
apostrophe ' in a sentence



回答2:


Sql server has very limited Regex support. You can use PATINDEX with something like this

PATINDEX('%[a-zA-Z0-9]%',Col)

Have a look at PATINDEX (Transact-SQL)

and Pattern Matching in Search Conditions




回答3:


I found this page with quite a neat solution. What makes it great is that you get an indication of what the character is and where it is. Then it gives a super simple way to fix it (which can be combined and built into a piece of driver code to scale up it's application).

DECLARE @tablename VARCHAR(1000) ='Schema.Table'
DECLARE @columnname VARCHAR(100)='ColumnName'
DECLARE @counter INT = 0
DECLARE @sql VARCHAR(MAX)

WHILE @counter <=255
BEGIN

SET @sql=

'SELECT TOP 10 '+@columnname+','+CAST(@counter AS VARCHAR(3))+' as CharacterSet, CHARINDEX(CHAR('+CAST(@counter AS VARCHAR(3))+'),'+@columnname+') as LocationOfChar
FROM '+@tablename+'
WHERE CHARINDEX(CHAR('+CAST(@counter AS VARCHAR(3))+'),'+@columnname+') <> 0'

PRINT (@sql)
EXEC (@sql)
SET @counter = @counter + 1
END

and then...

UPDATE Schema.Table
SET ColumnName= REPLACE(Columnname,CHAR(13),'')

Credit to Ayman El-Ghazali.



来源:https://stackoverflow.com/questions/1919757/sql-server-2008-query-to-find-rows-containing-non-alphanumeric-characters-in-a-c

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!