Perform regex (replace) in an SQL query

半城伤御伤魂 提交于 2019-11-27 08:04:27

Some hacking required but we can do this with LIKE, PATINDEX, LEFT AND RIGHT and good old string concatenation.

create table test
(
    id int identity(1, 1) not null,
    val varchar(25) not null
)

insert into test values ('&lt; <- ok, &lt <- nok')

while 1 = 1
begin
    update test
        set val = left(val, patindex('%&lt[^;]%', val) - 1) +
                      '&lt;' +
                      right(val, len(val) - patindex('%&lt[^;]%', val) - 2)
    from test
    where val like '%&lt[^;]%'

    IF @@ROWCOUNT = 0 BREAK
end

select * from test

Better is that this is SQL Server version agnostic and should work just fine.

I think this can be done much cleaner if you use different STUFF :)

create table test
(
    id int identity(1, 1) not null,
    val varchar(25) not null
)

insert into test values ('&lt; <- ok, &lt <- nok')

WHILE 1 = 1
BEGIN
    UPDATE test SET
        val = STUFF( val , PATINDEX('%&lt[^;]%', val) + 3 , 0 , ';' )
    FROM test
    WHERE val LIKE '%&lt[^;]%'

    IF @@ROWCOUNT = 0 BREAK
END

select * from test
ilitirit

How about:

    UPDATE tableName
    SET columName = REPLACE(columName , '&lt', '&lt;')
    WHERE columnName LIKE '%lt%'
    AND columnName NOT LIKE '%lt;%'

Edit:

I just realized this will ignore columns with partially correct &lt; strings.

In that case you can ignore the second part of the where clause and call this afterward:

    UPDATE tableName
    SET columName = REPLACE(columName , '&lt;;', '&lt;')

This article covers how to create a simple Regex Replace function that you can use in SQL 2000 (and 2005 with simple tweak) that can assist you.

If MSSQL's regex flavor supports negative lookahead, that would be The Right Way to approach this.

s/&lt(?!;)/&lt;/gi

will catch all instances of &lt which are not followed by a ; (even if they're followed by nothing, which [^;] would miss) and does not capture the following non-; character as part of the match, eliminating the issue mentioned in the comments on the original question of that character being lost in the replacement.

Unfortunately, I don't use MSSQL, so I have no idea whether it supports negative lookahead or not...

Very specific to this pattern, but I have done similar to this in the past:

REPLACE(REPLACE(columName, '&lt;', '&lt'), '&lt', '&lt;')

broader example (encode characters which may be inappropriate in a TITLE attribute)

REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
    columName
    -- Remove existing encoding:
    , '&amp;', '&')
    , '&#34;', '"')
    , '&#39;', '''')
    -- Reinstate/Encode:
    , '&', '&amp;')
    -- Encode:
    , '"', '&#34;')
    , '''', '&#39;')
    , ' ', '%20')
    , '<', '%3C')
    , '>', '%3E')
    , '/', '%2F')
    , '\', '%5C')
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!