Question
Let me state up front that I am an XML novice. That said, my issue is this: I have a SQL Server that creates XML data and places it into a file that must pass through a security gate to another server. The gate has a list of several "dirty" words that will cause a file to fail if it contains any of them. What I need is a way for SQL to search the XML data, every node, and if a "dirty" value is present, strip it out (replace it with blank). The XML is not strongly typed, and a "dirty" word could be part of a longer string. In that case, the rest of the string must remain intact.
For example, if the "dirty" word is "hold," the string "We hold these truths to be self evident" would become "We these truths to be self evident."
Again, this "dirty" word could be in any node, and the tags will not always be the same. I need to write a procedure or trigger that analyzes the XML value based on the dirty word list to clean it up.
Answer 1:
Shred the XML to a table with one row for each node. The table needs an ID that corresponds to the position of the node in the shredded XML so that the changes can be written back.
Keep your bad words in a table, and for each word use replace to remove it from the node values held in the shredded table.
Finally, loop through the cleaned values and write them back to the XML one node at a time, only for the nodes that were actually modified.
-- A table to hold the bad words
declare @BadWords table
(
  ID int identity,
  Value nvarchar(10)
)
-- These are the bad ones.
insert into @BadWords values
('one'),
('three'),
('five'),
('hold')
-- XML that needs cleaning
declare @XML xml = '
<root>
  <itemone ID="1one1">1one1</itemone>
  <itemtwo>2two2</itemtwo>
  <items>
    <item>1one1</item>
    <item>2two2</item>
    <item>onetwothreefourfive</item>
  </items>
  <hold>We hold these truths to be self evident</hold>
</root>
'
-- A helper table to hold the values to modify
declare @T table
(
  ID int identity,
  Pos int,
  OldValue nvarchar(max),
  NewValue nvarchar(max),
  Attribute bit
)
-- Get all attributes from the XML
insert into @T(Pos, OldValue, NewValue, Attribute)
select row_number() over(order by T.N),
       T.N.value('.', 'nvarchar(max)'),
       T.N.value('.', 'nvarchar(max)'),
       1
from @XML.nodes('//@*') as T(N)
-- Get all element values from the XML
insert into @T(Pos, OldValue, NewValue, Attribute)
select row_number() over(order by T.N),
       T.N.value('text()[1]', 'nvarchar(max)'),
       T.N.value('text()[1]', 'nvarchar(max)'),
       0
from @XML.nodes('//*') as T(N)
declare @ID int
declare @Pos int
declare @Value nvarchar(max)
declare @Attribute bit
-- Remove the bad words from @T, one bad word at a time
select @ID = max(ID) from @BadWords
while @ID > 0
begin
  select @Value = Value
  from @BadWords
  where ID = @ID
  update @T
  set NewValue = replace(NewValue, @Value, '')
  set @ID -= 1
end
-- Write the cleaned values back to the XML
select @ID = max(ID) from @T
while @ID > 0
begin
  -- @Value is null when the value was not changed (or the node has no text)
  select @Value = nullif(NewValue, OldValue),
         @Attribute = Attribute,
         @Pos = Pos
  from @T
  where ID = @ID
  if @Value is not null
    if @Attribute = 1
      set @XML.modify('replace value of ((//@*)[sql:variable("@Pos")])[1]
                       with sql:variable("@Value")')
    else
      set @XML.modify('replace value of ((//*)[sql:variable("@Pos")]/text())[1]
                       with sql:variable("@Value")')
  set @ID -= 1
end
select @XML
Note: The code above does not handle values where removing one bad word leaves another bad word behind. For example,
<item>fioneve</item>
will be modified to
<item>five</item>
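One possible way to cover this case, offered as a minimal sketch and not as part of the original answer, is to repeat the removal passes until a full pass over the word list changes no row. The table variables below mirror @BadWords and @T from the code above but are cut down to two words and two sample values. The @Changed counter is a name introduced here for illustration.

```sql
-- Cut-down copies of the tables from the answer (sample data is an assumption)
declare @BadWords table (ID int identity, Value nvarchar(10));
insert into @BadWords values ('one'), ('five');

declare @T table (ID int identity, NewValue nvarchar(max));
insert into @T (NewValue) values ('fioneve'), ('2two2');

declare @ID int;
declare @Value nvarchar(10);
declare @Changed int = 1;  -- hypothetical counter: rows changed in the current pass

-- Keep making passes until a full pass over the word list changes nothing
while @Changed > 0
begin
  set @Changed = 0;
  select @ID = max(ID) from @BadWords;

  while @ID > 0
  begin
    select @Value = Value
    from @BadWords
    where ID = @ID;

    -- Only touch rows that contain the word, so @@rowcount counts real changes
    update @T
    set NewValue = replace(NewValue, @Value, '')
    where charindex(@Value, NewValue) > 0;

    set @Changed += @@rowcount;
    set @ID -= 1;
  end
end

select ID, NewValue from @T;
```

With the sample rows, the first pass turns fioneve into five, the second pass removes five, and the third pass finds nothing left to change, so the loop ends. Every replacement shortens a value, so the loop always terminates. In the full script, this outer loop would wrap the existing bad-word loop and run before the values are written back to the XML.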
Source: https://stackoverflow.com/questions/15126430/sql-server-xml-string-manipluation