SQL Server XML String Manipulation

Submitted by 微笑、不失礼 on 2019-12-11 13:36:25

Question


Let me state up front that I am an XML novice. That said, my issue is this: I have a SQL Server that creates XML data and places it into a file that must pass through a security gate to another server. The gate has a list of several "dirty" words that will cause a file to fail if it contains any of them. What I need is a way for SQL to search the XML data, every node, and if a "dirty" value is present, strip it out (replace it with a blank). The XML is not strongly typed, and the "dirty" word could be part of a longer string. In that case, the rest of the string must remain intact.

For example, if the "dirty" word is "hold," the string "We hold these truths to be self evident" would become "We these truths to be self evident."

Again, this "dirty" word could be in any node, and the tags will not always be the same. I need to write a procedure or trigger that analyzes the XML value based on the dirty word list to clean it up.


Answer 1:


Shred the XML into a table with one row per node. The table needs an ID that corresponds to the position of the node in the shredded XML so that the changes can be written back.

Keep your bad words in a table and, for each word, use replace to remove it from the node values in the helper table.

Finally, loop through the cleaned values and write them back to the XML one node at a time, touching only the nodes that were actually modified.

-- A table to hold the bad words
declare @BadWords table
(
  ID int identity,
  Value nvarchar(10)
)

-- These are the bad ones.
insert into @BadWords values
('one'),
('three'),
('five'),
('hold')

-- XML that needs cleaning
declare @XML xml = '
<root>
  <itemone ID="1one1">1one1</itemone>
  <itemtwo>2two2</itemtwo>
  <items>
    <item>1one1</item>
    <item>2two2</item>
    <item>onetwothreefourfive</item>
  </items>
  <hold>We hold these truths to be self evident</hold>
</root>
'

-- A helper table to hold the values to modify
declare @T table
(
  ID int identity,
  Pos int,
  OldValue nvarchar(max),
  NewValue nvarchar(max),
  Attribute bit
)

-- Get all attributes from the XML
insert into @T(Pos, OldValue, NewValue, Attribute)
select row_number() over(order by T.N),
       T.N.value('.', 'nvarchar(max)'),
       T.N.value('.', 'nvarchar(max)'),
       1
from @XML.nodes('//@*') as T(N)

-- Get the first text node of every element in the XML
insert into @T(Pos, OldValue, NewValue, Attribute)
select row_number() over(order by T.N),
       T.N.value('text()[1]', 'nvarchar(max)'),
       T.N.value('text()[1]', 'nvarchar(max)'),
       0
from @XML.nodes('//*') as T(N)

declare @ID int
declare @Pos int
declare @Value nvarchar(max)
declare @Attribute bit

-- Remove the bad words from @T, one bad word at a time
select @ID = max(ID) from @BadWords
while @ID > 0
begin
  select @Value = Value
  from @BadWords
  where ID = @ID

  update @T
  set NewValue = replace(NewValue, @Value, '')

  set @ID -= 1
end

-- Write the cleaned values back to the XML.
-- nullif() returns null for rows that did not change, so only modified nodes are rewritten.
select @ID = max(ID) from @T
while @ID > 0
begin
  select @Value = nullif(NewValue, OldValue),
         @Attribute = Attribute,
         @Pos = Pos
  from @T
  where ID = @ID

  if @Value is not null
    if @Attribute = 1  
      set @XML.modify('replace value of ((//@*)[sql:variable("@Pos")])[1] 
                       with sql:variable("@Value")')
    else
      set @XML.modify('replace value of ((//*)[sql:variable("@Pos")]/text())[1] 
                           with sql:variable("@Value")')
  set @ID -= 1
end

select @XML

Note: The code above makes a single pass over the bad words, so in some cases it will not catch a bad word that only appears once another one has been removed. For example,

<item>fioneve</item>

will be modified to

<item>five</item>
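
One possible way around this, offered as a sketch rather than as part of the original answer, is to repeat the replace pass until a full pass over the bad words changes nothing. The @Changed counter and the charindex filter are additions made for this illustration, and the tables are cut down to the two columns the loop needs. The loop ends because every replacement makes a value strictly shorter.

```sql
-- Sketch only: reduced versions of @BadWords and @T from the answer above.
declare @BadWords table (ID int identity, Value nvarchar(10))
insert into @BadWords values ('one'), ('five')

declare @T table (ID int identity, NewValue nvarchar(max))
insert into @T(NewValue) values ('fioneve'), ('2two2')

declare @ID int
declare @Value nvarchar(max)
declare @Changed int = 1   -- hypothetical counter, not in the original code

-- Keep making passes until one full pass modifies no row
while @Changed > 0
begin
  set @Changed = 0
  select @ID = max(ID) from @BadWords

  while @ID > 0
  begin
    select @Value = Value
    from @BadWords
    where ID = @ID

    -- Only touch rows that still contain the word,
    -- so @@rowcount counts real modifications
    update @T
    set NewValue = replace(NewValue, @Value, '')
    where charindex(@Value, NewValue) > 0

    set @Changed += @@rowcount
    set @ID -= 1
  end
end

select ID, NewValue from @T
```

With this sample data, 'fioneve' first becomes 'five' and is then emptied on the next pass, while '2two2' is left alone. In the full script this loop would take the place of the single-pass "Remove the bad words" loop, and the write-back step would stay as it is.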


Source: https://stackoverflow.com/questions/15126430/sql-server-xml-string-manipluation
