Wrapping and removing CDATA around XML

Posted by …衆ロ難τιáo~ on 2019-11-29 18:02:04

If you need full control over generating XML, you can use FOR XML EXPLICIT:

DECLARE @xml xml = '<Custom>
     <Table>Shape</Table>
     <Column>CustomScreen</Column>
     <Value>Data</Value>
</Custom>';

WITH rawValues AS
(
    SELECT
        n.value('Table[1]', 'nvarchar(20)') [Table],
        n.value('Column[1]', 'nvarchar(20)') [Column],
        n.value('Value[1]', 'nvarchar(20)') [Value]
    FROM @xml.nodes('Custom') X(n)
)
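-- Column aliases follow the FOR XML EXPLICIT pattern
-- ElementName!TagNumber!AttributeName!Directive.
SELECT 1 AS Tag,
       NULL AS Parent,
       [Table] AS [Custom!1!Table!ELEMENT],   -- child element <Table>
       [Column] AS [Custom!1!Column!ELEMENT], -- child element <Column>
       [Value] AS [Custom!1!Value!CDATA]      -- <Value> content wrapped in CDATA
FROM rawValues
FOR XML EXPLICIT;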

It generates:

<Custom>
  <Table>Shape</Table>
  <Column>CustomScreen</Column>
  <Value><![CDATA[Data]]></Value>
</Custom>

If you need the reverse (removing the CDATA wrapper), use the XML that contains the CDATA section as the source and use the ELEMENT directive instead of CDATA.
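As a rough sketch of that reverse direction, the query above can be rerun against a source whose Value is already wrapped in CDATA, with only the last directive changed. The sample data is the same as above; nothing else is assumed.

```sql
-- Source XML where <Value> is wrapped in CDATA (the output of the query above).
DECLARE @xml xml = '<Custom>
     <Table>Shape</Table>
     <Column>CustomScreen</Column>
     <Value><![CDATA[Data]]></Value>
</Custom>';

WITH rawValues AS
(
    SELECT
        n.value('Table[1]', 'nvarchar(20)') [Table],
        n.value('Column[1]', 'nvarchar(20)') [Column],
        n.value('Value[1]', 'nvarchar(20)') [Value]
    FROM @xml.nodes('Custom') X(n)
)
SELECT 1 AS Tag,
       NULL AS Parent,
       [Table] AS [Custom!1!Table!ELEMENT],
       [Column] AS [Custom!1!Column!ELEMENT],
       [Value] AS [Custom!1!Value!ELEMENT]   -- ELEMENT instead of CDATA
FROM rawValues
FOR XML EXPLICIT;
```

This should return the Value element as plain `<Value>Data</Value>`, without the CDATA wrapper.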

Shnugo

If you really need the CDATA section within your XML, there are only two options:

  • string concatenation (very bad)
  • FOR XML EXPLICIT (in this case you've got the answer from Pawel)

But you should take into consideration that the CDATA section exists for lazy input only. There is absolutely no difference between content enclosed in a CDATA section and content that is properly escaped. That is why Microsoft decided not to support the CDATA syntax at all in the modern XML methods. It is simply not needed.

Look at these examples:

--I start with a string containing the same content escaped and in CDATA

DECLARE @s VARCHAR(500)=
'<root>
<a>Normal Text</a>
<a>Text with forbidden character &amp; &lt;&gt;</a>
<a><![CDATA[Text with forbidden character & <>]]></a>
</root>';

--This string is cast to XML.

DECLARE @x XML=CAST(@s AS XML);

--This is the output. You can see that the CDATA section has been encoded and is no longer CDATA. A CDATA section is always replaced by a valid escaped string:

SELECT @x;

<root>
  <a>Normal Text</a>
  <a>Text with forbidden character &amp; &lt;&gt;</a>
  <a>Text with forbidden character &amp; &lt;&gt;</a>
</root>

--The back-cast clearly shows that the XML no longer contains any CDATA internally

SELECT CAST(@x AS VARCHAR(500));

<root>
   <a>Normal Text</a>
   <a>Text with forbidden character &amp; &lt;&gt;</a>
   <a>Text with forbidden character &amp; &lt;&gt;</a>
</root>

--Reading the nodes one-by-one shows the correct content anyway

SELECT a.value('.','varchar(max)')
FROM @x.nodes('/root/a') AS A(a)

Normal Text
Text with forbidden character & <>
Text with forbidden character & <>
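For comparison, here is a minimal sketch of producing the same content with FOR XML PATH, one of the modern methods that has no CDATA option. The element names `Custom` and `Value` are only borrowed from the first answer for illustration.

```sql
-- FOR XML PATH escapes forbidden characters instead of emitting CDATA.
SELECT 'Text with forbidden character & <>' AS [Value]
FOR XML PATH('Custom');
```

The result is `<Custom><Value>Text with forbidden character &amp; &lt;&gt;</Value></Custom>`, which any XML reader decodes back to the original text.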

The only reason to use CDATA, and to insist that it must appear in the XML's text representation (which is not the XML itself!), is a third-party or legacy requirement.

And keep in mind: if you use string concatenation, you can store the XML with a readable CDATA section in a string format only. Whenever you cast this string to XML, the CDATA section will be lost. Using FOR XML EXPLICIT allows type-safe storage, but it is very clumsy with deeper nesting. This might be acceptable for an external interface, but you should think twice about it...
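A small sketch of the string-concatenation route, mainly to show why it only survives as a string. The variable names and element names are made up for illustration, and the REPLACE guards against a literal `]]>` inside the value, which would otherwise end the CDATA section early.

```sql
DECLARE @value nvarchar(100) = N'Text with forbidden character & <>';

-- Build the XML text by hand. A literal "]]>" in the value is split
-- across two CDATA sections so it cannot close the section prematurely.
DECLARE @s nvarchar(max) =
    N'<Custom><Value><![CDATA['
    + REPLACE(@value, N']]>', N']]]]><![CDATA[>')
    + N']]></Value></Custom>';

SELECT @s;               -- string: the CDATA section is still visible
SELECT CAST(@s AS xml);  -- xml: the CDATA section is replaced by escaped text
```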

Two links to related answers (by me :-) ):
