XML parsing illegal character in sql server

Submitted by 谁都会走 on 2020-07-21 07:15:07

Question


I am greeted with an illegal xml character error when parsing a table record into xml.

SELECT 
    mb.ProductTitle,mb.ProductDescription,
    CAST((
        SELECT
            Id,                             
            ProductDescription
        FROM ProductsManagement AS mpm
        WHERE mpm.MattressId = 6
        FOR XML PATH('ProductItemListModel'), 
        ROOT('MattressBarndProductItemList'))as XML)
FROM Brands AS mb
WHERE mb.Id = 6
FOR XML PATH(''), ROOT('ProductModel')

or

SELECT CONVERT(XML,'lift')

The record for the description is as follows:

Ease™ by Sealy adjustable base is the simple way to turn your bed into the perfect place to relax. The wireless remote controls the head and leg lift, for virtually unlimited range of ergonomic positions."

The description above fails to parse as XML.


Answer 1:


This is because the XML standard defines a set of illegal characters. Most of them are not even visible, for instance the "terminal bell", CHAR(7). That character, or any other from the list, will cause the error you are seeing.
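As a quick illustration, the sketch below compares a legal control character with an illegal one. It assumes SQL Server 2012 or later (for TRY_CONVERT), and the sample strings are made up for the demonstration rather than taken from the table in the question.

```sql
-- XML 1.0 allows only three characters below 0x20: tab (9), line feed (10)
-- and carriage return (13). Any other control character is rejected.

-- Tab is legal, so this conversion should succeed.
SELECT TRY_CONVERT(XML, 'leg' + CHAR(9) + 'lift') AS WithTab;

-- CHAR(7), the terminal bell, is illegal. TRY_CONVERT should return NULL here,
-- whereas a plain CONVERT(XML, ...) would raise the illegal character error.
SELECT TRY_CONVERT(XML, 'leg' + CHAR(7) + 'lift') AS WithBell;
```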

There are a few workarounds available, but all of them come down to removing the illegal characters.

The following example uses a scalar function, so be warned: it can be slow on large amounts of data:

CREATE FUNCTION [dbo].RemoveInvalidXMLCharacters (@InputString VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    IF @InputString IS NOT NULL
    BEGIN
      DECLARE @Counter INT, @TestString NVARCHAR(40)

      SET @TestString = '%[' + NCHAR(0) + NCHAR(1) + NCHAR(2) + NCHAR(3) + NCHAR(4) + NCHAR(5) + NCHAR(6) + NCHAR(7) + NCHAR(8) + NCHAR(11) + NCHAR(12) + NCHAR(14) + NCHAR(15) + NCHAR(16) + NCHAR(17) + NCHAR(18) + NCHAR(19) + NCHAR(20) + NCHAR(21) + NCHAR(22) + NCHAR(23) + NCHAR(24) + NCHAR(25) + NCHAR(26) + NCHAR(27) + NCHAR(28) + NCHAR(29) + NCHAR(30) + NCHAR(31) + ']%'

      SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)

      WHILE @Counter <> 0
      BEGIN
        SELECT @InputString = STUFF(@InputString, @Counter, 1, ' ')
        SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
      END
    END
    RETURN(@InputString)
END
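A quick way to check the function, assuming it was created under the name shown above. The variable @Dirty and its value are hypothetical test data:

```sql
-- Hypothetical test value containing a bell character between the two words.
DECLARE @Dirty VARCHAR(MAX) = 'leg' + CHAR(7) + 'lift';

-- The function replaces the bell character with a space,
-- so this should return 'leg lift'.
SELECT [dbo].RemoveInvalidXMLCharacters(@Dirty) AS Cleaned;

-- After cleaning, the conversion to XML should no longer raise the error.
SELECT CONVERT(XML, [dbo].RemoveInvalidXMLCharacters(@Dirty)) AS CleanedXml;
```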

So the adjusted query will look similar to:

SELECT 
    [dbo].RemoveInvalidXMLCharacters(mb.ProductTitle) as ProductTitle
,   [dbo].RemoveInvalidXMLCharacters(mb.ProductDescription) as ProductDescription
,    CAST((
        SELECT
            Id,                             
            [dbo].RemoveInvalidXMLCharacters(ProductDescription) ProductDescription
        FROM ProductsManagement AS mpm
        WHERE mpm.MattressId = 6
        FOR XML PATH('ProductItemListModel'), ROOT('MattressBarndProductItemList'))as XML)
FROM Brands AS mb
WHERE mb.Id = 6
FOR XML PATH(''), ROOT('ProductModel')
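To see which rows actually contain such characters before cleaning them, the same character list can be used in a plain query. This is only a sketch: the table and column names come from the question, ProductDescription is assumed to be a VARCHAR or NVARCHAR column, and @Pattern and @Code are illustrative names.

```sql
-- Build the same pattern as in the function: every control character
-- from 0 to 31 except tab (9), line feed (10) and carriage return (13).
DECLARE @Pattern NVARCHAR(40) = N'%[';
DECLARE @Code INT = 0;

WHILE @Code < 32
BEGIN
    IF @Code NOT IN (9, 10, 13)
        SET @Pattern += NCHAR(@Code);
    SET @Code += 1;
END

SET @Pattern += N']%';

-- List each affected row with the position and code of its first illegal character.
SELECT mpm.Id,
       p.Position AS FirstBadPosition,
       UNICODE(SUBSTRING(mpm.ProductDescription, p.Position, 1)) AS BadCharCode
FROM ProductsManagement AS mpm
CROSS APPLY (
    SELECT PATINDEX(@Pattern, mpm.ProductDescription COLLATE Latin1_General_BIN) AS Position
) AS p
WHERE p.Position > 0;
```

Knowing the character code makes it easier to decide whether the data should be fixed at the source instead of being cleaned on every read.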

Another method is a conversion to VARBINARY and back, which is also described in this linked topic: Invalid Characters in XML



Source: https://stackoverflow.com/questions/54693134/xml-parsing-illegal-character-in-sql-server
