问题
I am greeted with an illegal xml character
error when parsing a table record into xml.
SELECT
mb.ProductTitle,mb.ProductDescription,
CAST((
SELECT
Id,
ProductDescription,
FROM ProductsManagement AS mpm
WHERE mpm.MattressId = 6
FOR XML PATH('ProductItemListModel'),
ROOT('MattressBarndProductItemList'))as XML)
FROM Brands AS mb
WHERE mb.Id = 6
FOR XML PATH(''), ROOT('ProductModel')
or
SELECT CONVERT(XML,'lift')
The record for the description is as follows:
Ease™ by Sealy adjustable base is the simple way to turn your bed into the perfect place to relax. The wireless remote controls the head and leg lift, for virtually unlimited range of ergonomic positions."
The above is not parsing into xml.
回答1:
This is because there is a list of known illegal characters in XML standard(s). Mostly those characters are not even visible, for instance a "terminal bell", or CHAR(7). Such character and other from the list will cause that error you now encounter.
There are few workarounds available, but all of them is about removing illegal chars.
Following example is based on a scalar function approach, therefore warning: it can perform slow on large amounts of data:
CREATE FUNCTION [dbo].RemoveInvalidXMLCharacters (@InputString VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
IF @InputString IS NOT NULL
BEGIN
DECLARE @Counter INT, @TestString NVARCHAR(40)
SET @TestString = '%[' + NCHAR(0) + NCHAR(1) + NCHAR(2) + NCHAR(3) + NCHAR(4) + NCHAR(5) + NCHAR(6) + NCHAR(7) + NCHAR(8) + NCHAR(11) + NCHAR(12) + NCHAR(14) + NCHAR(15) + NCHAR(16) + NCHAR(17) + NCHAR(18) + NCHAR(19) + NCHAR(20) + NCHAR(21) + NCHAR(22) + NCHAR(23) + NCHAR(24) + NCHAR(25) + NCHAR(26) + NCHAR(27) + NCHAR(28) + NCHAR(29) + NCHAR(30) + NCHAR(31) + ']%'
SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
WHILE @Counter <> 0
BEGIN
SELECT @InputString = STUFF(@InputString, @Counter, 1, ' ')
SELECT @Counter = PATINDEX (@TestString, @InputString COLLATE Latin1_General_BIN)
END
END
RETURN(@InputString)
END
So, adjusted query will be similar to:
SELECT
[dbo].RemoveInvalidXMLCharacter(smb.ProductTitle) as ProductTitle
, [dbo].RemoveInvalidXMLCharacter(mb.ProductDescription) as ProductDescription
, CAST((
SELECT
Id,
[dbo].RemoveInvalidXMLCharacter(ProductDescription) ProductDescription
FROM ProductsManagement AS mpm
WHERE mpm.MattressId = 6
FOR XML PATH('ProductItemListModel'), ROOT('MattressBarndProductItemList'))as XML)
FROM Brands AS mb
WHERE mb.Id = 6
FOR XML PATH(''), ROOT('ProductModel')
Another method is an conversation to VARBINARY and back described also in this linked topic: Invalid Characters in XML
来源:https://stackoverflow.com/questions/54693134/xml-parsing-illegal-character-in-sql-server