SQL Server - defining an XML type column with UTF-8 encoding

忘掉有多难 2020-11-30 15:29

The default encoding for an XML type column in SQL Server is UTF-16. I have no trouble inserting UTF-16 encoded XML streams into that column.

But if I t

4 answers
  •  情歌与酒
    2020-11-30 15:45

    The "Type Casting String and Binary Instances" section of the MSDN document

    Create Instances of XML Data

    explains how incoming XML data is interpreted. Essentially,

    • if the SQL Server receives the XML data as nvarchar then it "assumes a two-byte unicode encoding such as UTF-16 or UCS-2",

    • if the SQL Server receives the XML data as varchar then by default it will use the (single-byte character set) code page defined for the SQL Server instance,

    • if the SQL Server receives the XML data as varbinary then it "is treated as a codepoint stream that is passed directly to the XML parser", and "an instance without BOM and without a declaration encoding will be interpreted as UTF-8".

    If your marshalling code is spitting out a Java String to be sent to the SQL Server then it is very likely being sent as nvarchar since a Java String is always a Unicode string. That would explain why the SQL Server assumes UTF-16 encoding.

    If you really need to send the XML data to the SQL Server with UTF-8 encoding (though I can't imagine why) then your marshalling code probably needs to produce a stream of (UTF-8 encoded) bytes that will be sent to the SQL Server as varbinary.
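
    As a rough sketch of that last point, assuming a JDBC connection to SQL Server: the class name `Utf8XmlInsert`, the table `docs`, and the column `body` below are hypothetical and not taken from the question. The idea is to encode the XML as UTF-8 bytes (no BOM) and bind them with `setBytes`, so that the parameter should reach the server as varbinary rather than nvarchar.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
class Utf8XmlInsert {

    // Encode the XML text as UTF-8 bytes. getBytes does not prepend a BOM.
    // The XML itself should carry no encoding declaration, or one that
    // says UTF-8, so that it does not contradict the actual bytes.
    static byte[] toUtf8Bytes(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Assumes a table docs(body XML); both names are made up for this sketch.
    static void insertXml(Connection conn, String xml) throws SQLException {
        String sql = "INSERT INTO docs (body) VALUES (CAST(? AS XML))";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            // Binding a byte[] instead of a String avoids the nvarchar path.
            ps.setBytes(1, toUtf8Bytes(xml));
            ps.executeUpdate();
        }
    }
}
```

    Calling `ps.setString(1, xml)` instead would hand the driver a Java String, which is the case described above where the data is likely sent as nvarchar and read as UTF-16.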
