Output XML Files with encoding UTF-8 using SQL Server

逝去的感伤 2021-01-22 08:15

I have a query that generates XML files and loads them to FTP with .

I need to switch encoding to UTF-8 as follows:

         


        
2 Answers
  •  情书的邮戳
    2021-01-22 08:50

    There are some things to know:

    • SQL Server does not support export via BCP to UTF-8 before version 2016 (and 2014 with SP2).
    • One cannot add the xml-declaration (such as <?xml version="1.0" encoding="UTF-8"?>) to a native SQL Server XML-typed variable or column. This will either fail ("...switch the encoding") or the xml-declaration will disappear.
    • You can add the xml-declaration at string level to XML that has been cast to NVARCHAR(MAX). But you cannot cast (convert) the result back to XML without failing or losing the declaration.
    • Internally, SQL Server keeps any XML as UCS-2 (very close to UTF-16) in any case.
    • SQL Server's VARCHAR (CHAR) type is not UTF-8 but extended ASCII (depending on the collation).
    • At string level you can write literally anything into the xml-declaration, just as you can create something that looks like XML but is not well-formed. It is just an unchecked string.
    • The encoding stated in the xml-declaration is important only to mark the actual file encoding when written to a disk or when handled as byte stream.
    • You can write encoding="x" and store the file with an encoding of y - but you shouldn't.
    • SQL Server will cast a string with a utf-8 declaration to XML when it is VARCHAR, and a string with a utf-16 declaration when it is NVARCHAR, but you cannot mix the two (read this related answer). Other encodings will very likely lead to the "unable to switch the encoding" error.
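
The casting rules in the list above can be tried out with a small sketch. The variable names and the sample XML are made up for illustration, and the exact error text may differ between versions, so treat this as an experiment rather than a reference:

```sql
-- All names and sample documents here are hypothetical.
DECLARE @narrow VARCHAR(MAX)   = '<?xml version="1.0" encoding="utf-8"?><root>test</root>';
DECLARE @wide   NVARCHAR(MAX)  = N'<?xml version="1.0" encoding="utf-16"?><root>test</root>';
DECLARE @mixed  NVARCHAR(MAX)  = N'<?xml version="1.0" encoding="utf-8"?><root>test</root>';
DECLARE @x XML;

-- VARCHAR string with a utf-8 declaration: the cast should succeed,
-- but the declaration is not kept in the XML value.
SET @x = CAST(@narrow AS XML);
SELECT @x AS FromVarchar;

-- NVARCHAR string with a utf-16 declaration: the cast should succeed as well.
SET @x = CAST(@wide AS XML);
SELECT @x AS FromNvarchar;

-- NVARCHAR string with a utf-8 declaration: mixing the two is expected to fail.
BEGIN TRY
    SET @x = CAST(@mixed AS XML);
    SELECT @x AS Mixed;
END TRY
BEGIN CATCH
    -- Print whatever error the server raises instead of aborting the batch.
    PRINT ERROR_MESSAGE();
END CATCH;
```

The TRY...CATCH is only there so the failing cast does not stop the rest of the script.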

    About your code

    • You should change @SQLStr and @cmd to NVARCHAR(MAX), otherwise you might run into trouble with non-plain-Latin characters.
    • As you are using a CURSOR, you should fill an XML-typed variable with the result of your statement, cast it to NVARCHAR(MAX), and add the declaration to this string. Do not cast the result back to XML.
    • Read the BCP docs. Specifying -w writes Unicode (wide, i.e. UTF-16), which is not UTF-8 (what you write into the declaration has no effect here).
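
A minimal sketch of the second point, assuming a placeholder query against sys.objects because the original statement is not shown in the question. The names @xml and @doc are hypothetical:

```sql
-- Placeholder query; replace it with the statement that builds your XML.
DECLARE @xml XML =
(
    SELECT TOP (5) o.name, o.object_id
    FROM sys.objects AS o
    FOR XML PATH('row'), ROOT('root'), TYPE
);

-- Add the declaration at string level and keep the result as NVARCHAR(MAX).
-- Do not cast @doc back to XML, or the declaration is lost again.
DECLARE @doc NVARCHAR(MAX) =
    N'<?xml version="1.0" encoding="UTF-8"?>' + CAST(@xml AS NVARCHAR(MAX));

SELECT @doc AS XmlWithDeclaration;
```

Note that the string in @doc is still stored as UTF-16 inside SQL Server. The declaration only becomes true once the file is actually written as UTF-8.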

    Hint:

    Read this related answer, showing utf-8 export with BCP on SQL-Server 2016
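
For orientation, such an export might look roughly like the sketch below. It assumes SQL Server 2016 or later (or 2014 with SP2), that xp_cmdshell is enabled, and that a trusted connection works. The staging table ##XmlExport, the sample document, and the path C:\temp\export.xml are made up for illustration:

```sql
-- Hypothetical global temp table, so that bcp (a separate session) can read it.
IF OBJECT_ID('tempdb..##XmlExport') IS NOT NULL
    DROP TABLE ##XmlExport;
CREATE TABLE ##XmlExport (Doc NVARCHAR(MAX));

-- The document is a plain string with the declaration already added.
INSERT INTO ##XmlExport (Doc)
VALUES (N'<?xml version="1.0" encoding="UTF-8"?><root><name>Müller</name></root>');

-- -c = character mode, -C 65001 = UTF-8 code page, -T = trusted connection.
-- NVARCHAR(4000) is used here because, as far as I know, xp_cmdshell
-- documents its command parameter as VARCHAR(8000) or NVARCHAR(4000).
DECLARE @cmd NVARCHAR(4000) =
    N'bcp "SELECT Doc FROM ##XmlExport" queryout "C:\temp\export.xml" -c -C 65001 -T -S '
    + @@SERVERNAME;

EXEC master..xp_cmdshell @cmd;

DROP TABLE ##XmlExport;
```

Using -w instead of -c -C 65001 would write UTF-16, whatever the declaration says.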
