Output XML Files with encoding UTF-8 using SQL Server

蹲街弑〆低调 submitted on 2019-12-02 02:49:07

There are some things to know:

  • SQL Server does not support export via BCP to UTF-8 before version 2016 (and 2014 with SP2).
  • You cannot add the XML declaration (<?xml blah ?>) to a native SQL Server XML-typed variable or column. This will either fail ("...switch the encoding") or the declaration will silently disappear.
  • You can add the XML declaration on string level to XML that was cast to NVARCHAR(MAX). But you cannot cast this string back to XML without failing or losing the declaration.
  • Internally, SQL Server always keeps XML as UCS-2 (very close to UTF-16).
  • SQL Server's VARCHAR (CHAR) type is not UTF-8 but extended ASCII (the code page depends on the COLLATION).
  • On string level you can write literally anything into the XML declaration, just as you can create something that looks like XML but is not well-formed. It is just an unchecked string.
  • The encoding stated in the xml-declaration is important only to mark the actual file encoding when written to a disk or when handled as byte stream.
  • You can write encoding="x" and store the file with an encoding of y - but you shouldn't.
  • SQL Server will cast a string with a utf-8 declaration to XML when it is VARCHAR, and a string with a utf-16 declaration when it is NVARCHAR, but you cannot cross this (Read this related answer). Other encodings will very likely lead to an "unable to switch the encoding" error.
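
To see the casting rules above for yourself, something like the following might help. It is only a sketch: the variable names and the sample literals are made up for illustration, and the exact error text may differ between versions.

```sql
-- Sketch only: variable names and literals are hypothetical.
DECLARE @v VARCHAR(200)  =  '<?xml version="1.0" encoding="utf-8"?><root>test</root>';
DECLARE @n NVARCHAR(200) = N'<?xml version="1.0" encoding="utf-16"?><root>test</root>';
DECLARE @x NVARCHAR(200) = N'<?xml version="1.0" encoding="utf-8"?><root>test</root>';

-- Matching pairs are accepted, but the declaration is dropped from the XML value.
SELECT CAST(@v AS XML) AS VarcharUtf8, CAST(@n AS XML) AS NvarcharUtf16;

-- Crossing the pair (NVARCHAR with a utf-8 declaration) is expected to fail.
BEGIN TRY
    SELECT CAST(@x AS XML) AS Crossed;
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrNo, ERROR_MESSAGE() AS ErrMsg;
END CATCH;
```

The first SELECT should return only the root element in both columns, which shows that the declaration does not survive the cast.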

About your code

  • You should change @SQLStr and @cmd to NVARCHAR(MAX), otherwise you might get into trouble with non-plain-Latin characters.
  • As you are using a CURSOR, you should fill an XML-typed variable with the result of your statement, cast this to NVARCHAR(MAX) and add the declaration to this string. Do not cast the result back to XML.
  • Read the BCP docs. Specifying -w writes Unicode (wide, i.e. UTF-16), which is not UTF-8 (what you write into the declaration has no effect here).
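
The second point could look roughly like this. It is a minimal sketch, not your actual cursor code: @xml and @out are hypothetical names, and the inline row just mirrors the kind of data you seem to export.

```sql
-- Sketch only: @xml and @out are hypothetical names; the row values are sample data.
DECLARE @xml XML =
(
    SELECT 12 AS [Id], 'Zhuk' AS [FirstName], 'Termik' AS [LastName]
    FOR XML PATH('Body'), TYPE
);

-- Add the declaration on string level and keep the result as a string.
-- Do not cast @out back to XML, or the declaration is lost again.
DECLARE @out NVARCHAR(MAX) =
    N'<?xml version="1.0" encoding="UTF-8"?>' + CAST(@xml AS NVARCHAR(MAX));

SELECT @out AS XmlWithDeclaration;
```

Keep in mind that the declaration in @out is only a promise. It becomes true when the string is actually written to disk as UTF-8, which depends on the export tool and its switches, not on the declaration.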

Hint:

Read this related answer, showing utf-8 export with BCP on SQL-Server 2016

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO


CREATE PROCEDURE [dbo].[MyXMLTest]
@FileDestinationDir VARCHAR(2000)

-- to call procedure specify your own file path 
-- EXEC [Audit_DBA].[dbo].[MyXMLTest] 'E:\NLP\GovwinIQ_Ontology\NewFolder'

AS 

SET QUOTED_IDENTIFIER ON

IF OBJECT_ID (N'InputTemp.dbo.XMLTest', N'U') IS NOT NULL
DROP TABLE InputTemp.dbo.XMLTest;

CREATE TABLE InputTemp.dbo.XMLTest

(
[Id] INT NOT NULL,
[FirstName] VARCHAR(100) NOT NULL,
[LastName] VARCHAR(100) NOT NULL,
[Address] VARCHAR(100) NOT NULL
);

INSERT INTO InputTemp.dbo.XMLTest ([Id], [FirstName], [LastName], [Address])
VALUES (12, 'Zhuk', 'Termik', '123 Gam Str, Boston, NY');

--SELECT * FROM InputTemp.dbo.XMLTest

DECLARE @FilePath VARCHAR(4000)

DECLARE @SQLStr NVARCHAR(4000),
        @Cmd NVARCHAR(4000),
        @Ret INT

DECLARE @Id INT;

SELECT @Id = 12;

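-- Build the query BCP will run. The declaration is prepended on string level,
-- so the XML is cast to NVARCHAR(MAX) and never cast back to XML.
-- CAST(@Id AS NVARCHAR(12)) avoids the left-padding that str() would add.
SELECT @SQLStr = 
'SELECT N''<?xml version=''''1.0'''' encoding=''''UTF-8''''?>'' + (SELECT CAST((SELECT [Id], [FirstName], [LastName], [Address] FROM InputTemp.dbo.XMLTest AS Body WHERE Id = ' + CAST(@Id AS NVARCHAR(12)) + ' FOR XML AUTO, ELEMENTS) AS NVARCHAR(MAX)))'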

SELECT @SQLStr AS SQLStr

SELECT @FilePath = @FileDestinationDir+'\NewFolder'+ltrim(rtrim(str(@Id)))+'.xml' 

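-- BCP switches: -c character mode, -C65001 UTF-8 code page (needs SQL Server 2016, or 2014 with SP2),
-- -r "" no row terminator, -T trusted connection, -S target server.
SELECT @Cmd = ' bcp " ' + @SQLStr + '" queryout '+@FilePath+' -c  -C65001 -r "" -T -S ' +@@ServerName 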

EXEC @Ret = master.dbo.xp_cmdshell @Cmd 

IF OBJECT_ID (N'InputTemp.dbo.XMLTest', N'U') IS NOT NULL
DROP TABLE InputTemp.dbo.XMLTest;

GO