The default encoding for an XML type field defined in SQL Server is UTF-16. I have no trouble inserting UTF-16 encoded XML streams into that field.
But if I t
Is there a way to define a SQL Server column/field as having UTF-8 encoding?
No. The only Unicode encoding SQL Server uses for the NCHAR, NVARCHAR, NTEXT, and XML datatypes is UTF-16 Little Endian. (NTEXT has been deprecated since SQL Server 2005, so don't use it in new development; besides, it sucks compared to NVARCHAR(MAX) anyway.) You do not get a choice of Unicode encodings for these types like some other RDBMSs allow. SQL Server 2019 added UTF-8 collations for CHAR and VARCHAR, but XML data is still stored as UTF-16.
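To see the UTF-16 Little Endian storage for yourself, here is a minimal sketch that dumps the raw bytes of two NVARCHAR values. The variable names and sample values are only illustrations and are not taken from the question; the emoji is built from its surrogate pair with NCHAR() so the script does not depend on how the file is saved.

```sql
-- Illustrative values only; nothing here comes from the original question.
DECLARE @Letter NVARCHAR(10) = N'A';
-- U+1F631 built from its UTF-16 surrogate pair (D83D, DE31).
DECLARE @Emoji NVARCHAR(10) = NCHAR(0xD83D) + NCHAR(0xDE31);

SELECT CONVERT(VARBINARY(10), @Letter) AS LetterBytes,    -- 0x4100: low-order byte first
       CONVERT(VARBINARY(10), @Emoji)  AS EmojiBytes,     -- 0x3DD831DE: two 16-bit code units
       DATALENGTH(@Emoji)              AS EmojiByteCount; -- 4
```

The same character in UTF-8 would be the four bytes F0 9F 98 B1, which is why a UTF-8 byte stream cannot simply be handed to an NVARCHAR value as-is.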
You can insert UTF-8 encoded XML into SQL Server, provided you follow these three rules:
The incoming string must be VARCHAR, not NVARCHAR (NVARCHAR is always UTF-16 Little Endian, hence the error about not being able to switch the encoding).
For example, we can import a UTF-8 encoded XML document containing the screaming face emoji, a Supplementary Character whose UTF-8 byte sequence is F0 9F 98 B1:
SET NOCOUNT ON;
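-- The XML declaration must name UTF-8 as the encoding; the element names are placeholders.
DECLARE @XML XML = '<?xml version="1.0" encoding="utf-8"?><root><test>'
    + CHAR(0xF0) + CHAR(0x9F) + CHAR(0x98) + CHAR(0xB1)
    + '</test></root>';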
SELECT @XML;
PRINT CONVERT(NVARCHAR(MAX), @XML);
Returns (in both "Results" and "Messages" tabs):