I'm generating an XML file with PHP using DOMDocument and I need to handle Asian characters. I'm pulling data from the MSSQL 2008 server using the pdo_mssql driver and I app
I found out how to solve it, so hopefully this will be helpful to someone.
First, SQL_Latin1_General_CP1_CI_AS is a collation based on code page 1252 (CP-1252) for varchar data; nvarchar data is stored as UCS-2 (UTF-16LE), not UTF-8. The basic characters are CP-1252, which is why all I had to do was convert them to UTF-8 and everything worked. The Asian and other non-CP-1252 characters are encoded on 2 bytes, and the PHP pdo_mssql driver seems to hate them, so it seems to do a CAST to varchar (instead of nvarchar), and then every character with no CP-1252 equivalent becomes a question mark ('?').
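To make the byte-level picture concrete, here is a minimal sketch of what happens to a short string. It assumes the nvarchar bytes are UTF-16LE and that the mbstring extension is available; the sample text and both function names are hypothetical, and the second function only imitates the lossy cast in PHP rather than reproducing what the driver does internally.

```php
<?php
// Illustrative sketch only: function names and sample data are hypothetical.

/** Decode the raw bytes of an nvarchar value (assumed UTF-16LE) to UTF-8. */
function nvarcharBytesToUtf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');
}

/** Imitate a lossy cast to a CP-1252 varchar: unmappable characters become '?'. */
function simulateVarcharCast(string $utf8): string
{
    $previous = mb_substitute_character();
    mb_substitute_character(0x3F); // use '?' for characters CP-1252 cannot represent
    $cp1252 = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
    mb_substitute_character($previous); // restore the earlier setting
    return $cp1252;
}

// "é日本" as UTF-16LE bytes: U+00E9, U+65E5, U+672C
$raw = "\xE9\x00\xE5\x65\x2C\x67";

$utf8  = nvarcharBytesToUtf8($raw);  // intact text, now UTF-8
$lossy = simulateVarcharCast($utf8); // what survives a CP-1252 cast

echo bin2hex($utf8), PHP_EOL;  // c3a9e697a5e69cac
echo bin2hex($lossy), PHP_EOL; // e93f3f  ("é" survives, the two kanji become "??")
```

The accented Latin character survives because CP-1252 has a slot for it, while the two kanji have no equivalent and collapse to 0x3F, which matches the question marks described above.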
I fixed it by casting the column to binary in the query and then rebuilding the text in PHP:
SELECT CAST(MY_COLUMN AS VARBINARY(MAX)) FROM MY_TABLE;
In PHP:
// $bin is the raw VARBINARY value returned by the query above
// Binary to hexadecimal
$hex = bin2hex($bin);
//And then from hex to string
$str = "";
for ($i=0;$i