Encoding SQL_Latin1_General_CP1_CI_AS into UTF-8

小鲜肉 2020-12-09 10:50

I'm generating an XML file with PHP using DomDocument and I need to handle Asian characters. I'm pulling data from the MSSQL2008 server using the pdo_mssql driver and I app

7 Answers
  • 2020-12-09 11:17

    You can try this:

    header("Content-Type: text/html; charset=utf-8");
    $dbhost   = "hostname";
    $db       = "database";
    $query = "SELECT *
        FROM Estado
        ORDER BY Nome";
    $conn = new PDO( "sqlsrv:server=$dbhost ; Database = $db", "", "" );
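    $stmt = $conn->prepare( $query, array(
        PDO::ATTR_CURSOR => PDO::CURSOR_SCROLL,
        PDO::SQLSRV_ATTR_CURSOR_SCROLL_TYPE => PDO::SQLSRV_CURSOR_BUFFERED,
        // The encoding has to be passed under its attribute key, not as a bare value
        PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_SYSTEM
    ) );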
    $stmt->execute();
    while ( $row = $stmt->fetch( PDO::FETCH_ASSOC ) )
    {
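    // CP1252 is the Windows Latin-1 code page (assumed to be the system encoding);
    // convert to UTF-8 so the output matches the Content-Type header above
    print iconv("CP1252", "UTF-8", $row['Nome']) . "<br>";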
    }
    
  • 2020-12-09 11:18

    Thanks @SGr for the answer.
    I found a better way to do that:

    SELECT CAST(CAST(MY_COLUMN AS VARBINARY(MAX)) AS VARCHAR(MAX)) as MY_COLUMN FROM MY_TABLE;
    You can also try:
    SELECT CAST(MY_COLUMN AS VARBINARY(MAX)) as MY_COLUMN FROM MY_TABLE;

    Then, in PHP, just convert it to UTF-8:

    $string = iconv('UCS-2LE', 'UTF-8', $row['MY_COLUMN']);
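
    The conversion step can be tried without a database. A minimal sketch, assuming the iconv extension is available; the helper name nvarcharBytesToUtf8 and the sample bytes are made up for illustration. It uses UTF-16LE, which is identical to UCS-2LE for every character in the Basic Multilingual Plane:

```php
<?php
// Hypothetical helper: convert the raw bytes of an NVARCHAR value to UTF-8.
function nvarcharBytesToUtf8(string $bytes): string
{
    $utf8 = iconv('UTF-16LE', 'UTF-8', $bytes);
    if ($utf8 === false) {
        // iconv() returns false when the input cannot be converted
        throw new RuntimeException('Input is not valid UTF-16LE');
    }
    return $utf8;
}

// Made-up sample: "Việt" as UTF-16LE, two bytes per character, low byte first.
$raw = "\x56\x00\x69\x00\xC7\x1E\x74\x00";
echo nvarcharBytesToUtf8($raw), PHP_EOL;
```

    If the result looks wrong, dumping bin2hex() of the fetched value is a quick way to see whether the driver really returned two bytes per character.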

  • 2020-12-09 11:19

    By default, the pdo_sqlsrv driver uses PDO::SQLSRV_ENCODING_UTF8 for sending and receiving data.

    If your current collation is LATIN1, have you tried specifying PDO::SQLSRV_ENCODING_SYSTEM to let PDO know that you want to use the current system encoding instead of UTF-8?

    You could even use PDO::SQLSRV_ENCODING_BINARY, which returns data in binary form (no encoding or translation is done when transferring data). This way, you can handle character encoding on your side.

    More documentation here: http://ca3.php.net/manual/en/ref.pdo-sqlsrv.php
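
    A hedged sketch of the binary option, assuming the pdo_sqlsrv extension is loaded and $conn is an open connection. fetchRawColumn is a made-up helper, and MY_COLUMN / MY_TABLE are placeholders. Passing the encoding under the PDO::SQLSRV_ATTR_ENCODING key in the prepare() options applies it to that statement only:

```php
<?php
// Hypothetical helper: fetch one column with the driver's translation switched off.
// Requires pdo_sqlsrv; the PDO::SQLSRV_* constants do not exist without it.
function fetchRawColumn(PDO $conn): array
{
    $stmt = $conn->prepare(
        'SELECT MY_COLUMN FROM MY_TABLE', // placeholder table and column names
        [PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_BINARY]
    );
    $stmt->execute();

    // Each element is the untranslated byte string for one row
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

    The returned strings still have to be converted on the PHP side, and which source encoding to use depends on the column type, so it is worth checking the raw bytes first.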

  • 2020-12-09 11:29

    I found how to solve it, so hopefully this will be helpful to someone.

    First, SQL_Latin1_General_CP1_CI_AS is a collation, not a mix of CP-1252 and UTF-8. Its varchar columns use code page 1252, which is why converting the basic characters from CP-1252 to UTF-8 was all I had to do for them. Asian and other non-Latin characters only fit in nvarchar columns, which SQL Server stores as UCS-2LE (2 bytes per character). The PHP pdo_mssql driver seems to convert everything to varchar (instead of nvarchar), so every character without a CP-1252 equivalent becomes a question mark ('?').
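
    The question marks can be reproduced without a database: forcing text into a single-byte code page such as CP-1252 replaces every character that has no slot there. A minimal sketch, assuming the mbstring extension is available with its default substitute character; the sample string is made up:

```php
<?php
// "é" exists in CP-1252; the two CJK characters (U+65E5, U+672C) do not.
$utf8 = "caf\u{E9} \u{65E5}\u{672C}";

// Squeeze the text into CP-1252, as a varchar conversion would
$cp1252 = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');

// Converting back cannot restore what was lost
$roundTrip = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');
echo $roundTrip, PHP_EOL;
```

    The accented letter survives the round trip, while each character outside CP-1252 comes back as a '?'. The loss happens before PHP sees the data, so no conversion applied afterwards can recover it.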

    I fixed it by casting the column to binary and then rebuilding the text in PHP:

    SELECT CAST(MY_COLUMN AS VARBINARY(MAX)) FROM MY_TABLE;
    

    In PHP:

    //Binary to hexadecimal ($bin holds the VARBINARY value returned by the query above)
    $hex = bin2hex($bin);
    
    //And then from hex back to a byte string (this reproduces the bytes of $bin)
    $str = "";
    for ($i=0;$i<strlen($hex) -1;$i+=2)
    {
        $str .= chr(hexdec($hex[$i].$hex[$i+1]));
    }
    //And then from UCS-2LE (how the NVARCHAR column is stored in the DB) to UTF-8
    $str = iconv('UCS-2LE', 'UTF-8', $str);
    
  • 2020-12-09 11:32

    For me, none of the above was a direct solution, though I did use parts of them. This worked for me with the Vietnamese alphabet. If you come across this post and none of the above works for you, try:

        $req = "SELECT CAST(MY_COLUMN as VARBINARY(MAX)) as MY_COLUMN FROM MY_TABLE"; 
        $stmt = $conn->prepare($req);
        $stmt->execute();
        while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
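            // The column arrives as a hex string here; pack("H*") turns it into raw bytes
            $str = pack("H*",$row['MY_COLUMN']);
            // Convert the UCS-2LE bytes in $str (the original passed an undefined $z)
            $str = mb_convert_encoding($str, 'HTML-ENTITIES','UCS-2LE');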
            print_r($str);
        }
    

    And a little bonus: I had to json_encode this data and was (duh) getting HTML entities instead of the special characters. To fix this, just call html_entity_decode() on the strings before passing them to json_encode.
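
    As an alternative to the HTML-entities detour, the bytes can be converted straight to UTF-8 before encoding to JSON. This is a sketch of a different route, not the method used above; it assumes the mbstring and json extensions, and the hex value is a made-up sample standing in for the fetched column:

```php
<?php
// Hypothetical sample: "Việt" as the hex string of its UTF-16LE bytes
$hex = '56006900c71e7400';

// Hex string to raw bytes, as with pack("H*") in the loop above
$bytes = pack('H*', $hex);

// UTF-16LE (same as UCS-2LE for these characters) directly to UTF-8
$utf8 = mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE');

// JSON_UNESCAPED_UNICODE keeps the characters readable instead of \uXXXX escapes
$json = json_encode(['name' => $utf8], JSON_UNESCAPED_UNICODE);
echo $json, PHP_EOL;
```

    Without JSON_UNESCAPED_UNICODE the output is still valid JSON; the non-ASCII characters are simply written as \uXXXX escapes.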

  • I know this post is old, but the only thing that worked for me was iconv("CP850", "UTF-8//TRANSLIT", $var); I had the same issue with SQL_Latin1_General_CP1_CI_AI, so maybe it works for SQL_Latin1_General_CP1_CI_AS too.
