How to sort an array of UTF-8 strings?

刺人心 2020-11-27 04:32

I currently have no clue how to sort an array that contains UTF-8 encoded strings in PHP. The array comes from an LDAP server, so sorting via a database (would be no probl

8 answers
  •  爱一瞬间的悲伤
    2020-11-27 04:53

    Using your example with codepage 1252 worked perfectly fine here on my Windows development machine.

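    $array = array('Birnen', 'Äpfel', 'Ungetüme', 'Apfel', 'Ungetiere', 'Österreich');
    // Passing "0" only queries the current collation locale, so it can be restored later.
    $oldLocal = setlocale(LC_COLLATE, "0");
    // Switch to German collation with Windows codepage 1252.
    var_dump(setlocale(LC_COLLATE, 'German_Germany.1252'));
    usort($array, 'strcoll');
    // Restore the previous collation locale.
    var_dump(setlocale(LC_COLLATE, $oldLocal));
    var_dump($array);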
    

    ...snip...

    This was with PHP 5.2.6, by the way.


    The above example is wrong: it uses a single-byte encoding (codepage 1252) instead of UTF-8. I traced the strcoll() calls, and look what I found:

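    // Wrapper around strcoll() that prints both arguments and the result of each comparison.
    function traceStrColl($a, $b) {
        $outValue = strcoll($a, $b);
        echo "$a $b $outValue\r\n";
        return $outValue;
    }

    $array = array('Birnen', 'Äpfel', 'Ungetüme', 'Apfel', 'Ungetiere', 'Österreich');
    // Same sort as before, but with codepage 65001 (UTF-8) instead of 1252.
    setlocale(LC_COLLATE, 'German_Germany.65001');
    usort($array, 'traceStrColl');
    print_r($array);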
    

    gives:

    Ungetüme Äpfel 2147483647
    Ungetüme Birnen 2147483647
    Ungetüme Apfel 2147483647
    Ungetüme Ungetiere 2147483647
    Österreich Ungetüme 2147483647
    Äpfel Ungetiere 2147483647
    Äpfel Birnen 2147483647
    Apfel Äpfel 2147483647
    Ungetiere Birnen 2147483647
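
    Every comparison in the trace returns the same value regardless of the arguments, so usort() gets no usable ordering. A minimal sketch of a guard against that, assuming nothing about why strcoll() misbehaves: check that setlocale() accepted the locale and that strcoll() passes a basic sanity check, and otherwise fall back to plain byte order. The helper names collationLooksUsable() and sortWithLocale() are made up for this illustration.

    ```php
    <?php
    // Hypothetical helper: true only if the locale was accepted and strcoll()
    // behaves like a comparison function on two plain ASCII strings.
    function collationLooksUsable($locale)
    {
        if (setlocale(LC_COLLATE, $locale) === false) {
            return false; // locale name not accepted on this system
        }
        // Equal strings must compare as 0; swapping the arguments must flip the sign.
        return strcoll('Apfel', 'Apfel') === 0
            && strcoll('Apfel', 'Birnen') < 0
            && strcoll('Birnen', 'Apfel') > 0;
    }

    // Hypothetical helper: sort with strcoll() if it looks usable, else by bytes.
    function sortWithLocale(array $items, $locale)
    {
        $previous = setlocale(LC_COLLATE, '0'); // "0" only queries the current locale
        if (collationLooksUsable($locale)) {
            usort($items, 'strcoll');
        } else {
            sort($items, SORT_STRING); // byte-order fallback, not language-aware
        }
        setlocale(LC_COLLATE, $previous); // restore whatever was active before
        return $items;
    }

    $array = array('Birnen', 'Äpfel', 'Ungetüme', 'Apfel', 'Ungetiere', 'Österreich');
    print_r(sortWithLocale($array, 'German_Germany.65001'));
    ```

    The fallback keeps the script from relying on broken comparisons, but it sorts the umlauts after all ASCII letters, so it is a safety net rather than a fix.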

    I did find some bug reports that had been flagged as bogus. Your best bet is probably to file a bug report, though.
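
    In the meantime, one alternative that avoids setlocale() and strcoll() altogether is the Collator class from PHP's intl extension, which compares UTF-8 strings using ICU rules. This is only a sketch and was not tested in this thread: it assumes the intl extension is loaded (it is bundled from PHP 5.3 on, so not with a stock 5.2.6), and both the 'de_DE' locale and the sortUtf8() name are my own choices for illustration.

    ```php
    <?php
    // Hypothetical helper: sort UTF-8 strings with ICU collation rules.
    // Requires the intl extension; 'de_DE' is an assumed default locale.
    function sortUtf8(array $items, $locale = 'de_DE')
    {
        if (!class_exists('Collator')) {
            throw new RuntimeException('The intl extension is not loaded.');
        }
        $collator = new Collator($locale);
        $collator->sort($items); // sorts in place and re-indexes the array
        return $items;
    }

    $array = array('Birnen', 'Äpfel', 'Ungetüme', 'Apfel', 'Ungetiere', 'Österreich');
    if (class_exists('Collator')) {
        print_r(sortUtf8($array));
    }
    ```

    With ICU's standard German rules this should give Apfel, Äpfel, Birnen, Österreich, Ungetiere, Ungetüme. The source file itself has to be saved as UTF-8 for the string literals to be compared correctly.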
