MongoDB PHP UTF-8 problems

暖寄归人 2020-12-03 03:44

Assume that I need to insert the following document:

{
    title: 'Péter'
}

(note the é)

It gives me an error when I use the foll

3 answers
  •  悲哀的现实
    2020-12-03 04:08

    JSON and BSON can only encode and decode valid UTF-8 strings. If your data (user input included) is not valid UTF-8, you need to clean it up or convert it before passing it to any JSON-dependent system, like this:

    $string = iconv('UTF-8', 'UTF-8//IGNORE', $string); // discard characters that cannot be converted, or
    $string = iconv('UTF-8', 'UTF-8//TRANSLIT', $string); // approximate them where possible, or even
    $string = iconv('UTF-8', 'UTF-8//TRANSLIT//IGNORE', $string); // not sure how this behaves
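
    A minimal sketch of the first option wrapped in a helper. The function name strip_invalid_utf8() is hypothetical, and the sketch assumes the iconv and mbstring extensions are loaded. Some iconv builds return false on invalid input even with //IGNORE, so it falls back to mb_convert_encoding(), which substitutes a placeholder character instead of dropping the bytes:

```php
<?php
// Hypothetical helper, not part of PHP or the MongoDB driver.
function strip_invalid_utf8(string $value): string
{
    // Already valid UTF-8: return it untouched.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }

    // First option from above: ask iconv to discard what it cannot convert.
    // The @ hides the notice that some iconv builds raise on bad input.
    $clean = @iconv('UTF-8', 'UTF-8//IGNORE', $value);

    // Fallback when iconv gave up or left invalid bytes behind:
    // mbstring replaces each invalid sequence with its substitute character.
    if ($clean === false || !mb_check_encoding($clean, 'UTF-8')) {
        $clean = mb_convert_encoding($value, 'UTF-8', 'UTF-8');
    }

    return $clean;
}

// "\xE9" is "é" in ISO-8859-1; followed by "t" it is not valid UTF-8.
$title = strip_invalid_utf8("P\xE9ter");
var_dump(mb_check_encoding($title, 'UTF-8'));
```

    Whether the bad byte is dropped or replaced depends on which branch runs, so treat the result as "safe to store", not as a faithful copy of the input.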
    

    Personally I prefer the first option; see the iconv() manual page for details. Other alternatives include:

    • mb_convert_encoding()
    • utf8_encode(utf8_decode($string)) (note that both functions are deprecated as of PHP 8.2)
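
    When the source encoding is known, converting is usually better than stripping. A short sketch with mb_convert_encoding(), assuming the legacy bytes are ISO-8859-1 (an assumption for illustration only; the question does not say which encoding the data uses) and that mbstring is loaded:

```php
<?php
// Assumption: the legacy bytes are ISO-8859-1 (Latin-1).
$latin1 = "P\xE9ter";

// Invalid UTF-8 makes json_encode() fail and return false.
var_dump(json_encode(['title' => $latin1]));

// Convert from the known source encoding to UTF-8.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// The single byte 0xE9 becomes the two-byte UTF-8 sequence 0xC3 0xA9.
var_dump($utf8 === "P\xC3\xA9ter");

// The converted string now encodes without errors.
echo json_encode(['title' => $utf8]), "\n";
```

    The third argument must match the real source encoding; guessing wrong produces valid UTF-8 that contains the wrong characters.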

    You should always make sure your strings are UTF-8 encoded, even the user-submitted ones. However, since you mentioned that you're migrating from MySQL to MongoDB, have you tried exporting your current database to CSV and using the import scripts that come with Mongo? They should handle this...
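
    To see which fields of a document would be rejected before it reaches the driver, something like the following could be used. It is only a sketch: find_invalid_utf8() is a hypothetical helper, it assumes mbstring is loaded, and it checks values only, not array keys:

```php
<?php
// Hypothetical helper: returns dotted paths of string values that are not valid UTF-8.
function find_invalid_utf8(array $document, string $prefix = ''): array
{
    $bad = [];
    foreach ($document as $key => $value) {
        // Build a path such as "tags.1" for nested values.
        $path = $prefix === '' ? (string) $key : $prefix . '.' . $key;

        if (is_array($value)) {
            // Recurse into nested arrays (sub-documents and lists).
            $bad = array_merge($bad, find_invalid_utf8($value, $path));
        } elseif (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
            $bad[] = $path;
        }
    }
    return $bad;
}

// Two values contain a lone 0xE9 byte (Latin-1 "é"), which is invalid UTF-8.
$doc = ['title' => "P\xE9ter", 'tags' => ['ok', "caf\xE9"], 'views' => 3];
print_r(find_invalid_utf8($doc));
```

    For the sample document this reports the paths "title" and "tags.1"; an empty result means every string value is valid UTF-8.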


    EDIT: I mentioned that BSON can only handle UTF-8 but was not sure this was exactly true; I had a vague idea that it might use UTF-16 or UTF-32 to encode / decode data. The BSON specification does define strings as UTF-8, so the advice above stands.
