Best practices in PHP and MySQL with international strings

一个人的身影 2020-12-08 03:22

It often happens that characters such as é get transformed to Ã©, even though the collation for the MySQL DB, table and field is set to utf8_general_ci. T

5 answers
  •  时光取名叫无心
    2020-12-08 04:01

    Regardless of the language it's written in, if you are building an app that accepts a wide array of encodings, handle the problem in pieces:

    • Identify the encoding
      • you need some way to find out which encoding you're dealing with; otherwise there's little point in going further, because you'll end up with junk characters.
    • Handle your bytes
      • think of these strings less like 'strings' of characters, and more like lists of bytes
      • PHP is especially sneaky. Don't let it truncate your data on the fly. If you're running a regex against a UTF-8 string, make sure you identify it as such (in PHP's preg_* functions, with the /u pattern modifier).
    • Store for the lowest common denominator (LCD)
      • Again, you don't want to truncate data. If you're storing a sentence in English, can you also store a set of Mandarin glyphs? How about Arabic? Which of these is going to require the most space? Account for it.
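
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes the mbstring extension and PHP 7.4 or later, that the source file is saved as UTF-8, and that any input which is not valid UTF-8 is ISO-8859-1. The helper names toUtf8() and maxBytesPerChar() are made up for this example.

```php
<?php
// Sketch only. Assumes mbstring, PHP 7.4+, and a source file saved as UTF-8.
// toUtf8() and maxBytesPerChar() are hypothetical helpers, not library functions.

// Step 1: identify the encoding, then normalise everything to UTF-8.
function toUtf8(string $bytes, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8, leave it alone
    }
    // Assumption: input that is not valid UTF-8 is encoded in $fallback.
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

// Step 3 helper: the widest character, in bytes, that a UTF-8 string contains.
function maxBytesPerChar(string $utf8): int
{
    if ($utf8 === '') {
        return 0;
    }
    return max(array_map('strlen', mb_str_split($utf8, 1, 'UTF-8')));
}

// Step 2: handle bytes, not "characters".
$word = toUtf8("caf\xE9");                // ISO-8859-1 bytes for "café"
var_dump(strlen($word));                  // 5 bytes
var_dump(mb_strlen($word, 'UTF-8'));      // 4 characters
var_dump(preg_match('/^.{4}$/', $word));  // 0: without /u, "." matches one byte
var_dump(preg_match('/^.{4}$/u', $word)); // 1: with /u, "." matches one character

// Byte-based truncation can cut a character in half.
var_dump(mb_check_encoding(substr($word, 0, 4), 'UTF-8'));             // false
var_dump(mb_check_encoding(mb_substr($word, 0, 4, 'UTF-8'), 'UTF-8')); // true

// Step 3: see how much space each script needs before sizing a column.
foreach (['hello', '你好', 'مرحبا', '😀'] as $sample) {
    printf("%s: %d bytes, widest character %d bytes\n",
        $sample, strlen($sample), maxBytesPerChar($sample));
}
```

The last loop is the "which needs the most space" question in practice. Note that MySQL's utf8 character set stores at most 3 bytes per character, so a 4-byte character such as the emoji above needs utf8mb4.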
