Replace Unicode character

猫巷女王i 2020-12-11 21:26

I am trying to replace a certain character in a string with another. They are quite obscure Latin characters. I want to replace the character with hex code 259 with the one with hex code 4d9, so I tried th

2 Answers
  • 2020-12-11 22:12

    U+0259 Latin Small Letter Schwa is encoded as the byte sequence 0x02,0x59 only in UTF-16BE. You are very unlikely to be working with byte strings in UTF-16BE: it is not an ASCII-compatible encoding, and almost no one uses it.

    The encoding you want to be working with is UTF-8: it is an ASCII superset and, because it covers all Unicode characters, it supports both Latin Schwa (U+0259) and Cyrillic Schwa (U+04D9). Ensure your input is in UTF-8 (if it comes from form data, serve the page containing the form as UTF-8). In UTF-8, U+0259 is represented by the byte sequence 0xC9,0x99 and U+04D9 by 0xD3,0x99:

    $string = str_replace("\xC9\x99", "\xD3\x99", $string);
    

    If you save your .php file as UTF-8 without a BOM in your text editor, you can skip the escaping and write the characters directly:

    $string = str_replace('ə', 'ә', $string);
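
    To double-check the byte values quoted above, here is a minimal sketch. It assumes PHP 7 or later, where a "\u{...}" escape in a double-quoted string expands to the UTF-8 bytes of that code point. The sample word and the variable names $replaced and $count are made up for illustration.

    ```php
    <?php
    // "\u{...}" yields UTF-8, so these comparisons check the byte sequences.
    var_dump("\u{0259}" === "\xC9\x99");   // Latin schwa, U+0259
    var_dump("\u{04D9}" === "\xD3\x99");   // Cyrillic schwa, U+04D9

    // Hypothetical sample input, written with escapes so the
    // encoding of this source file does not matter.
    $string = "az\u{0259}rbaycan";

    // str_replace works on raw bytes and returns the new string;
    // the optional fourth argument receives the number of replacements.
    $replaced = str_replace("\u{0259}", "\u{04D9}", $string, $count);

    echo bin2hex($string), "\n";     // hex dump of the original bytes
    echo bin2hex($replaced), "\n";   // hex dump after the replacement
    echo $count, "\n";               // how many occurrences were replaced
    ```

    Comparing the two hex dumps shows whether the c9 99 pair was swapped for d3 99, which is handy when the characters look identical on screen.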
    
  • 2020-12-11 22:15

    A couple of suggestions. Firstly, str_replace returns the modified string rather than changing its argument, so remember to assign the result back to $string, i.e.:

    $string = str_replace("\x02\x59","\x04\xd9",$string);
    

    Secondly, verify that your byte sequence actually occurs in $string. I mention this because your search string begins with a low byte (0x02), which is how the character looks in UTF-16BE. If $string is UTF-8 encoded, those bytes will not be present and nothing will be replaced.
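
    One rough way to run that check is sketched below. The helper name schwaEncoding is hypothetical, and the test is only a byte-level heuristic: it looks for the two candidate byte pairs and does not validate the string's encoding as a whole, so a chance match is possible in arbitrary binary data.

    ```php
    <?php
    // Hypothetical helper: report which encoding of U+0259 (if any)
    // appears in the raw bytes of $s.
    function schwaEncoding(string $s): string
    {
        // UTF-8 form of U+0259 is checked first.
        if (strpos($s, "\xC9\x99") !== false) {
            return 'UTF-8';
        }
        // UTF-16BE form of U+0259.
        if (strpos($s, "\x02\x59") !== false) {
            return 'UTF-16BE';
        }
        return 'none';
    }

    echo schwaEncoding("az\xC9\x99rbaycan"), "\n";
    ```

    If this reports UTF-8, searching for "\x02\x59" will never match, and the UTF-8 byte sequences are the ones to use in str_replace.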
