How to remove non UTF-8 characters from text file

青春惊慌失措 2020-11-28 20:16

I have a bunch of Arabic, English, and Russian files that are encoded in UTF-8. When I try to process these files with a Perl script, I get this error:

Malformed U

3 Answers
  •  萌比男神i
    2020-11-28 20:22

    Your method must read the file byte by byte and fully account for how each character is constructed from its bytes. The simplest method is to use an editor that will read anything but only output UTF-8 characters. TextPad is one choice.
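
As a rough illustration of the byte-level filtering described above, here is a minimal sketch written in Python rather than Perl, purely for illustration. It assumes the whole file fits in memory, and the names `strip_invalid_utf8` and `clean_file` are hypothetical helpers, not part of the original answer. Rather than walking the bytes by hand, it leans on Python's built-in UTF-8 decoder with `errors="ignore"`, which drops any bytes that do not form a valid UTF-8 sequence.

```python
def strip_invalid_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, silently dropping bytes that are not valid UTF-8."""
    # errors="ignore" skips malformed sequences instead of raising UnicodeDecodeError.
    return data.decode("utf-8", errors="ignore")


def clean_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write only its valid UTF-8 text to dst_path."""
    # Open in binary mode so no decoding happens before the filter runs.
    with open(src_path, "rb") as src:
        raw = src.read()
    # newline="" keeps the original line endings untouched on write.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(strip_invalid_utf8(raw))
```

For example, `strip_invalid_utf8(b"abc\xffdef")` returns `"abcdef"`, because the byte `0xFF` can never appear in valid UTF-8. Dropping bytes silently loses data, so it may be worth using `errors="replace"` instead when you want to see where the bad bytes were.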
