How to remove non UTF-8 characters from text file

青春惊慌失措 2020-11-28 20:16

I have a bunch of Arabic, English, and Russian files that are encoded in UTF-8. When I try to process these files with a Perl script, I get this error:

Malformed U         


        
3 answers
  •  庸人自扰
    2020-11-28 20:39

    This command:

    iconv -f utf-8 -t utf-8 -c file.txt
    

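    will write a cleaned-up copy of your file to standard output, dropping every invalid byte sequence. The original file is not changed, so redirect the output to a new file to keep the result.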

    -f is the source encoding
    -t is the target encoding
    -c discards characters that cannot be converted (here, any invalid UTF-8 sequence)
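
If iconv is not available, the same idea can be sketched in Python. This is not what iconv does internally, only an approximation of `-c` using Python's built-in `errors="ignore"` decoding; the function names `strip_invalid_utf8` and `clean_file` and the file paths are made up for illustration.

```python
def strip_invalid_utf8(data: bytes) -> bytes:
    """Return data with every byte sequence that is not valid UTF-8 removed."""
    # errors="ignore" silently drops bytes that cannot be decoded as UTF-8;
    # re-encoding the resulting text always yields valid UTF-8.
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def clean_file(src_path: str, dst_path: str) -> int:
    """Write a cleaned copy of src_path to dst_path (hypothetical paths).

    Returns the number of bytes that were dropped. The whole file is read
    into memory, which is an assumption that it is reasonably small.
    """
    with open(src_path, "rb") as src:
        raw = src.read()
    cleaned = strip_invalid_utf8(raw)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(raw) - len(cleaned)
```

For example, `strip_invalid_utf8(b"abc\xffdef")` returns `b"abcdef"`, while valid Arabic or Russian text passes through unchanged. Writing to a separate output file, as with redirecting iconv's output, keeps the original available in case the dropped bytes turn out to matter.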
    
