"Illegal byte sequence" error while using shell commands in the macOS bash terminal

Submitted by 十年热恋 on 2019-12-06 03:42:48

Question


I get an "Illegal byte sequence" error while trying to extract the non-English characters from a large file in the macOS bash shell. This is the script I am using:

sed 's/[][a-z,0-9,A-Z,!@#\$%^&*(){}":/_-|. -][\;''=?]*//g' < $1 >Abhineet_extract1.txt;
sed 's/\(.\)/\1\
/g' <Abhineet_extract1.txt | sort | uniq |tr -d '\n' >&1;
rm Abhineet_extract1.txt;

Here is the error I get:

uniq: stdin: Illegal byte sequence

'+?


Answer 1:


The UTF-8 locale appears to be the cause: the input contains bytes that do not form valid UTF-8 characters, and uniq reports them as "Illegal byte sequence".

Instead, run the command with LC_CTYPE set to C:

LC_CTYPE=C your_command

man locale says:

   These environment variables affect each locale categories for all
   locale-aware programs:

   LC_CTYPE

           Character classification and case conversion.
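
As a rough sketch of how this advice might be applied to a pipeline like the one in the question: a variable assignment placed before a command affects only that one command, so each locale-aware stage needs its own prefix. The file names sample.txt, unique_lines.txt and joined.txt are hypothetical, and the sample input is an assumption chosen to contain a byte (0xE9) that is not valid UTF-8 on its own. Note also that LC_ALL, when set, overrides LC_CTYPE, so the second form below uses LC_ALL=C as a broader alternative to what the answer suggests.

```shell
# Hypothetical sample input: the lone byte 0xE9 (octal 351) is not valid UTF-8.
printf '\351\nA\n\351\nA\n' > sample.txt

# Prefix form: the assignment applies only to the command it precedes,
# so every stage that inspects characters gets its own LC_CTYPE=C.
LC_CTYPE=C sort sample.txt | LC_CTYPE=C uniq > unique_lines.txt

# Subshell form: export once so every stage of the pipeline inherits it.
# LC_ALL overrides LC_CTYPE and the other LC_* categories when it is set.
( export LC_ALL=C; sort sample.txt | uniq | tr -d '\n' ) > joined.txt
```

The subshell keeps the exported setting from leaking into the rest of the session. In the C locale every byte is treated as a single character, so the tools no longer try to decode the input as UTF-8.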


Source: https://stackoverflow.com/questions/18953667/illegal-byte-sequence-error-while-using-shell-commands-in-mac-bash-terminal
