Searching for non-ASCII characters

难免孤独 2021-01-24 10:24

I have a file, a.out, that contains a number of lines. Each line holds exactly one character: either the Unicode character U+2013 (en dash) or a lowercase letter a-z.

3 answers

野性不改 (original poster)

2021-01-24 10:52
I recommend avoiding dodgy grep -P implementations and using the real thing, Perl itself. This works:
```
perl -CSD -nle 'print "$.: $_" if /\P{ASCII}/' utfile1 utfile2 utfile3 ...
```
Where:
- The -CSD option tells Perl to treat both the standard streams (stdin, stdout, stderr) and the files it opens as UTF-8 encoded.
- $. holds the current record (line) number.
- $_ holds the current line.
- \P{ASCII} matches any code point that is not ASCII, that is, anything above U+007F.
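
For readers without Perl at hand, the same logic can be sketched in Python. This is only a rough equivalent under stated assumptions, not the answerer's method: the function name `non_ascii_lines` is made up for illustration, the files are assumed to be valid UTF-8, and the counter is kept running across files to mirror how `$.` behaves when several files are read through `-n`.

```python
import sys


def non_ascii_lines(paths):
    """Yield (record_number, line) for every line containing a non-ASCII character.

    Hypothetical helper: the record counter keeps running across files,
    mirroring Perl's $. when several files are read through -n.
    """
    record = 0
    for path in paths:
        # Assumption: every input file is valid UTF-8 (the role -CSD plays in Perl).
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                record += 1
                text = line.rstrip("\n")
                # str.isascii() is False as soon as one code point is above U+007F.
                if not text.isascii():
                    yield record, text


if __name__ == "__main__":
    for record, text in non_ascii_lines(sys.argv[1:]):
        print(f"{record}: {text}")
```

Saved under a name of your choosing, the script takes the file names as arguments, just like the Perl one-liner. Unlike Perl, `open` raises `UnicodeDecodeError` on input that is not valid UTF-8 instead of carrying on.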