How do I read UTF-8 with diamond operator (<>)?

暖寄归人 2020-12-04 10:23

I want to read UTF-8 input in Perl, no matter if it comes from the standard input or from a file, using the diamond operator: while(<>){...}.

So m

4 Answers
  •  悲哀的现实
    2020-12-04 10:56

    Try using the open pragma instead:

    use strict;
    use warnings;
    use open qw(:std :utf8);
    
    while(<>){
        my @chars = split //, $_;
        print "$_" foreach(@chars);
    }
    

    You need to do this because the <> operator is magical: it reads from STDIN or from the files named in @ARGV. Reading from STDIN causes no problem, because STDIN is already open when your script starts, so calling binmode on it works. The problem arises with the files in @ARGV: when your script starts and calls binmode, those files are not open yet. The call sets STDIN to UTF-8, but that handle is not used when @ARGV contains files. Instead, the <> operator opens a new file handle for each file in @ARGV, and each of those handles starts with the default layers, so the UTF-8 setting is lost. With the open pragma, every file handle opened in its scope, including the ones <> opens, is read as UTF-8, and :std applies the same layer to STDIN, STDOUT and STDERR.
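    To check this behaviour yourself, here is a minimal sketch. It assumes a throwaway temporary file created with File::Temp; the file and the variable names ($path, $line, $chars) are illustrative only and are not part of the original answer. It writes the two raw bytes that encode "é" in UTF-8, then reads the file back through <> and counts the characters. If the open pragma applies to the handle that <> opens, the two bytes should be decoded into a single character.

```perl
use strict;
use warnings;
use open qw(:std :utf8);        # default layers for handles opened in this scope, plus the std handles
use File::Temp qw(tempfile);

# Hypothetical test file: write the raw UTF-8 bytes for U+00E9 followed by a newline.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':raw');           # make sure the bytes are written untouched
print {$fh} "\xC3\xA9\n";
close $fh or die "close failed: $!";

# Let the diamond operator open the file, as it would for a command-line argument.
local @ARGV = ($path);
my $line = <>;
chomp $line;

# Count characters, not bytes: decoded input gives a shorter string than raw bytes would.
my $chars = length $line;
print "characters read: $chars\n";
```

    If you remove the use open line, the same two bytes should come back undecoded and the count should differ, which is the situation the question describes.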
