SOLVED: Hash content access is inconsistent with different Perl versions

喜你入骨 posted on 2020-03-23 08:19:47

Question


I came across an interesting problem with the following piece of code in perl 5.22.1 and perl 5.30.0:

use strict;
use warnings;
use feature 'say';

#use Data::Dumper;

my %hash;
my %seen;
my @header = split ',', <DATA>;

chomp @header;

while(<DATA>) {
    next if /^\s*$/;
    chomp;
    my %data;
    @data{@header} = split ',';

    push @{$hash{person}}, \%data;
    push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
    if( ! $seen{$data{Position}} ) {
        $seen{$data{Position}} = 1;
        push @{$hash{Role}}, $data{Position};
    }
}

#say Dumper($hash{Position});

my $count = 0;
for my $person ( @{$hash{person}} ) {
    say "Person: $count";
    say "Role: $person->{Position}";
}

say "---- Groups ----\n";

while( my($p,$m) = each %{$hash{Position}} ) {
    say "-> $p";
    my $members = join(',',@{$m});
    say "-> Members: $members\n";
}

say "---- Roles ----";

say '-> ' . join(', ',@{$hash{Role}});

__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer

If the code is run as is, everything works fine.

Now it is sufficient to add a $count++ increment, as below, and the code produces errors:

my $count = 0;
for my $person ( @{$hash{person}} ) {
    $count++;
    say "Person: $count";
    say "Role: $person->{Position}";
}

Errors:

Error(s), warning(s):
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 24, <DATA> line 2.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 3.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 22, <DATA> line 4.
Use of uninitialized value $data{"Position"} in hash element at source_file.pl line 23, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in concatenation (.) or string at source_file.pl line 35, <DATA> line 4.
Use of uninitialized value in join or string at source_file.pl line 48, <DATA> line 4.

This problem does not manifest itself in perl 5.30.0 (Windows 10, Strawberry Perl) or Perl v5.24.2.

Note: the problem manifests itself not only with $count++ but with any other access to the content of the hash next to say "Person: $count"; -- post# 60653651

I would like to hear comments on this situation. What is the cause?

CAUSE: the input data has DOS-style line endings (\r\n). When the data is processed on Linux, chomp removes only the \n, leaving the \r as part of the last field name (which is used as a hash key). Thanks go to Shawn for pointing out the source of the issue.
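
To make the cause concrete, here is a minimal sketch that simulates a header line with a DOS line ending as a string literal (an assumption for illustration; the original reads it from the file). It shows that after chomp the last field still carries the \r, so the key "Position" does not exist in the hash:

```perl
use strict;
use warnings;

# Simulate a header line read from a file with DOS line endings.
my $line = "First,Last,Position\r\n";

chomp(my $chomped = $line);            # removes only "\n" (the default $/)
my @header = split /,/, $chomped;      # last element is "Position\r"

my %data;
@data{@header} = ('John', 'Doe', 'Developer');

my $has_plain_key = exists $data{Position}     ? 1 : 0;
my $has_cr_key    = exists $data{"Position\r"} ? 1 : 0;

print "plain key exists:   $has_plain_key\n";
print "key with CR exists: $has_cr_key\n";
```

Every later lookup of $data{Position} therefore returns undef, which is what triggers the "uninitialized value" warnings shown above.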

SOLUTION: a universal fix was implemented in the form of a snip_eol($arg) subroutine:

use strict;
use warnings;
use feature 'say';

my $debug = 0;

say "
Perl:  $^V
OS: $^O
-------------------
" if $debug;

my %hash;
my %seen;
my @header = split ',', <DATA>;

$header[2] = snip_eol($header[2]);        # problem fix

while(<DATA>) {
    next if /^\s*$/;

    my $line = snip_eol($_);              # problem fix

    my %data;
    @data{@header} = split ',',$line;

    push @{$hash{person}}, \%data;
    push @{$hash{Position}{$data{Position}}}, "$data{First} $data{Last}";
    if( ! $seen{$data{Position}} ) {
        $seen{$data{Position}} = 1;
        push @{$hash{Role}}, $data{Position};
    }
}

#say Dumper($hash{Position});

my $count = 0;
for my $person ( @{$hash{person}} ) {
    $count++;
    say "-> Name:   $person->{First} $person->{Last}";
    say "-> Role:   $person->{Position}\n";
}

say "---- Groups ----\n";

while( my($p,$m) = each %{$hash{Position}} ) {
    say "-> $p";
    my $members = join(',',@{$m});
    say "-> Members: $members\n";
}

say "---- Roles ----";

say '-> ' . join(', ',@{$hash{Role}});

sub snip_eol {
    my $data = shift;                      # problem fix

    #map{ say "$_ => " . ord } split '', $data if $debug;
    $data =~ s/\r?\n\z//;                  # strip a trailing LF or CRLF on any OS
    #map{ say "$_ => " . ord } split '', $data if $debug;

    return $data;
}

__DATA__
First,Last,Position
John,Doe,Developer
Mary,Fox,Manager
Anna,Gulaby,Developer

Answer 1:


I can replicate this behavior on Linux by first converting the source file to have Windows-style \r\n line endings and then trying to run it. I thus suspect that in your testing of various versions you're using Windows sometimes and Linux/Unix other times, without converting the file's line endings appropriately.

chomp only removes a newline character (well, the current value of $/, to be pedantic), so when used on a string with a Windows-style line ending, it leaves the carriage return. The hash key is not "Position", it's "Position\r", which is not what the rest of your code uses.
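
One way to spot such an invisible character is to dump the hash with Data::Dumper and its Useqq option, which prints control characters as escapes. The following is only a sketch with made-up values (1, 2, 3) and a simulated input line. It also shows one possible way to strip either line ending with an explicit substitution instead of relying on chomp:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq    = 1;   # print control characters as escapes such as \r
$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

# Simulated header line with a DOS line ending (illustrative input).
my $line = "First,Last,Position\r\n";

# Before: chomp alone leaves the carriage return on the last field.
chomp(my $raw = $line);
my %before;
@before{ split /,/, $raw } = (1, 2, 3);
my $dump_before = Dumper(\%before);
print $dump_before;

# After: strip a trailing LF or CRLF explicitly.
(my $clean = $line) =~ s/\r?\n\z//;
my %after;
@after{ split /,/, $clean } = (1, 2, 3);
my $dump_after = Dumper(\%after);
print $dump_after;
```

In the first dump the last key appears with a visible \r escape; in the second it is plain "Position".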



Source: https://stackoverflow.com/questions/60689443/solved-hash-content-access-is-inconsistent-with-different-perl-version
