How do I efficiently parse a CSV file in Perl?

独厮守ぢ 2020-11-27 19:19

I'm working on a project that involves parsing a large CSV-formatted file in Perl and am looking to make things more efficient.

My approach has been to split(

6 Answers
  •  执念已碎
    2020-11-27 20:18

    The right way to do it -- by an order of magnitude -- is to use Text::CSV_XS. It will be much faster and much more robust than anything you're likely to do on your own. If you're determined to use only core functionality, you have a couple of options depending on speed vs robustness.
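    For reference, a minimal Text::CSV_XS sketch (assuming the module is installed from CPAN; the filename here is illustrative):

    ```perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    # binary => 1 copes with embedded newlines and non-ASCII data;
    # auto_diag => 1 dies with a useful message on malformed input
    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

    my $file = 'somefile.csv';    # illustrative filename
    open my $fh, '<', $file or die "Can't read file '$file' [$!]\n";

    my @data;
    while (my $row = $csv->getline($fh)) {
        push @data, $row;         # $row is an arrayref of fields, quoting already handled
    }
    close $fh;
    ```

    Quoted fields, embedded commas, and embedded newlines all come back as single fields, which is exactly where hand-rolled parsers break down.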

    About the fastest you'll get for pure-Perl is to read the file line by line and then naively split the data:

    my $file = 'somefile.csv';
    my @data;
    open(my $fh, '<', $file) or die "Can't read file '$file' [$!]\n";
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split(/,/, $line);
        push @data, \@fields;
    }
    

    This will fail if any fields contain embedded commas. A more robust (but slower) approach would be to use Text::ParseWords. To do that, replace the split with this:

        use Text::ParseWords;
        my @fields = Text::ParseWords::parse_line(',', 0, $line);
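
    Unlike a naive split, parse_line understands quoted fields, so embedded commas survive. A quick self-contained check (the sample data is illustrative):

    ```perl
    use strict;
    use warnings;
    use Text::ParseWords qw(parse_line);

    # A line with an embedded comma inside quotes -- split(/,/) would break it in two
    my $line   = 'name,"Smith, John",42';
    my @fields = parse_line(',', 0, $line);   # second arg 0 = strip the enclosing quotes

    # @fields is now ('name', 'Smith, John', '42')
    ```

    Text::ParseWords has shipped with Perl's core for a long time, so this needs no CPAN install.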
    
