Removing extra commas from a CSV file in Perl

庸人自扰 2020-12-21 23:28

I have multiple CSV files, each with a different number of entries and roughly 300 lines.

The first line in each file contains the data labels.



        
2 answers
  •  既然无缘
    2020-12-22 00:24

    A good CSV parser will have no trouble with this since commas are inside the quoted fields, so you can simply parse the file with it.

    A really nice module is Text::CSV_XS, which the wrapper Text::CSV loads by default when it is installed (it falls back to the pure-Perl Text::CSV_PP otherwise). The only thing to address in your data is the spaces between fields, which the CSV specs don't provide for, so the example below sets the allow_whitespace option to handle them.
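As a rough illustration of the two points above, the sketch below parses a single string rather than a file. The sample line is made up for the demonstration (the question's real data is not shown), and it assumes Text::CSV is installed. With allow_whitespace set, the spaces after the separators are dropped and the commas inside the quoted fields stay part of their fields, so this should yield three fields.

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new( { binary => 1, allow_whitespace => 1 } )
    or die "Cannot use CSV: " . Text::CSV->error_diag();

# Hypothetical sample line: quoted fields with embedded commas,
# and a space after each separator
my $line = '"Smith, John", "1,234", plain';

# parse() works on a string; fields() returns what the last parse() found
$csv->parse($line) or die "Parse failed: " . $csv->error_diag();
my @fields = $csv->fields();

print scalar(@fields), " fields\n";
print "[$_]\n" for @fields;
```

Without allow_whitespace the same line would likely be rejected, because a quote that follows a space is not at the start of a field.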

    If you indeed must remove the commas for further work, do that as the parser hands you each line.

    use warnings;
    use strict;
    use feature 'say';
    
    use Text::CSV;
    
    my $file = 'commas_in_fields.csv';
    
    my $csv = Text::CSV->new( { binary => 1, allow_whitespace => 1 } ) 
        or die "Cannot use CSV: " . Text::CSV->error_diag (); 
    
    open my $fh, '<', $file or die "Can't open $file: $!";
    
    my @headers = @{ $csv->getline($fh) };   # if there is a separate header line
    
    while (my $line = $csv->getline($fh)) {  # returns arrayref
        tr/,//d for @$line;                  # delete commas from each field
        say "@$line";
    }
    

    For conciseness, this applies tr to $_ in the statement-modifier for loop; since $_ is aliased to each element in turn, the fields in @$line are changed in place.
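The aliasing can be seen in isolation with a plain array. This is a minimal sketch using made-up field values; the copying variant with the /r modifier is an addition here, not part of the answer's code, and it needs Perl 5.14 or newer.

```perl
use strict;
use warnings;

# Hypothetical field values, as a CSV parser might return them
my @fields = ('1,234', 'Smith, John', 'plain');

# Copying variant: /r returns the changed string and leaves $_ alone,
# so @fields is still untouched after this line (Perl 5.14+)
my @cleaned = map { tr/,//dr } @fields;

print join('|', @fields),  "\n";   # 1,234|Smith, John|plain
print join('|', @cleaned), "\n";   # 1234|Smith John|plain

# In-place variant, as in the answer: $_ aliases each element,
# so deleting commas from $_ edits the array itself
tr/,//d for @fields;

print join('|', @fields), "\n";    # 1234|Smith John|plain
```

The in-place form is fine when the original values are no longer needed; the copying form is safer if the raw fields are still used later.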


    I'd like to repeat and emphasize what others have explained: do not parse CSV by hand, since only trouble awaits; use a library. This is very much akin to parsing XML and similar formats: no regex please, but libraries.
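To make the warning concrete, here is a small sketch of what a hand-rolled split does to a line with commas inside quoted fields. The sample line is hypothetical; the point is only that the field count comes out wrong.

```perl
use strict;
use warnings;

# Hypothetical line with three logical fields, two of them quoted
my $line = '"Smith, John","1,234",plain';

# Naive approach: split on every comma, ignoring the quotes
my @naive = split /,/, $line;

print scalar(@naive), " pieces\n";   # 5 pieces, not 3
print "[$_]\n" for @naive;           # the quoted fields are cut in half
```

A CSV parser treats the quotes as field delimiters, so the same line comes back as three fields.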
