Parsing a CSV file using gawk

感动是毒 2020-11-29 12:12

How do you parse a CSV file using gawk? Simply setting FS=\",\" is not enough, as a quoted field with a comma inside will be treated as multiple fields.
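To make the problem concrete, here is a minimal sketch that splits one line naively on commas. The sample line and the variable names are made up for illustration, and plain awk is used since gawk behaves the same way with this separator.

```shell
# Made-up sample line: the third field contains a comma inside quotes.
line='one,two,"three, four",five'

# Split naively on every comma, as FS="," does.
naive_count=$(printf '%s\n' "$line" | awk -F, '{ print NF }')

# The quoted field is cut in two, so awk reports 5 fields instead of 4.
echo "fields seen with FS=\",\": $naive_count"
```

A CSV-aware parser has to treat the comma between the quotes as data rather than as a separator.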

9 answers
  •  刺人心 (original poster)
    2020-11-29 12:54

    Perl has the Text::CSV_XS module which is purpose-built to handle the quoted-comma weirdness.
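    Alternatively, try the Text::CSV module, a wrapper with the same interface that falls back to a pure-Perl implementation when Text::CSV_XS is not installed.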

    perl -MText::CSV_XS -ne 'BEGIN{$csv=Text::CSV_XS->new()} if($csv->parse($_)){@f=$csv->fields();for $n (0..$#f) {print "field #$n: $f[$n]\n"};print "---\n"}' file.csv

    Produces this output:

    field #0: one
    field #1: two
    field #2: three, four
    field #3: five
    ---
    field #0: six, seven
    field #1: eight
    field #2: nine
    ---
    

    Here's a human-readable version.
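    Save it as parsecsv, make it executable with "chmod +x parsecsv", and run it as "./parsecsv file.csv" (or just "parsecsv file.csv" if its directory is on your PATH).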

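    #!/usr/bin/perl
    use warnings;
    use strict;
    use Text::CSV_XS;

    # binary => 1 lets fields contain non-ASCII or embedded special characters
    my $csv = Text::CSV_XS->new({ binary => 1 });
    open(my $data, '<', $ARGV[0]) or die "Could not open '$ARGV[0]': $!\n";
    while (my $line = <$data>) {
        if ($csv->parse($line)) {
            # fields() returns the parsed columns with the quotes removed
            my @f = $csv->fields();
            for my $n (0..$#f) {
                print "field #$n: $f[$n]\n";
            }
            print "---\n";
        } else {
            # Report unparsable lines instead of skipping them silently
            warn "Could not parse line $. (" . $csv->error_diag() . ")\n";
        }
    }
    close($data);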
    

    You may need to point to a different version of perl on your machine, since the Text::CSV_XS module may not be installed for your default perl. When it is missing, perl fails with an error like this:

    Can't locate Text/CSV_XS.pm in @INC (@INC contains: /home/gnu/lib/perl5/5.6.1/i686-linux /home/gnu/lib/perl5/5.6.1 /home/gnu/lib/perl5/site_perl/5.6.1/i686-linux /home/gnu/lib/perl5/site_perl/5.6.1 /home/gnu/lib/perl5/site_perl .).
    BEGIN failed--compilation aborted.
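
One way to check in advance, sketched as a small shell snippet: it asks the perl found first on your PATH to load the module and reports the result. The variable name csv_xs_status is only illustrative.

```shell
# Try to load Text::CSV_XS with the perl found first on PATH.
# "-e 1" runs a trivial program, so only the module load can fail.
if perl -MText::CSV_XS -e 1 2>/dev/null; then
    csv_xs_status="installed"
else
    csv_xs_status="missing"
fi
echo "Text::CSV_XS is $csv_xs_status for $(command -v perl)"
```

To test another interpreter, replace perl in the snippet with its full path.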
    

    If none of your versions of Perl has Text::CSV_XS installed, you'll need to install it, for example on Debian or Ubuntu:
    sudo apt-get install cpanminus
    sudo cpanm Text::CSV_XS
