Sort CSV based on a certain column?

自闭症患者 2020-12-18 09:50

I'm sure I've done this in the past and there is something small I'm forgetting, but how can I sort a CSV file on a certain column? I'm interested in answers with and wi

7 answers
  •  梦毁少年i
    2020-12-18 10:36

    using Raku (née Perl6)

    This is a fairly quick-and-dirty solution, mainly intended for "hand-rolled" CSV. The code works as long as each row contains exactly one age. It reads the lines into $a, combs for 1 to 3 digits surrounded by commas and assigns the resulting pairs to @b, derives a sorting index $c, then uses $c to reorder the lines in $a:

    ~$ raku -e 'my $a=lines();  my @b=$a.comb(/ \, <(\d**1..3)> \, /).pairs;  my $c=@b.sort(*.values)>>.keys.flat;  $a[$c.flat]>>.put;' sort_age.txt
    name,21,male
    name,24,male
    name,25,female
    name,27,female
    

    I prepended a few dummy lines to the OP's input file to see how the code above reacts to 1) a blank age field, 2) an empty "" string for age, 3) a bogus "9999" for age, and 4) a bogus "NA" for age. The code above fails catastrophically on this input: comb returns only the successful matches, so the sorting index no longer lines up with the line numbers once any line fails to match. To fix this, you have to write a conditional (e.g. a ternary) that inserts a numeric placeholder value (e.g. zero) whenever the regex fails to match a line.

    Below is a longer but more robust solution. Note that I use a placeholder value of 999 to move lines with blank or invalid ages to the bottom:

    ~$ raku -e 'my @a=lines(); my @b = do for @a {if $_ ~~ m/ \, <(\d**1..3)> \, / -> { +$/ } else { 999 }; }; my $c=@b.pairs.sort(*.values)>>.keys.flat;  @a[$c.flat]>>.put;' sort_age.txt
    name,21,male
    name,24,male
    name,25,female
    name,27,female
    name,,male
    name,"",female
    name,9999,male
    name,NA,male
    

    To sort in reverse, add .reverse to the end of the method chain that creates $c. Change the placeholder value in the else branch to move lines lacking a valid age to the top or to the bottom. Alternatively, @b can be created with the ternary operator: my @b = do for @a {(m/ \, <(\d**1..3)> \, /) ?? +$/ !! 999 };
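
    For readability, the same idea can be unpacked into a short script. This is only a sketch, not a replacement for the one-liners: the helper name age-key and the constant PLACEHOLDER are my own (hypothetical) names, and it keeps the same assumption that the age is the first run of 1 to 3 digits enclosed by commas. Instead of building an index, it hands the key function straight to sort:

    ```raku
    # Assumption: same input file as in this answer.
    constant PLACEHOLDER = 999;   # key for lines with no valid age

    # Hypothetical helper: return the numeric sort key for one CSV line.
    sub age-key(Str $line) {
        # <( and )> restrict the match to the digits, excluding the commas
        $line ~~ / ',' <( \d ** 1..3 )> ',' / ?? +$/ !! PLACEHOLDER
    }

    my @lines  = 'sort_age.txt'.IO.lines;
    my @sorted = @lines.sort(&age-key);   # a one-argument callable is used as the sort key
    .put for @sorted;
    ```

    Lines with a valid age come out first in ascending order, and lines where the regex fails share the placeholder key and follow them. Appending .reverse to the sort call reverses the order, and lowering PLACEHOLDER to 0 would move the invalid lines to the top instead.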

    Here's the unsorted input file for posterity:

    $ cat sort_age.txt
    name,,male
    name,"",female
    name,9999,male
    name,NA,male
    name,25,female
    name,24,male
    name,27,female
    name,21,male
    

    HTH.

    https://raku.org/
