I\'m sure I\'ve done this in the past and there is something small I\'m forgetting, but how can I sort a CSV file on a certain column? I\'m interested in answers with and wi
using Raku (née Perl6)
This is a fairly quick-and-dirty solution, mainly intended for "hand-rolled" CSV. The code works as long as there's only one (1) age-per-row: Read lines $a, comb for 1-to-3 surrounded by commas and assign to @b, derive sorting index $c, use $c to reorder lines $a:
~$ raku -e 'my $a=lines(); my @b=$a.comb(/ \, <(\d**1..3)> \, /).pairs; my $c=@b.sort(*.values)>>.keys.flat; $a[$c.flat]>>.put;' sort_age.txt
name,21,male
name,24,male
name,25,female
name,27,female
I prepended a few dummy lines to the OP's input file see how the code above reacts with 1). a blank age field, 2). a blank "" string for age, 3). a bogus "9999" for age, and 4). a bogus "NA" for age. The code above fails catastrophically. To fix this you have to write a ternary that inserts a numeric placeholder value (e.g. zero) whenever the regex fails to match a line.
Below is a longer but more robust solution. Note--I use a placeholder value of 999 to move lines with blank/invalid ages to the bottom:
~$ raku -e 'my @a=lines(); my @b = do for @a {if $_ ~~ m/ \, <(\d**1..3)> \, / -> { +$/ } else { 999 }; }; my $c=@b.pairs.sort(*.values)>>.keys.flat; @a[$c.flat]>>.put;' sort_age.txt
name,21,male
name,24,male
name,25,female
name,27,female
name,,male
name,"",female
name,9999,male
name,NA,male
To sort in reverse, add .reverse to the end of the method chain that creates $c. Again, change the else placeholder argument to move lines absent a valid age to the top or to the bottom. Also, creation of @b above can be written using the ternary operator: my @b = do for @a {(m/ \, <(\d**1..3)> \, /) ?? +$/ !! 999 };, as an alternative.
Here's the unsorted input file for posterity:
$ cat sort_age.txt
name,,male
name,"",female
name,9999,male
name,NA,male
name,25,female
name,24,male
name,27,female
name,21,male
HTH.
https://raku.org/