Let's say I have an array, @theArr, which holds 1,000 or so elements such as the following:
01 '12 16 sj.1012804p1012831.93.gz'
02 '12 16 sj.1012832p10
You can use a regex to pull the number out of every line inside the block you pass to the sort function:
@newArray = sort {
    # Capture the digits between "sj." and "p" from each element
    my ($anum) = $a =~ /sj\.(\d+)p/;
    my ($bnum) = $b =~ /sj\.(\d+)p/;
    $anum <=> $bnum;
} @theArr;
However, Chas. Owens's solution is better, since it only does the regex matches once for every element.
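One way to run each match only once is to cache the extracted numbers in a hash before sorting, so the comparison block does nothing but lookups. The following is a minimal sketch, assuming every element contains `sj.<digits>p`; the `%key` hash and the three-element sample are illustrative and not taken from the original posts.

```perl
use strict;
use warnings;

# Hypothetical sample data in the same format as the question.
my @theArr = (
    '12 16 sj.1012804p1012831.93.gz',
    '12 16 sj.875352p875407.01.gz',
    '12 16 sj.875408p875435.02.gz',
);

# Extract the number after "sj." once per element and cache it.
my %key;
for my $line (@theArr) {
    my ($num) = $line =~ /sj\.(\d+)p/
        or die "Unexpected format: $line";
    $key{$line} = $num;
}

# The comparison block now only does hash lookups.
my @sorted = sort { $key{$a} <=> $key{$b} } @theArr;

print "$_\n" for @sorted;
```

Identical lines share one hash entry, which is harmless here because they also share the same sort key.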
Here's an example that sorts them ascending, assuming you don't care too much about efficiency:
use strict;
my @theArr = split(/\n/, <<END_SAMPLE);
12 16 sj.1012804p1012831.93.gz
12 16 sj.1012832p1012859.94.gz
12 16 sj.1012860p1012887.95.gz
12 16 sj.1012888p1012915.96.gz
12 16 sj.1012916p1012943.97.gz
12 16 sj.875352p875407.01.gz
12 16 sj.875408p875435.02.gz
12 16 sj.875436p875535.03.gz
12 16 sj.875536p875575.04.gz
12 16 sj.875576p875603.05.gz
END_SAMPLE
my @sortedArr = sort compareBySJ @theArr;
print "Before:\n".join("\n", @theArr)."\n";
print "After:\n".join("\n", @sortedArr)."\n";
sub compareBySJ {
    # Capture the values to compare, against the expected format
    # NOTE: This could be inefficient for large, unsorted arrays
    # since you'll be matching the same strings repeatedly
    my ($aVal) = $a =~ /^\d+\s+\d+\s+sj\.(\d+)p/
        or die "Couldn't match against value $a";
    my ($bVal) = $b =~ /^\d+\s+\d+\s+sj\.(\d+)p/
        or die "Couldn't match against value $b";
    # Return the numerical comparison of the values (ascending order)
    return $aVal <=> $bVal;
}
Outputs:
Before:
12 16 sj.1012804p1012831.93.gz
12 16 sj.1012832p1012859.94.gz
12 16 sj.1012860p1012887.95.gz
12 16 sj.1012888p1012915.96.gz
12 16 sj.1012916p1012943.97.gz
12 16 sj.875352p875407.01.gz
12 16 sj.875408p875435.02.gz
12 16 sj.875436p875535.03.gz
12 16 sj.875536p875575.04.gz
12 16 sj.875576p875603.05.gz
After:
12 16 sj.875352p875407.01.gz
12 16 sj.875408p875435.02.gz
12 16 sj.875436p875535.03.gz
12 16 sj.875536p875575.04.gz
12 16 sj.875576p875603.05.gz
12 16 sj.1012804p1012831.93.gz
12 16 sj.1012832p1012859.94.gz
12 16 sj.1012860p1012887.95.gz
12 16 sj.1012888p1012915.96.gz
12 16 sj.1012916p1012943.97.gz
Yes. The sort function takes an optional comparison function which will be used to compare two elements. It can take the form of either a block of code, or the name of a function to call.
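The two forms can be sketched side by side. This is only an illustration: the `@nums` data and the `numerically` subroutine name are made up for the example.

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

# Form 1: inline block; sort sets $a and $b for each comparison.
my @by_block = sort { $a <=> $b } @nums;

# Form 2: the name of a subroutine that compares $a and $b.
sub numerically { return $a <=> $b }
my @by_name = sort numerically @nums;

# With no comparison given, sort compares the elements as strings.
my @default = sort @nums;

print "@by_block\n";   # 1 9 10 100
print "@by_name\n";    # 1 9 10 100
print "@default\n";    # 1 10 100 9
```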
The linked documentation (perldoc -f sort) has an example similar to what you want to do:
# inefficiently sort by descending numeric compare using
# the first integer after the first = sign, or the
# whole record case-insensitively otherwise
@new = sort {
    ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
        ||
    uc($a) cmp uc($b)
} @old;
Looks like you need a Schwartzian Transform:
#!/usr/bin/perl
use strict;
use warnings;
my @a = <DATA>;
print
    map  { $_->[1] }                 # get the original value back
    sort { $a->[0] <=> $b->[0] }     # sort arrayrefs numerically on the sort value
    map  { /sj\.(.*?)p/; [$1, $_] }  # build arrayref of the sort value and original
    @a;
__DATA__
12 16 sj.1012804p1012831.93.gz
12 16 sj.1012832p1012859.94.gz
12 16 sj.1012860p1012887.95.gz
12 16 sj.1012888p1012915.96.gz
12 16 sj.1012916p1012943.97.gz
12 16 sj.875352p875407.01.gz
12 16 sj.875408p875435.02.gz
12 16 sj.875436p875535.03.gz
12 16 sj.875536p875575.04.gz
12 16 sj.875576p875603.05.gz
12 16 sj.875604p875631.06.gz
12 16 sj.875632p875659.07.gz
12 16 sj.875660p875687.08.gz
12 16 sj.875688p875715.09.gz
12 16 sj.875716p875743.10.gz
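The transform above does not check whether the match succeeded. If the input might contain lines in another format, one possible variation is to return an empty list from the first map for those lines, which drops them before the sort. This is a sketch under that assumption; the sample array and the 'not a matching line' element are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input with one line that does not fit the expected format.
my @theArr = (
    '12 16 sj.1012804p1012831.93.gz',
    'not a matching line',
    '12 16 sj.875408p875435.02.gz',
    '12 16 sj.875352p875407.01.gz',
);

my @sorted =
    map  { $_->[1] }                          # get the original value back
    sort { $a->[0] <=> $b->[0] }              # numeric sort on the extracted key
    map  { /sj\.(\d+)p/ ? [$1, $_] : () }     # pair key with line, or skip the line
    @theArr;

print "$_\n" for @sorted;
```

Whether to skip, die, or sort such lines last depends on the data; skipping is just the shortest option to show.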