I have a file with a list, and a need to make a file that compares each line to the other. for example, my file has this:
AAA BBB CCC DDD EEE
I w
Take a look at Math::Combinatorics - Perform combinations and permutations on lists
example copying from the CPAN:
use Math::Combinatorics;
my @n = qw(a b c);
my $combinat = Math::Combinatorics->new(count => 2,
data => [@n],
);
print "combinations of 2 from: ".join(" ",@n)."\n";
print "------------------------".("--" x scalar(@n))."\n";
while(my @combo = $combinat->next_combination){
print join(' ', @combo)."\n";
}
print "\n";
print "permutations of 3 from: ".join(" ",@n)."\n";
print "------------------------".("--" x scalar(@n))."\n";
while(my @permu = $combinat->next_permutation){
print join(' ', @permu)."\n";
}
output:
combinations of 2 from: a b c
------------------------------
a b
a c
b c
permutations of 3 from: a b c
------------------------------
a b c
a c b
b a c
b c a
c a b
c b a
How about:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump qw(dump);
my @in = qw(AAA BBB CCC DDD EEE);
my @list;
while(my $first = shift @in) {
last unless @in;
my $rest = join',',@in;
push @list, glob("{$first}{$rest}");
}
dump @list;
output:
(
"AAABBB",
"AAACCC",
"AAADDD",
"AAAEEE",
"BBBCCC",
"BBBDDD",
"BBBEEE",
"CCCDDD",
"CCCEEE",
"DDDEEE",
)
I benchmarked the following Perl modules:
Benchmark consisted of doing what the OP asked, combinations of 2 items, but ramping the set of words up to 10,000 instead of just the original 5 requested (AAA BBB CCC DDD EEE).
#!/usr/bin/env perl
use strict; use warnings;
use Math::Combinatorics;
my $strings = [qw(AAA BBB CCC DDD EEE) x 2000];
my $iter = new Math::Combinatorics (count => 2, data => $strings);
while (my @c = $iter->next_combination) {
print "@c\n";
}
This produced ~53,479 combinations per-second.
#!/usr/bin/env perl
use strict; use warnings;
use Algorithm::Combinatorics qw(combinations);
my $strings = [qw(AAA BBB CCC DDD EEE) x 2000];
my $iter = combinations($strings, 2);
while (my $c = $iter->next) {
print "@$c\n";
}
This produced ~861,982 combinations per-second.
#!/usr/bin/env perl
use strict; use warnings;
use Cmb;
my $strings = [qw(AAA BBB CCC DDD EEE) x 2000];
my $cmb = new Cmb { size_min => 2, size_max => 2 };
$cmb->cmb_callback($#$strings + 1, $strings, sub {
print "@_\n";
return 0;
});
This produced ~2,940,882 combinations per-second.
But if you just need to print the combinations, Cmb can actually do that even faster than the above.
#!/usr/bin/env perl
use strict; use warnings;
use Cmb;
my $strings = [qw(AAA BBB CCC DDD EEE) x 2000];
my $cmb = new Cmb { size_min => 2, size_max => 2 };
$cmb->cmb($#$strings + 1, $strings);
This produced ~3,333,000 combinations per-second.
Benchmarks were performed using dpv on CentOS Linux release 7.7.1908 (Core) under kernel 3.10.0-1062.1.1.el7.x86_64 x86_64 using Perl 5.16.3 on an Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Here's a hack using glob:
my @list = qw(AAA BBB CCC DDD EEE);
for my $i (0..$#list-1) {
print join "\n", glob sprintf "{'$list[$i] '}{%s}",
join ",", @list[$i+1..$#list];
print "\n";
}
The output:
AAA BBB
AAA CCC
AAA DDD
AAA EEE
BBB CCC
BBB DDD
BBB EEE
CCC DDD
CCC EEE
DDD EEE
P.S. you may want to use Text::Glob::Expand or String::Glob::Permute modules instead of plain glob()
to avoid the caveat of matching files in the current working directory.
It is straightforward to write this using recursion.
This code example demonstrates.
use strict;
use warnings;
my $strings = [qw(AAA BBB CCC DDD EEE)];
sub combine;
print "@$_\n" for combine $strings, 5;
sub combine {
my ($list, $n) = @_;
die "Insufficient list members" if $n > @$list;
return map [$_], @$list if $n <= 1;
my @comb;
for my $i (0 .. $#$list) {
my @rest = @$list;
my $val = splice @rest, $i, 1;
push @comb, [$val, @$_] for combine \@rest, $n-1;
}
return @comb;
}
Edit
My apologies - I was generating permutations instead of combinations.
This code is correct.
use strict;
use warnings;
my $strings = [qw(AAA BBB CCC DDD EEE)];
sub combine;
print "@$_\n" for combine $strings, 2;
sub combine {
my ($list, $n) = @_;
die "Insufficient list members" if $n > @$list;
return map [$_], @$list if $n <= 1;
my @comb;
for (my $i = 0; $i+$n <= @$list; ++$i) {
my $val = $list->[$i];
my @rest = @$list[$i+1..$#$list];
push @comb, [$val, @$_] for combine \@rest, $n-1;
}
return @comb;
}
output
AAA BBB
AAA CCC
AAA DDD
AAA EEE
BBB CCC
BBB DDD
BBB EEE
CCC DDD
CCC EEE
DDD EEE