I would like to ask for some hints on how to merge rows that share the same ID into a single row with comma-separated values. Any hints in Perl, sed, or awk are greatly appreciated.
Using a Perl hash of arrays...
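#!/usr/bin/perl
use warnings;
use strict;

my %data;    # protein_id => [ go_id, go_id, ... ]
my $header;

while (<DATA>) {
    chomp;
    if ($. == 1) {    # the first line is the column header
        $header = $_;
        next;
    }
    my ($id, $go) = split;    # split the line on whitespace
    push @{ $data{$id} }, $go;
}

print "$header\n";
for my $k (sort { $a <=> $b } keys %data) {    # numeric sort by ID
    print "$k\t";
    print join(', ', @{ $data{$k} });
    print "\n";
}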
__DATA__
protein_id go_id
4102 GO:0003676
4125 GO:0003676
4125 GO:0008270
4139 GO:0008270
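
Since the question also mentions awk, here is a rough equivalent as a sketch rather than a drop-in replacement. It assumes whitespace-separated input with a single header line. The variable names `data` and `merged` are only illustrative, and unlike the Perl version it keeps IDs in the order they first appear instead of sorting them.

```shell
# Sample rows, copied from the __DATA__ section above.
data='protein_id go_id
4102 GO:0003676
4125 GO:0003676
4125 GO:0008270
4139 GO:0008270'

# Collect the go_ids for each protein_id, keeping IDs in first-seen order.
merged=$(printf '%s\n' "$data" | awk '
    NR == 1     { print; next }                            # pass the header through unchanged
    !($1 in go) { order[++n] = $1; go[$1] = $2; next }     # first occurrence of this ID
                { go[$1] = go[$1] ", " $2 }                # append to an ID already seen
    END         { for (i = 1; i <= n; i++) printf "%s\t%s\n", order[i], go[order[i]] }
')
printf '%s\n' "$merged"
```

To run it on a real file, drop the `printf ... |` part and pass the file name as the last argument to awk. If the IDs must come out sorted, the data rows can be piped through `sort -n` afterwards, with the header handled separately.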