I would like help trimming a file by removing the columns that hold the same value in every row.
# the file I have (tab-delimited, millions of columns)
jack 1 5 9
joh
#!/usr/bin/perl
$/="\t";
open(R,"<","/tmp/filename") || die;
while (<R>)
{
next if (($. % 4) == 3);
print;
}
Well, this was assuming it was the third column. If it is by value:
#!/usr/bin/perl
$/="\t";
open(R,"<","/tmp/filename") || die;
while (<R>)
{
next if ($_ == 5);
print;
}
With the question edit, OP's desires become clear. How about:
#!/usr/bin/perl
open(R,"<","/tmp/filename") || die;
my $first = 1;
my (@cols);
while (<R>)
{
my (@this) = split(/\t/);
if ($. == 1)
{
@cols = @this;
}
else
{
for(my $x=0;$x<=$#cols;$x++)
{
if (defined($cols[$x]) && (!defined($this[$x]) || $cols[$x] ne $this[$x]))
{
$cols[$x] = undef;
}
}
}
}
close(R);
my(@del);
print "Deleting columns: ";
for(my $x=0;$x<=$#cols;$x++)
{
if (defined($cols[$x]))
{
print "$x ($cols[$x]), ";
push(@del,$x-int(@del));
}
}
print "\n";
open(R,"<","/tmp/filename") || die;
while (<R>)
{
chomp;
my (@this) = split(/\t/);
foreach my $col (@del)
{
splice(@this,$col,1);
}
print join("\t",@this)."\n";
}
close(R);
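If you want to try the idea on a small in-memory table before pointing it at the real file, here is a minimal sketch of the same two-step approach: first find the columns whose value never changes, then drop them. The sub names constant_columns and drop_columns and the sample rows are made up for illustration (they are not the OP's data), and it assumes every row has the same number of fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 0-based indices of columns whose value is identical in every row.
# Rows are array references; all rows are assumed to have the same width.
sub constant_columns {
    my (@rows) = @_;
    return () unless @rows;
    my @first = @{ $rows[0] };
    my @same  = (1) x @first;    # treat each column as constant until a row disagrees
    for my $row (@rows) {
        for my $i (0 .. $#first) {
            $same[$i] = 0 if $same[$i] && $row->[$i] ne $first[$i];
        }
    }
    return grep { $same[$_] } 0 .. $#first;
}

# Return copies of the rows with the listed column indices removed.
sub drop_columns {
    my ($drop, @rows) = @_;
    my %drop = map { $_ => 1 } @$drop;
    return map {
        my $row  = $_;
        my @keep = grep { !$drop{$_} } 0 .. $#$row;
        [ @$row[@keep] ];
    } @rows;
}

# Hypothetical sample rows, for illustration only.
my @rows = (
    [qw(a 1 5 9)],
    [qw(b 2 5 9)],
    [qw(c 3 5 8)],
);
my @const = constant_columns(@rows);
print join("\t", @$_), "\n" for drop_columns(\@const, @rows);
```

With these sample rows only the third column (index 2) is constant, so the output keeps the other three columns. Unlike the script above, this holds every row in memory, so it is only meant for checking the logic on small inputs.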