Perl: matching data in two files

此生再无相见时 提交于 2019-12-25 04:48:08

问题


I would like to match and print data from two files (File1.txt and File2.txt). Currently, I'm trying to match the first letter of the second column in File1 to the first letter of the third column in File2.txt.

File1.txt
1  H  35
1  C  22
1  H  20

File2.txt
A  1 HB2 MET  1 
A  2 CA  MET  1
A  3 HA  MET  1

OUTPUT
1  MET  HB2  35
1  MET  CA   22
1  MET  HA   20 

Here is my script, I've tried following this submission: In Perl, mapping between a reference file and a series of files

#!/usr/bin/perl

use strict;
use warnings;

my %data;

open (SHIFTS,"file1.txt") or die;
open (PDB, "file2.txt") or die;

while (my $line = <PDB>) {
    chomp $line;
    my @fields = split(/\t/,$line);
    $data{$fields[4]} = $fields[2];
 }

 close PDB;

 while (my $line = <SHIFTS>) {
    chomp($line);
    my @columns = split(/\t/,$line);
    my $value = ($columns[1] =~ m/^.*?([A-Za-z])/ );
 }
    print "$columns[0]\t$fields[3]\t$value\t$data{$value}\n";

 close SHIFTS;
 exit;

回答1:


Here's one way using split() hackery:

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my $f1 = 'file1.txt';
my $f2 = 'file2.txt';

my @pdb;

open my $pdb_file, '<', $f2
  or die "Can't open the PDB file $f2: $!";

while (my $line = <$pdb_file>){
    chomp $line;
    push @pdb, $line; 
}

close $pdb_file;

open my $shifts_file, '<', $f1
  or die "Can't open the SHIFTS file $f1: $!";

while (my $line = <$shifts_file>){

    chomp $line;

    my $pdb_line = shift @pdb;

    # - inner split: get the third element from the $pdb_line
    # - outer split: get the first element (character) from the
    #   result of the inner split

    my $criteria = (split('', (split('\s+', $pdb_line))[2]))[0];

    # - compare the 2nd element of the file1.txt line against
    #   the above split() operations

    if ((split('\s+', $line))[1] eq $criteria){
        print "$pdb_line\n";
    }
    else {
        print "**** >$pdb_line< doesn't match >$line<\n";
    }
}

Files:

file1.txt (note I changed line two to ensure a non-match worked):

1  H  35
1  A  22
1  H  20

file2.txt:

A  1 HB2 MET  1 
A  2 CA  MET  1
A  3 HA  MET  1

Output:

./app.pl
A  1 HB2 MET  1 
****>A  2 CA  MET  1< doesn't match >1  A  22<
A  3 HA  MET  1


来源:https://stackoverflow.com/questions/30600286/perl-matching-data-in-two-files

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!