file.contain.query.txt
ENST001
ENST002
ENST003
file.to.search.in.txt
ENST001 90
ENST002 80
ENST004 50
If you have fixed strings, use grep -F -f
. This is significantly faster than regex search.
If you are using perl version 5.10 or newer, you can join the 'query' terms into a regular expression with the query terms separated by the 'pipe'. (Like:ENST001|ENST002|ENST003
) Perl builds a 'trie' which, like a hash, does lookups in constant time. It should run as fast as the solution using a lookup hash. Just to show another way to do this.
#!/usr/bin/perl
use strict;
use warnings;
use Inline::Files;
my $query = join "|", map {chomp; $_} <QUERY>;
while (<RAW>) {
print if /^(?:$query)\s/;
}
__QUERY__
ENST001
ENST002
ENST003
__RAW__
ENST001 90
ENST002 80
ENST004 50