Fast alternative to grep -f

春和景丽 2020-12-11 16:00

file.contain.query.txt

ENST001
ENST002
ENST003

file.to.search.in.txt

ENST001  90
ENST002  80
ENST004  50
8 answers
  •  粉色の甜心
    2020-12-11 16:43

    This may be a little dated, but the problem is tailor-made for simple UNIX utilities. Given:

    • keys are fixed-length (here 7 chars)
    • files are sorted (true in the example), allowing the use of a fast merge (sort -m) instead of a full sort

    Then:

    $ sort -m file.contain.query.txt file.to.search.in.txt | tac | uniq -d -w7
    ENST002  80
    ENST001  90
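
For anyone who wants to try this end to end, here is a minimal sketch that recreates the two example files and checks the "already sorted" precondition before merging. It assumes GNU coreutils (tac and uniq -w are GNU extensions). The scratch directory from mktemp and the variable names workdir and result are just illustrative choices, and LC_ALL=C is set as an assumption so that sort -c and sort -m agree on byte-wise ordering.

```shell
#!/bin/sh
# Byte-wise collation so the sortedness check and the merge use the same order.
export LC_ALL=C

# Scratch directory for the example files (hypothetical location).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' ENST001 ENST002 ENST003 > file.contain.query.txt
printf '%s\n' 'ENST001  90' 'ENST002  80' 'ENST004  50' > file.to.search.in.txt

# Precondition: sort -m only merges, so both inputs must already be sorted.
sort -c file.contain.query.txt && sort -c file.to.search.in.txt || exit 1

# Merge, reverse so each data line comes before its bare key, then keep the
# first line of every group whose first 7 characters occur more than once.
result=$(sort -m file.contain.query.txt file.to.search.in.txt | tac | uniq -d -w7)
printf '%s\n' "$result"
# prints:
# ENST002  80
# ENST001  90
```

If sort -c reports disorder, the merge result cannot be trusted and the files would need a real sort first, which gives up most of the speed advantage.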
    

    Variants:

    To strip the number printed after the key, remove the tac command (the bare key sorts before its data line, so uniq -d then keeps the key alone):

    $ sort -m file.contain.query.txt file.to.search.in.txt | uniq -d -w7
    

    To keep sorted order, add an extra tac command at the end:

    $ sort -m file.contain.query.txt file.to.search.in.txt | tac | uniq -d -w7 | tac
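
The two variants can be compared side by side with the sketch below. It is self-contained and rebuilds the example files in a scratch directory. As before it assumes GNU coreutils and LC_ALL=C, and the variable names keys_only and sorted_hits are made up for illustration.

```shell
#!/bin/sh
export LC_ALL=C   # assumption: byte-wise ordering, matching how the files are sorted

workdir=$(mktemp -d)   # hypothetical scratch location
cd "$workdir" || exit 1

printf '%s\n' ENST001 ENST002 ENST003 > file.contain.query.txt
printf '%s\n' 'ENST001  90' 'ENST002  80' 'ENST004  50' > file.to.search.in.txt

# Variant 1: no tac, so the bare key is the first line of each duplicate group.
keys_only=$(sort -m file.contain.query.txt file.to.search.in.txt | uniq -d -w7)
printf '%s\n' "$keys_only"
# prints:
# ENST001
# ENST002

# Variant 2: a trailing tac restores ascending order of the matched data lines.
sorted_hits=$(sort -m file.contain.query.txt file.to.search.in.txt | tac | uniq -d -w7 | tac)
printf '%s\n' "$sorted_hits"
# prints:
# ENST001  90
# ENST002  80
```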
    
