Find patterns of a file in another file and print out a corresponding field of the latter maintaining the order

自作多情 posted on 2019-12-20 04:16:34

Question


I have been trying to solve this problem for a while and have checked many posts (for example, "Print lines in one file matching patterns in another file" and "awk search for a field in another file") without finding what I am looking for. I need a solution that uses standard shell tools such as sed, grep, or awk (no Python, R, ...).

I have two files (the real ones are much bigger than these samples):

file1:

   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65

file2:

   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680

In this example, for every line of file1 I want to look up the value in column 2 of file1 in column 3 of file2 and print column 1 of the matching file2 line, keeping the order given by file1.

A possible solution (which I am aware is not bug-free) is the following unacceptably slow bash loop:

for i in `awk '{print $2}' file1` ; do grep " $i " file2 | awk '{print $1}' ; done

which prints to screen:

502
279

Please note that a 'simple' solution like:

awk 'NR==FNR{pats[$2]; next} $3 in pats' file1 file2

is not appropriate because the output order follows file2 rather than file1 (i.e. the match for 279 is printed before the match for 502).

Thanks a lot for your help.


Answer 1:


You can reverse the order in which awk reads the files: build the lookup table from file2 first, then read file1 so that the output follows file1's order:

awk 'NR==FNR{pats[$3]=$1; next} $2 in pats{print pats[$2]}' file2 file1
502
279
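
To make the two passes easier to follow, here is a minimal self-contained sketch of the same command run against the sample data from the question. The scratch directory, the array name `id`, and the variable `result` are illustrative choices, not part of the original answer. It assumes file2 is non-empty (otherwise `NR==FNR` would also be true for file1) and that each value in column 3 of file2 is unique; with duplicates, the last matching line of file2 wins.

```shell
# Recreate the sample data from the question in a scratch directory.
tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
   2   891299  0.50923964E-02     1248   4.713       1349.08
   3   245857  0.57915542E-02     1335   4.671       1369.65
EOF

cat > "$tmpdir/file2" <<'EOF'
   278    2645  2334659  0.75142      0.53123
   279    2643   245857  0.80439      0.56868
   500    1341   830677  0.74922      0.52958
   501    1339   882791  0.87685      0.61980
   502    1337   891299  0.63224      0.44680
EOF

# Pass 1 (file2, where NR==FNR): remember column 1 under the key in column 3.
# Pass 2 (file1): for each line, in file1 order, print the remembered value
# if column 2 was seen as a key in file2; lines without a match are skipped.
result=$(awk '
    NR == FNR { id[$3] = $1; next }
    $2 in id  { print id[$2] }
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Because the array lookup is a whole-field comparison, this also avoids the substring matches that `grep " $i "` can produce, and each file is read only once.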


Source: https://stackoverflow.com/questions/33389595/find-patterns-of-a-file-in-another-file-and-print-out-a-corresponding-field-of-t
