Match index notation of file1 to the index of file2 and pull out matching rows

夕颜 2021-01-29 04:47

file1 contains multiple alphabetic sequences:

AETYUIOOILAKSJ
EAYEURIOPOSIDK
RYXURIAJSKDMAO
URITORIEJAHSJD
YWQIAKSJDHFKCM
HAJSUDIDSJSIAJ
AJDHDPFDIXSIBJ
JAQIAUXCNC

5 Answers
  •  天命终不由人
    2021-01-29 05:11

    awk '(NR==FNR){a[$0]=substr($0,length);next}   # file2: map each entry to its last character
         { for(key in a) if (a[key] == substr($0,key+0,1)) { print; break }   # file1: print on first match
         }' file2 file1
    

    Here, a is an associative array with the following key-value pairs:

    key:   value
    3T     T
    10K    K
    ...    ...
    

    The first block, (NR==FNR){a[$0]=substr($0,length);next}, runs only while file2 is being read. It stores the last character of each entry as the value up front, so it does not have to be extracted again for every line of file1. The index is recovered with a simple arithmetic operation, e.g. "10K"+0 evaluates to 10 in Awk.
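    The two extractions can be tried in isolation. This is a minimal sketch; the variable names entry, idx and ch are made up for the illustration, and "10K" is taken from the key-value table above:

```shell
# Hypothetical entry, as it would appear on one line of file2.
entry="10K"

# Numeric context: Awk converts the leading digits of the string, so "10K"+0 is 10.
idx=$(awk -v e="$entry" 'BEGIN { print e + 0 }')

# substr(e, length(e)) returns the last character of the entry.
ch=$(awk -v e="$entry" 'BEGIN { print substr(e, length(e)) }')

echo "$idx $ch"   # prints: 10 K
```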

    The second block processes file1. For each line, it checks whether the character at the stored index matches for any entry in the associative array, and prints the line on the first match.
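    A small end-to-end run might look like the sketch below. The contents of file2 are an assumption (only the entries 3T and 10K from the table above are used), and the temporary directory is there only to keep the example self-contained:

```shell
# Scratch directory so the example does not touch existing files.
tmpdir=$(mktemp -d)

# Assumed file2: one "index + character" entry per line.
printf '%s\n' 3T 10K > "$tmpdir/file2"

# file1: the sequences from the question.
printf '%s\n' \
  AETYUIOOILAKSJ EAYEURIOPOSIDK RYXURIAJSKDMAO URITORIEJAHSJD \
  YWQIAKSJDHFKCM HAJSUDIDSJSIAJ AJDHDPFDIXSIBJ JAQIAUXCNC > "$tmpdir/file1"

# Same program as in the answer: read file2 first, then filter file1.
result=$(awk '(NR==FNR){a[$0]=substr($0,length);next}
     { for(key in a) if (a[key] == substr($0,key+0,1)) { print; break }
     }' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

    With these assumed entries, the first sequence is printed because its 3rd character is T, and the third sequence because its 10th character is K. The break ensures a line is printed only once even if several entries match it.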
