Question
I have a file containing a big table, something like this:
Header1 Header2 Header3 ... Header8031
Value1 Value2 Value3 .... Value8031
.
.
Value1 Value2 Value3 ... Value8031
In another file I have a list of some of the headers from that table:
Header1
Header3000
Header5
Header200
Header10
I want to extract from the table only the columns whose headers appear in the list, that is, match the list entries against the column headers of the table.
Desired output:
Header1 Header3000 Header5 Header200 Header10
Value1 Value3000 Value5 Value200 Value10
Value1 Value3000 Value5 Value200 Value10
I tried some awk examples (see "AWK extract columns from file based on header selected from 2nd file"), but I could not get the desired output.
Answer 1:
This awk command should work for you:
awk 'NR==FNR{a[$0]=7;next}FNR==1{for(i=1;i<=NF;i++)if(a[$i])c[++x]=i}
{for(i=1;i<=x;i++)printf "%s%s", $(c[i]), (i==x?RS:FS)}' headerFile dataFile
Test with an example:
kent$ head col f
==> col <==
Header1
Header3
Header5
==> f <==
Header1 Header2 Header3 Header4 Header5 Header10
Value1 Value2 Value3 Value4 VAlue5 Value10
Value1 Value2 Value3 Value4 Value5 Value10
kent$ awk 'NR==FNR{a[$0]=7;next}FNR==1{for(i=1;i<=NF;i++)if(a[$i])c[++x]=i}
{for(i=1;i<=x;i++)printf "%s%s", $(c[i]), (i==x?RS:FS)}' col f
Header1 Header3 Header5
Value1 Value3 VAlue5
Value1 Value3 Value5
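Note that the command above prints the selected columns in the order in which they appear in the table, whereas the desired output in the question follows the order of the list. If list order matters, a minimal sketch along the following lines may help. The function name extract_in_list_order, the temporary file names, and the sample data are made up for illustration, and silently skipping headers that do not occur in the table is an assumption about the wanted behaviour.

```shell
#!/bin/sh
# Illustrative sample files in a temporary directory (names are hypothetical).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'Header1 Header2 Header3 Header4 Header5' \
              'Value1 Value2 Value3 Value4 Value5' > "$tmp/data.txt"
printf '%s\n' 'Header5' 'Header1' 'Header3' > "$tmp/headers.txt"

# $1 = header list file, $2 = table file
extract_in_list_order() {
  awk '
    NR == FNR { want[++n] = $1; next }        # list file: remember headers in list order
    FNR == 1  {                               # header row of the table
      for (i = 1; i <= NF; i++) pos[$i] = i   # map header name to column number
      for (j = 1; j <= n; j++)                # walk the list in its own order
        if (want[j] in pos) col[++m] = pos[want[j]]
    }
    {                                         # every row, header row included
      for (j = 1; j <= m; j++)
        printf "%s%s", $(col[j]), (j == m ? ORS : OFS)
    }
  ' "$1" "$2"
}

extract_in_list_order "$tmp/headers.txt" "$tmp/data.txt"
```

With these sample files the output is "Header5 Header1 Header3" followed by "Value5 Value1 Value3", i.e. the columns come out in list order rather than table order.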
Answer 2:
I would use a small awk script like this, saved as a.awk:
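# list file (read first): remember the wanted header names
FNR==NR {a[$1]; next}
# header row of the table: mark the column numbers to keep
FNR==1 { for (i=1;i<=NF;i++) if ($i in a) b[i] }
# every row of the table, header included: print the marked columns
{ for (i=1; i<=NF; i++) if (i in b) printf "%s%s", $i, FS
  print ""
}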
Explanation
- First read the header list (f2 in the test, given first on the command line) and store the names of the wanted columns.
- Then read the table (f1).
- On its first line, store the column numbers of the columns we want to print.
- For that line and every following one, print only those columns.
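The steps above can also be sketched as a self-contained shell run. This is only an illustrative variant, not the script from this answer: it stores the kept column numbers in an ordered array so that no separator is printed after the last field (the script above emits FS after every printed field, including the last). The function name select_columns and the temporary file names are assumptions.

```shell
#!/bin/sh
# Illustrative sample files in a temporary directory (names are hypothetical).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'Header1 Header2 Header3 Header8031' \
              'Value1 Value2 Value3 Value8031' > "$tmp/table.txt"
printf '%s\n' 'Header1' 'Header3000' 'Header3' > "$tmp/list.txt"

# $1 = header list file, $2 = table file
select_columns() {
  awk '
    FNR == NR { wanted[$1]; next }          # list file: store the wanted names
    FNR == 1  {                             # header row of the table
      for (i = 1; i <= NF; i++)
        if ($i in wanted) keep[++n] = i     # column numbers, in table order
    }
    {                                       # every row, header row included
      line = ""
      for (k = 1; k <= n; k++)
        line = line (k > 1 ? OFS : "") $(keep[k])
      print line
    }
  ' "$1" "$2"
}

select_columns "$tmp/list.txt" "$tmp/table.txt"
```

Here Header3000 is in the list but not in the table, so it is simply ignored; the run prints "Header1 Header3" and then "Value1 Value3", with no trailing space.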
Test
$ cat f1
Header1 Header2 Header3 Header8031
Value1 Value2 Value3 Value8031
Value1 Value2 Value3 Value8031
$ cat f2
Header1
Header3000
Header5
Header200
Header10
Header3
Result:
$ awk -f a.awk f2 f1
Header1 Header3
Value1 Value3
Value1 Value3
Source: https://stackoverflow.com/questions/26605938/extracting-columns-from-a-file