Column replacement with awk while retaining the format

Anonymous (unverified), submitted 2019-12-03 02:38:01

Question:

I have a file a.pdb:

ATOM      1  N   ARG     1       0.000   0.000   0.000  1.00  0.00           N
ATOM      2  H1  ARG     1       0.000   0.000   0.000  1.00  0.00           H
ATOM      3  H2  ARG     1       0.000   0.000   0.000  1.00  0.00           H
ATOM      4  H3  ARG     1       0.000   0.000   0.000  1.00  0.00           H

and a file a.xyz:

16.388 -5.760 -23.332
17.226 -5.608 -23.768
15.760 -5.238 -23.831
17.921 -5.926 -26.697

I want to replace the 6th, 7th and 8th columns of a.pdb with the three columns of a.xyz. After the replacement, the tabs, spaces and column layout of a.pdb must be preserved.

I have tried

awk 'NR==FNR {fld1[NR]=$1; fld2[NR]=$2; fld3[NR]=$3; next} {$6=fld1[FNR]; $7=fld2[FNR]; $8=fld3[FNR]}1' a.xyz a.pdb  

But it doesn't keep the format.
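
A note on why the attempt above loses the layout: assigning to any field makes awk rebuild $0 by joining the fields with OFS (a single space by default), so the original runs of blanks are discarded. A minimal sketch of that behaviour, using a made-up sample line rather than the real a.pdb (the variable names are illustrative only):

```shell
line='a   b    c'

# Assigning to $2 forces awk to rebuild $0 with OFS between the fields.
rebuilt=$(printf '%s\n' "$line" | awk '{ $2 = "X" } 1')

# Without a field assignment the line passes through unchanged.
untouched=$(printf '%s\n' "$line" | awk '1')

printf '[%s]\n[%s]\n' "$rebuilt" "$untouched"
```

The first result has single spaces between the fields, while the second keeps the original spacing, which is why the answers below avoid assigning to $6, $7 and $8 directly.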

Answer 1:

This is exactly what the 4th arg for split() in GNU awk was invented to facilitate:

gawk '
NR==FNR { pdb[NR]=$0; next }
{
    split(pdb[FNR],flds,FS,seps)
    flds[6]=$1
    flds[7]=$2
    flds[8]=$3
    for (i=1;i in flds;i++)
        printf "%s%s", flds[i], seps[i]
    print ""
}
' a.pdb a.xyz

ATOM      1  N   ARG     1       16.388   -5.760   -23.332  1.00  0.00           N
ATOM      2  H1  ARG     1       17.226   -5.608   -23.768  1.00  0.00           H
ATOM      3  H2  ARG     1       15.760   -5.238   -23.831  1.00  0.00           H
ATOM      4  H3  ARG     1       17.921   -5.926   -26.697  1.00  0.00           H
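
The four-argument split() is a gawk extension. Where only a POSIX awk is available, roughly the same effect can probably be had by walking the line with match() and copying the blanks between tokens verbatim. The following is only a sketch under that assumption; the names c, rest, out and tok and the temporary two-line sample files are illustrative and not part of the answer above.

```shell
tmp=$(mktemp -d)
cat > "$tmp/a.pdb" <<'EOF'
ATOM      1  N   ARG     1       0.000   0.000   0.000  1.00  0.00           N
ATOM      2  H1  ARG     1       0.000   0.000   0.000  1.00  0.00           H
EOF
cat > "$tmp/a.xyz" <<'EOF'
16.388 -5.760 -23.332
17.226 -5.608 -23.768
EOF

kept=$(awk '
    # First file (a.xyz): remember the three coordinates of each row.
    NR == FNR { for (i = 1; i <= 3; i++) c[FNR, i] = $i; next }
    # Second file (a.pdb): copy the line token by token, blanks included.
    {
        rest = $0; out = ""; n = 0
        while (match(rest, /[^ \t]+/)) {
            n++
            tok = substr(rest, RSTART, RLENGTH)
            if (n >= 6 && n <= 8) tok = c[FNR, n - 5]   # swap in x, y, z
            out = out substr(rest, 1, RSTART - 1) tok
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }
' "$tmp/a.xyz" "$tmp/a.pdb")

printf '%s\n' "$kept"
rm -r "$tmp"
```

As with the gawk version, the separators are preserved exactly, so a replacement value that is wider than the original one still pushes the following columns to the right.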


Answer 2:

Not a general solution, but this might work in this particular case:

awk 'NR==FNR{for(i=6; i<=8; i++) A[FNR,i]=$(i-5); next} {for(i=6; i<=8; i++) sub($i,A[FNR,i])}1' file2 file1 

or

awk '{for(i=6; i<=8; i++) if(NR==FNR) A[FNR,i]=$(i-5); else sub($i,A[FNR,i])} NR>FNR' file2 file1 

There is a bit of a shift, though. We would need to know the field widths to prevent this.

-- Or perhaps with substrings:

awk 'NR==FNR{A[FNR]=$0; next} {print substr($0,1,p) FS A[FNR] substr($0,p+length(A[FNR]))}' p=33 file2 file1 

-- changing it in the OP's original solution:

awk 'NR==FNR {fld1[NR]=$1; fld2[NR]=$2; fld3[NR]=$3; next} {sub($6,fld1[FNR]); sub($7,fld2[FNR]); sub($8,fld3[FNR])}1' file2 file1

with the same restrictions as the first 2 suggestions.

So suggestions 1, 2 and 4 use sub() to replace, which is not a watertight solution: earlier fields might interfere, and sub() treats its first argument as a regex rather than a literal string (so the regex dot merely happens to match the actual dot). With this particular input, though, it might pan out.

Suggestion 3 would probably be a more foolproof method.
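
To illustrate the regex caveat, here is a small sketch with a made-up two-field line (not taken from a.pdb). The second field, 0.000, is used as the pattern; because each dot matches any character, the pattern also matches a run of five zeros inside the first field, and sub() replaces the leftmost match.

```shell
# Field 2 is "0.000". As a regex it also matches "00000" in field 1,
# so the first field is modified and the second is left alone.
clobbered=$(printf '100000 0.000\n' | awk '{ sub($2, "9.999") } 1')

printf '%s\n' "$clobbered"
```

In a real PDB file the same thing could happen whenever an earlier column, such as an atom or residue number, happens to match the coordinate being replaced.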

--edit-- I think this would work with the given input:

awk 'NR==FNR{A[FNR]=$1 "  " $2 " " $3; next} {print substr($0,1,p) A[FNR] substr($0,p+length(A[FNR])+1)}' p=32 file2 file1

but I think something like printf or sprintf formatting would be required to make it foolproof. So, perhaps something like this:

awk 'NR==FNR{A[FNR]=sprintf("%7.3f %7.3f %7.3f", $1, $2, $3); next} {print substr($0,1,p) A[FNR] substr($0,p+length(A[FNR])+1)}' p=31 file2 file1

or not on one line:

awk '
  NR==FNR {
    A[FNR]=sprintf("%7.3f %7.3f %7.3f", $1, $2, $3)
    next
  }
  {
    print substr($0,1,p) A[FNR] substr($0,p+length(A[FNR])+1)
  }
' p=31 file2 file1
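
On the point about needing to know the field widths: in the standard PDB ATOM record the x, y and z coordinates occupy columns 31-38, 39-46 and 47-54, each right-aligned with three decimals. Assuming a.pdb follows that layout (which the sample lines in the question appear to do), the replacement can be pinned to absolute columns. This is a sketch under that assumption, with temporary two-line sample files and illustrative names (xyz, fixed); it is not code from the answer above.

```shell
tmp=$(mktemp -d)
cat > "$tmp/a.pdb" <<'EOF'
ATOM      1  N   ARG     1       0.000   0.000   0.000  1.00  0.00           N
ATOM      2  H1  ARG     1       0.000   0.000   0.000  1.00  0.00           H
EOF
cat > "$tmp/a.xyz" <<'EOF'
16.388 -5.760 -23.332
17.226 -5.608 -23.768
EOF

fixed=$(awk '
    # First file (a.xyz): format x, y, z as three 8-character fields.
    NR == FNR { xyz[FNR] = sprintf("%8.3f%8.3f%8.3f", $1, $2, $3); next }
    # Second file (a.pdb): keep columns 1-30, overwrite 31-54, keep 55 onward.
    { print substr($0, 1, 30) xyz[FNR] substr($0, 55) }
' "$tmp/a.xyz" "$tmp/a.pdb")

printf '%s\n' "$fixed"
rm -r "$tmp"
```

Because exactly 24 characters are replaced by exactly 24 characters, every line keeps its original length and the columns after the coordinates do not move. A coordinate too large for its 8-character field would still overflow, so this only holds for values that fit the format.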


Answer 3:

You can try this one (here test4 is the .pdb file and test5 is the .xyz file):

paste -d' '  test4 test5 |awk '{print $1,$2,$3,$4,$5,$12,$13,$14,$9,$10,$11}' 

