replace tab in an enclosed string in a tab delimited file linux

时间秒杀一切 提交于 2019-12-11 10:18:31

问题


I have a tab delimited txt file in which third column contains enclosed string that might also has a tab. Because of this extra tab i am getting 5 columns when i try to read this tab delimited file. So i want to replace the tab with space.

Following is the sample file.

col1   col2   col3        col4  
1      abc    "pqr   xyz" asd  
2      asd    "lmn   pqr" aws  
3      abc    "asd"       lmn

I want the output like this

col1   col2   col3        col4  
1      abc    "pqr xyz"   asd  
2      asd    "lmn pqr"   aws  
3      abc    "asd"       lmn

Here is what i have tried

awk -F"\t" '{ gsub("\t","",$3); print $3 }' file.txt

after that i am getting following output

col3  
"pqr  
"lmn  
"asd"

Please help


回答1:


Having GNU awk (gawk) you can use the following expression:

gawk '{gsub("\t"," ",$3)}1' OFS='\t' FPAT='"[^"]*"|[^\t]*' file

The key here is the FPAT variable. It defines how a field can look like instead of just specifying the field delimiter.

In our case a field can either be an sequence of non-double-quote chars enclosed in double quotes "[^"]*" or a sequence of zero or more non tab characters [^\t]*. (zero, to handle empty fields properly)

Since we are specifying the sequence of non quote characters first it has a precedence.



来源:https://stackoverflow.com/questions/31582847/replace-tab-in-an-enclosed-string-in-a-tab-delimited-file-linux

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!