Bash turning single comma-separated column into multi-line string

一世执手 submitted on 2019-12-05 04:17:17

Question


In my input file, columns are tab-separated, and the values inside each column are comma-separated.

I want to print the first column once for each comma-separated value in the second.

Mary,Tom,David   cat,dog
Kevin   bird,rabbit
John    cat,bird
...

For each record in the second column (e.g. cat,dog), I want to split the record into an array [cat, dog] and print each element alongside the first column. For this line alone, the output would be:

Mary,Tom,David   cat
Mary,Tom,David   dog

The output for the whole file should be:

Mary,Tom,David   cat
Mary,Tom,David   dog
Kevin   bird
Kevin   rabbit
John    cat
John    bird
...

Any suggestions for doing this with awk or sed? Thanks.


Answer 1:


With awk

awk '{n=split($2,a,",");for(i=1;i<=n;i++)print $1"\t"a[i]}' file

This splits the second column on commas and then, for each split value, prints the first column and that value.
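
As a rough sketch of the same idea, the version below splits fields strictly on tabs (so a field could contain spaces) and walks the pieces with a numeric index, because awk visits the elements of a `for (i in a)` loop in an unspecified order. The function name `explode_second_column` and the sample row are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads tab-separated rows on stdin and prints one
# output row per comma-separated value found in the second column.
explode_second_column() {
  awk -F '\t' -v OFS='\t' '{
    n = split($2, values, ",")      # n is the number of pieces
    for (i = 1; i <= n; i++)        # numeric loop keeps input order
      print $1, values[i]
  }'
}

# Example run on one row whose second column has three values.
printf 'Mary,Tom,David\tcat,dog,fish\n' | explode_second_column
```

Rows with an empty second column produce no output in this sketch, since split returns 0 for them.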

Also in sed

sed ':1;s/\(\([^\n]*\t\)[^\n]*\),\{1,\}/\1\n\2/;t1' file



Answer 2:


This might work for you (GNU sed):

sed -r 's/^((\S+\s+)[^,]+),/\1\n\2/;P;D' file

The process breaks down into three commands: substitute, print and delete. The substitution replaces the first comma in the second field with a newline followed by a copy of the first field and its trailing whitespace. P then prints up to and including the first newline, and D deletes up to and including that newline and restarts the script on what remains, without reading a new input line. D is the key command: it keeps re-running the script until the pattern space is entirely empty.
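
To make the P/D loop easier to follow, here is a small sketch that wraps the command above in a function and traces one row with three values. It assumes GNU sed, and the wrapper name `split_with_sed` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the GNU sed command from this answer.
# Assumes GNU sed: -r, \S, \s and "\n" in the replacement are GNU extensions.
split_with_sed() {
  sed -r 's/^((\S+\s+)[^,]+),/\1\n\2/;P;D'
}

# Pattern space for the row below, pass by pass:
#   1. "John<TAB>cat,bird,fish": s splits off "cat"; P prints "John<TAB>cat";
#      D removes that line and restarts with "John<TAB>bird,fish"
#   2. s splits off "bird"; P prints "John<TAB>bird"; D leaves "John<TAB>fish"
#   3. s finds no comma; P prints "John<TAB>fish"; D empties the pattern space
printf 'John\tcat,bird,fish\n' | split_with_sed
```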




Answer 3:


process.sh

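#!/bin/bash

# Read each row; read splits it on whitespace into the two columns.
while read -r col_one col_two; do
  # Split the second column on commas into an array.
  IFS=, read -r -a explode <<< "$col_two"
  for val in "${explode[@]}"; do
    printf '%s\t%s\n' "$col_one" "$val"
  done
done < "$1"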

with input.txt as

Mary,Tom,David   cat,dog
Kevin   bird,rabbit
John    cat,bird

output

$ ./process.sh input.txt 
Mary,Tom,David  cat
Mary,Tom,David  dog
Kevin   bird
Kevin   rabbit
John    cat
John    bird
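
If arrays are not available (plain sh), or if fields may contain spaces, the same loop could be sketched with parameter expansion instead. This is only an illustration under those assumptions: the function name `explode_rows` and the sample rows are made up, and it reads from stdin rather than from "$1".

```shell
#!/bin/sh
# Hypothetical array-free variant of process.sh; reads rows from stdin.
explode_rows() {
  tab=$(printf '\t')
  # IFS="$tab" splits only on tabs; -r keeps backslashes literal.
  # The "|| [ -n ... ]" test still handles a final line with no newline.
  while IFS="$tab" read -r col_one col_two || [ -n "$col_one" ]; do
    rest=$col_two
    while [ -n "$rest" ]; do
      val=${rest%%,*}                 # text before the first comma
      printf '%s\t%s\n' "$col_one" "$val"
      case $rest in
        *,*) rest=${rest#*,} ;;       # drop the value just printed
        *)   rest= ;;                 # that was the last value
      esac
    done
  done
}

# Example: the first field contains a space, the last line has no newline.
printf 'Ann Lee\tcat,dog\nKevin\tbird' | explode_rows
```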



Answer 4:


with awk

awk '{n = split($2, aEl, ","); for (Eli = 1; Eli <= n; Eli++) print $1 "\t" aEl[Eli]}' YourFile

with sed

sed 'H;s/.*//;x
:cycle
   s/\(\n\)\([^[:cntrl:]]*[[:blank:]]\{1,\}\)\([^[:cntrl:]]*\),\([^,]*\)/\1\2\3\1\2\4/;t cycle
s/.//' YourFile


Source: https://stackoverflow.com/questions/33408762/bash-turning-single-comma-separated-column-into-multi-line-string
