Replace first two whitespace occurrences with a comma using sed

Anonymous (unverified), submitted 2019-12-03 08:48:34

Question:

I have a whitespace-delimited file with a variable number of entries on each line. I want to replace the first two runs of whitespace on each line with commas to create a comma-delimited file with three columns.

Here's my input:

a b  1 2 3 3 2 1
c d  44 55 66 2355
line http://google.com 100 200 300
ef jh  77 88 99
z y 2 3 33

And here's my desired output:

a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33

I'm trying to use Perl-style regular expressions in a sed command but I can't quite get it to work. First I tried capturing a word character, followed by whitespace, then another word character, but that only works for lines 1, 2, and 5:

$ cat test | sed -r 's/(\w)\s+(\w)\s+/\1,\2,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh  77 88 99
z,y,2 3 33

I also tried capturing whitespace, a word character, and then more whitespace, but that gives me the same result:

$ cat test | sed -r 's/\s+(\w)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh  77 88 99
z,y,2 3 33

I also tried doing this with the .? wildcard, but that does something funny to line 4:

$ cat test | sed -r 's/\s+(.?)\s+/,\1,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line http://google.com 100 200 300
ef jh,,77 88 99
z,y,2 3 33

Any help is much appreciated!

Answer 1:

How about this:

sed -e 's/\s\+/,/' | sed -e 's/\s\+/,/' 

It's probably possible with a single sed command, but this is sure an easy way :)

My output:

a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33


Answer 2:

Try this:

sed -r 's/\s+(\S+)\s+/,\1,/' 

Just replaced \w (one "word" char) with \S+ (one or more non-space chars) in one of your attempts.
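To see the difference on the two lines that failed in the question, here is a minimal sketch of the same substitution written with POSIX character classes: [^[:space:]]+ stands in for \S+ and [[:space:]]+ for \s+. The function name is made up for illustration, and it assumes a sed that accepts -E for extended regular expressions (GNU sed also accepts -r).

```shell
# Hypothetical helper (name is illustrative): reads stdin, writes stdout.
# [[:space:]]+     = one or more whitespace characters  (like \s+)
# [^[:space:]]+    = one or more non-whitespace chars   (like \S+)
first_two_fields_to_csv() {
    sed -E 's/[[:space:]]+([^[:space:]]+)[[:space:]]+/,\1,/'
}

# The two lines that the single-character \w attempts left unchanged.
printf '%s\n' 'line http://google.com 100 200 300' 'ef jh  77 88 99' |
    first_two_fields_to_csv
# prints:
# line,http://google.com,100 200 300
# ef,jh,77 88 99
```

The capture group now swallows the whole second field, however long it is, so the whitespace on both sides of it can be replaced in one substitution.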



Answer 3:

You can provide multiple commands to a single instance of sed by just providing multiple -e arguments.

To do the first two, just use:

sed -e 's/\s\+/,/' -e 's/\s\+/,/' 

This basically runs both commands on the line in sequence, the first doing the first block of whitespace, the second doing the next.

The following transcript shows this in action:

pax$ echo 'a b  1 2 3 3 2 1
c d  44 55 66 2355
line http://google.com 100 200 300
ef jh  77 88 99
z y 2 3 33
' | sed -e 's/\s\+/,/' -e 's/\s\+/,/'
a,b,1 2 3 3 2 1
c,d,44 55 66 2355
line,http://google.com,100 200 300
ef,jh,77 88 99
z,y,2 3 33
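Both \s and \+ are GNU extensions to basic regular expressions, so some other sed implementations may not accept them. As a rough sketch of the same two-expression idea using only POSIX syntax, [[:space:]][[:space:]]* can stand in for \s\+. The function name below is hypothetical.

```shell
# Hypothetical wrapper (name is illustrative): reads stdin, writes stdout.
# Each -e expression replaces whichever whitespace run is currently first,
# so running the same expression twice handles the first two runs.
commas_for_first_two_gaps() {
    sed -e 's/[[:space:]][[:space:]]*/,/' \
        -e 's/[[:space:]][[:space:]]*/,/'
}

printf '%s\n' 'a b  1 2 3 3 2 1' 'z y 2 3 33' | commas_for_first_two_gaps
# prints:
# a,b,1 2 3 3 2 1
# z,y,2 3 33
```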


Answer 4:

The sed s/// command supports a way to say which occurrence of a pattern to replace: just add a number n to the end of the command to replace only the nth occurrence. So, to replace the first and second occurrences of whitespace, just use it this way:

$ sed 's/  */,/1;s/  */,/2' input
a,b ,1 2 3 3 2 1
c,d ,44 55 66 2355
line,http://google.com 100,200 300
ef,jh ,77 88 99
z,y 2,3 33

EDIT: reading the other proposed solutions, I noticed that the 1 and 2 after s/  */,/ are not only unnecessary but plainly wrong. By default, s/// replaces only the first occurrence of the pattern. So, if we have two identical s/// commands in sequence, the first replaces the first occurrence and the second replaces what was originally the second occurrence. What you need is just

$ sed 's/  */,/;s/  */,/' input  

(Note that you can put two sed commands in one expression if you separate them by a semicolon. Some sed implementations do not accept the semicolon after the s/// command; use a newline to separate the commands, in this case.)
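A small sketch of why the numbered flags go wrong, on a made-up single-space input: the second command counts occurrences in the line as the first command left it, so the gap you wanted is now occurrence 1, not 2. The function names are illustrative only, and -e is used instead of a semicolon to sidestep the portability caveat mentioned above.

```shell
# Hypothetical helpers (names are illustrative): read stdin, write stdout.

# Numbered flags: after the first command the line is "a,b c d", so
# occurrence 2 is the gap between c and d, not the one after b.
with_numbered_flags() {
    sed -e 's/  */,/1' -e 's/  */,/2'
}

# Plain commands: each one replaces whichever gap is currently first.
with_plain_commands() {
    sed -e 's/  */,/' -e 's/  */,/'
}

printf '%s\n' 'a b c d' | with_numbered_flags   # prints: a,b c,d
printf '%s\n' 'a b c d' | with_plain_commands   # prints: a,b,c d
```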



Answer 5:

A Perl solution is:

perl -pe '$_=join ",", split /\s+/, $_, 3' some.file 


Answer 6:

Not sure about sed/perl, but here's an (ugly) awk solution. It just prints fields 1-2, separated by commas, then the remaining fields separated by space:

awk '{
  printf("%s,", $1)
  printf("%s,", $2)
  for (i=3; i<=NF; i++)
    printf("%s ", $i)
  printf("\n")
}' myfile.txt
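The field-by-field approach above rebuilds the rest of the line, so it collapses any repeated whitespace there and leaves a trailing space. As an alternative sketch (not part of the original answer), awk's sub() replaces only the first match in the record, so calling it twice mirrors the two-substitution sed approach and leaves the remainder untouched. The function name is hypothetical, and the pattern assumes the separators are spaces or tabs.

```shell
# Hypothetical helper (name is illustrative). With no file arguments awk
# reads stdin; any arguments are passed through as input files.
first_two_gaps_awk() {
    # sub() with two arguments edits $0 and replaces only the first match,
    # so two calls convert the first two runs of spaces/tabs to commas.
    awk '{ sub(/[ \t]+/, ","); sub(/[ \t]+/, ","); print }' "$@"
}

printf '%s\n' 'ef jh  77 88 99' | first_two_gaps_awk
# prints: ef,jh,77 88 99
```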

