Using multiple delimiters in awk

Question

I have a file which contains the following lines:

/logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com
/logs/tc0001/tomcat/tomcat7.2/conf/catalina.properties:app.env.server.name = quest.example.com
/logs/tc0001/tomcat/tomcat7.5/conf/catalina.properties:app.env.server.name = www.example.com

From the lines above I want to extract three fields: number 2, number 4, and the last one (*.example.com). So far I am getting the following output:

cat file | awk -F'/' '{print $3 "\t" $5}'
tc0001   tomcat7.1
tc0001   tomcat7.2
tc0001   tomcat7.5

How do I also extract the last field, the domain name that comes after '='? How do I use multiple delimiters to extract fields?

Answer 1:

The delimiter can be a regular expression.

awk -F'[/=]' '{print $3 "\t" $5 "\t" $8}' file

Produces:

tc0001   tomcat7.1    demo.example.com  
tc0001   tomcat7.2    quest.example.com  
tc0001   tomcat7.5    www.example.com
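
One detail worth noting: with [/=] as the separator, the spaces around = stay inside the neighbouring fields, so $8 is actually " demo.example.com" with a leading space. A minimal sketch of one way to absorb those spaces into the separator itself, assuming the sample lines from the question (the variable names sample and result are only for illustration):

```shell
# Two sample lines from the question, stored in a variable for illustration.
sample='/logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com
/logs/tc0001/tomcat/tomcat7.2/conf/catalina.properties:app.env.server.name = quest.example.com'

# Split on "/" or on "=" together with any spaces around it,
# and join the printed fields with a tab via OFS.
result=$(printf '%s\n' "$sample" | awk -F'/| *= *' -v OFS='\t' '{print $3, $5, $8}')
printf '%s\n' "$result"
```

Because the surrounding spaces are now part of the separator, the domain name should come out without the extra leading space.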

Answer 2:

Good news! awk's field separator can be a regular expression. You just need to use -F"<separator1>|<separator2>|...":

awk -F"/|=" '{print $3, $5, $NF}' file

Returns:

tc0001 tomcat7.1  demo.example.com
tc0001 tomcat7.2  quest.example.com
tc0001 tomcat7.5  www.example.com

Here:

-F"/|=" sets the input field separator to either / or =. The output field separator is left at its default, a single space, which is what print inserts between the comma-separated fields.
{print $3, $5, $NF} prints the 3rd, 5th and last fields based on the input field separator.
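
If tab-separated output is wanted, the output field separator has to be set explicitly. A small sketch, assuming one sample line from the question; setting FS in a BEGIN block is equivalent to passing -F on the command line, and the variable names line and result are only for illustration:

```shell
line='/logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com'

# FS is what -F sets; OFS is what print puts between comma-separated arguments.
result=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "/|="; OFS = "\t" } { print $3, $5, $NF }')
printf '%s\n' "$result"
```

The last field still starts with the space that followed =, since only / and = themselves are treated as separators here.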

See another example:

$ cat file
hello#how_are_you
i#am_very#well_thank#you

This file has two field separators, # and _. If we want to print the second field regardless of which separator precedes it, let's make both of them separators!

$ awk -F"#|_" '{print $2}' file
how
am

Where the fields are numbered as follows:

hello#how_are_you           i#am_very#well_thank#you
^^^^^ ^^^ ^^^ ^^^           ^ ^^ ^^^^ ^^^^ ^^^^^ ^^^
  1    2   3   4            1  2   3    4    5    6
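
To check the numbering yourself, a short loop over NF can print each field next to its index. This is only a sketch using the second sample line; the variable name result is for illustration:

```shell
# Print every field with its index, one per line.
result=$(printf '%s\n' 'i#am_very#well_thank#you' |
  awk -F'#|_' '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$result"
```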

Answer 3:

If your whitespace is consistent, you could use it as a delimiter as well. Also, instead of inserting \t by hand, you could set the output field separator (OFS) and it will be inserted between the fields automatically:

< file awk -v OFS='\t' -v FS='[/ ]' '{print $3, $5, $NF}'

Answer 4:

The separator can also be a bracket expression with a repetition count. For example, to split on a run of at least 2 and at most 6 characters, each of which is a digit 2 through 5, the letter a, a #, or a space (the characters in the run do not have to be identical):

awk -F'[2-5a# ]{2,6}' ...

I am sure variations of this exist using ( ) grouping and other repetition counts.
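
As far as I know, not every awk accepts the {2,6} interval syntax: older mawk builds do not, and gawk before 4.0 needs --re-interval. A hedged fallback sketch that avoids intervals by repeating the bracket expression is shown here; note that it only enforces the minimum of two characters and drops the upper bound of six, so it is an approximation rather than an exact equivalent. The input string and the variable name result are made up for illustration:

```shell
# "##" and "3355" are runs of separator characters; the single "a" in
# "bar" and "baz" is too short to count as a separator.
result=$(printf '%s\n' 'foo##bar3355baz' |
  awk -F'[2-5a# ][2-5a# ]+' '{ print $1, $2, $3 }')
printf '%s\n' "$result"
```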

Answer 5:

Perl one-liner:

perl -F'/[\/=]/' -lane 'print "$F[2]\t$F[4]\t$F[7]"' file

These command-line options are used:

-n loop around every line of the input file, put the line in the $_ variable, do not automatically print every line
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – perl will automatically split input lines into the @F array. Defaults to splitting on whitespace
-F autosplit modifier, in this example splits on either / or =
-e execute the perl code

Perl is closely related to awk. Note, however, that the @F autosplit array starts at index $F[0], while awk fields start at $1.

Answer 6:

I see many good answers are already up on the board, but I would still like to add my piece of code too (here the input file is named sam):

awk -F"/" '{print $3 " " $5 " " $7}' sam | sed 's/ cat.* =//g'
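
The same idea can probably be done without the extra sed process by trimming the field inside awk with sub(). A minimal sketch, assuming one sample line from the question instead of the file sam; the variable names line and result are only for illustration:

```shell
line='/logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com'

# Split on "/" only, then delete everything in $7 up to and including
# the "=" and the spaces after it, leaving just the domain name.
result=$(printf '%s\n' "$line" | awk -F'/' '{ sub(/.*= */, "", $7); print $3, $5, $7 }')
printf '%s\n' "$result"
```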

Source: https://stackoverflow.com/questions/12204192/using-multiple-delimiters-in-awk

Tags

awk

command-line

text-processing