问题
I want to extract the numbers following client_id and id and pair up client_id and id in each line.
For example, for the following lines of log,
User(client_id:03)) results:[RelatedUser(id:204, weight:10),_RelatedUser(id:491,_weight:10),_RelatedUser(id:29, weight: 20)
User(client_id:04)) results:[RelatedUser(id:209, weight:10),_RelatedUser(id:301,_weight:10)
User(client_id:05)) results:[RelatedUser(id:20, weight: 10)
I want to output
03 204
03 491
03 29
04 209
04 301
05 20
I know I need to use sed or awk. But I do not know exactly how.
Thanks
回答1:
Here's a awk script that works (I put it on multiple lines and made it a bit more verbose so you can see what's going on):
#!/bin/bash
awk 'BEGIN{FS="[\(\):,]"}
/client_id/ {
cid="no_client_id"
for (i=1; i<NF; i++) {
if ($i == "client_id") {
cid = $(i+1)
} else if ($i == "id") {
id = $(i+1);
print cid OFS id;
}
}
}' input_file_name
Output:
03 204
03 491
03 29
04 209
04 301
05 20
Explanation:
awk 'BEGIN{FS="[\(\):,]"}: invokeawk, use():and,as delimiters to separate your fields/client_id/ {: Only do the following for the lines that containclient_id:for (i=1; i<NF; i++) {: iterate through the fields on each line one field at a timeif ($i == "client_id") { cid = $(i+1) }: if the field we are currently on isclient_id, then its value is the next field in order.else if ($i == "id") { id = $(i+1); print cid OFS id;}: otherwise if the field we are currently on isid, then print theclient_id : idpair ontostdoutinput_file_name: supply the name of your input file as first argument to theawkscript.
回答2:
This may work for you:
awk -F "[):,]" '{ for (i=2; i<=NF; i++) if ($i ~ /id/) print $2, $(i+1) }' file
Results:
03 204
03 491
03 29
04 209
04 301
05 20
回答3:
This might work for you (GNU sed):
sed -r '/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!d;s//\2 \3\n\1/;P;D' file
/.*(\(client_id:([0-9]+))[^(]*\(id:([0-9]+)/!dif the line doesn't have the intended strings delete it.s//\2 \3\n\1/re-arrange the line by copying theclient_idand moving the firstidahead thus reducing the line for successive iterations.Pprint upto the introduced newline.Ddelete upto the introduced newline.
回答4:
I would prefer awk for this, but if you were wondering how to do this with sed, here's one way that works with GNU sed.
parse.sed
/client_id/ {
:a
s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/
ta
s/^[^\n]+\n//
}
Run it like this:
sed -rf parse.sed infile
Or as a one-liner:
<infile sed '/client_id/ { :a; s/(client_id:([0-9]+))[^(]+\(id:([0-9]+)([^\n]+)(.*)/\1 \4\5\n\2 \3/; ta; s/^[^\n]+\n//; }'
Output:
03 204
03 491
03 29
04 209
04 301
05 20
Explanation:
The idea is to repeatedly match client_id:([0-9]+) and id:([0-9]+) pairs and put them at the end of pattern space. On each pass the id:([0-9]+) is removed.
The final replace removes left-overs from the loop.
来源:https://stackoverflow.com/questions/13464932/extract-a-string-after-a-pattern