Question
I want to compute the difference between two columns in a CSV file. Say I have this
file:
item1,0.01,0.1
item2,0.02,0.2
item3,0.03,0.3
Expected output file:
item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27
I tried something like this:
awk -F, '{print $2-$3 "," $0}'
and got the difference in the first column, but I was unable to put it in the fourth. The following didn't work and gave me a strange result like ',$0[original line]':
awk -F, '{print $0 "," $2-$3}'
What's happening here, and how do I fix it? I'm using GNU awk under bash.
I also tried the tips from "subtract the values of two columns using awk or bash", e.g.:
awk '{ $4 = $2 - $3 } 1'
but didn't get the expected result either. What does that '1' do at the end of the command?
UPDATE: I think there is something wrong in my real data file:
fa8befbbf03c5539363996a576d5df20,0.725571036339,0.654274122734
fb51f93cc69b6be7375f518092330197,0.941242694855,0.888087145568
fc35b866ed1b3176193ccab251394cf2,0.0169462561607,0.10700264598
fd43d08452687499c00dc62511e5fb8c,0.13467258215,0.197959610293
fe4e8d77fa1770a331b3fca0f712d1a2,0.732236325741,0.302812807639
ff339fd5b4bfc7e916591ecc88286584,0.0581884384155,0.276936129794
ff34734e135192a75838d18e870bec86,0.941790342331,0.680042603973
ff34be2a8759cadcae3ea0fc74d7ef7e,0.111128211021,0.0429052298147
ff910f590b8d19dbc135d69a4bb6dc3e,0.400317430496,0.623828952199
ff9be3a6286f90d0b3ce7b049ac1cb9a,0.0130054950714,0.0511833470525
This seems to break the two solutions proposed below:
$ awk -F, -v OFS="," '{$4=$2-$3}1' file2.csv
,0.07129693c5539363996a576d5df20,0.725571036339,0.654274122734
,0.05315559b6be7375f518092330197,0.941242694855,0.888087145568
,-0.0900564b3176193ccab251394cf2,0.0169462561607,0.10700264598
,-0.063287687499c00dc62511e5fb8c,0.13467258215,0.197959610293
,0.429424a1770a331b3fca0f712d1a2,0.732236325741,0.302812807639
,-0.218748bfc7e916591ecc88286584,0.0581884384155,0.276936129794
,0.26174835192a75838d18e870bec86,0.941790342331,0.680042603973
,0.068223759cadcae3ea0fc74d7ef7e,0.111128211021,0.0429052298147
,-0.2235128d19dbc135d69a4bb6dc3e,0.400317430496,0.623828952199
,-0.0381779f90d0b3ce7b049ac1cb9a,0.0130054950714,0.0511833470525
They both worked well on the example data file.
Thanks!
Answer 1:
$ awk -F, -v OFS="," '{$4=$2-$3}1' file1
# Or awk -F, -v OFS="," '{$(NF+1)=$2-$3}1' file1
item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27
$NF is the last field.
$(NF+1) is one extra field after the last field.
OFS is the output field separator.
-F (or FS) is the input field separator.
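To see why OFS matters here, the short sketch below runs the same assignment with and without it. Assigning to any field makes awk rebuild $0 by joining the fields with OFS, which defaults to a single space. The variable name `sample` is only for illustration.

```shell
# One line of the example data (illustrative variable name).
sample='item1,0.01,0.1'

# Without OFS: assigning a field rebuilds $0 using the default
# output separator, a single space.
printf '%s\n' "$sample" | awk -F, '{ $(NF+1) = $2 - $3 } 1'
# prints: item1 0.01 0.1 -0.09

# With OFS set to a comma, the rebuilt record stays comma-separated.
printf '%s\n' "$sample" | awk -F, -v OFS=',' '{ $(NF+1) = $2 - $3 } 1'
# prints: item1,0.01,0.1,-0.09
```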
What the 1 does:
Every awk rule has the form condition {actions}.
You can omit the condition or the actions, but not both.
If you omit the condition, it is taken to be true, so the {actions} run for every line.
If you omit the {actions}, the default action {print $0} is performed.
So a lone 1 is simply a condition that is always true: it triggers the default action {print $0}, which prints $0 including any modifications made so far.
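A minimal sketch of the three rule forms, using a made-up three-line input held in an illustrative variable `data`:

```shell
# Three-line sample input (illustrative only).
data='a
b
c'

# Condition only: 1 is always true, so the default action (print $0)
# runs for every line.
printf '%s\n' "$data" | awk '1'

# Action only: with no condition, the action runs for every line.
printf '%s\n' "$data" | awk '{ print $0 }'

# A condition that is true only on the second line; the default
# action prints just that line.
printf '%s\n' "$data" | awk 'NR == 2'
# prints: b
```

The first two commands print the input unchanged, which is why `1` is a common shorthand for "print the (possibly modified) record".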
Answer 2:
$ cat items.csv
item1,0.01,0.1
item2,0.02,0.2
item3,0.03,0.3
$ awk -F, -v OFS=',' '{ print $0,$2-$3 }' items.csv
item1,0.01,0.1,-0.09
item2,0.02,0.2,-0.18
item3,0.03,0.3,-0.27
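Neither answer addresses the update in the question. The garbled output shown there, where ",0.0712969" overwrites the first ten characters of the line, is what a terminal displays when each input line ends in a carriage return (Windows-style CRLF line endings): the "\r" stays attached to the last field, and the cursor jumps back to the start of the line before the new column is printed. The source does not confirm that the real file has CRLF endings, so treat this as an assumption. Under that assumption, a sketch of one possible fix is to strip the trailing "\r" inside awk before computing the difference; the temporary file below is created only for the demonstration.

```shell
# Create a small sample with CRLF line endings (temporary file,
# used only for this demonstration).
crlf_file=$(mktemp)
printf 'item1,0.01,0.1\r\nitem2,0.02,0.2\r\n' > "$crlf_file"

# Remove a trailing carriage return from each record, then append
# the difference as a fourth field. Modifying $0 with sub() makes
# awk re-split the fields, so $3 no longer carries the "\r".
awk -F, -v OFS=',' '{ sub(/\r$/, ""); $4 = $2 - $3 } 1' "$crlf_file"
# prints:
# item1,0.01,0.1,-0.09
# item2,0.02,0.2,-0.18
```

An alternative with the same effect is to remove the carriage returns first, for example by piping the file through tr -d '\r' before awk.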
Source: https://stackoverflow.com/questions/44530821/use-awk-to-compute-difference-of-two-columns