Here is a simple example of my problem:
> df <- data.frame(ID=1:10,Score=4*10:1)
> df
ID Score
1 1 40
2 2 36
3 3 3
This happens because diff works on a vector or matrix, not on a data frame. You can use apply to run the function on each column like so:
apply( df , 2 , diff )
ID Score
2 1 -4
3 1 -4
4 1 -4
5 1 -4
6 1 -4
7 1 -4
8 1 -4
9 1 -4
10 1 -4
It seems unlikely that you want to calculate the difference in sequential IDs, so you could choose to apply it on all columns except the first like so:
apply( df[-1] , 2 , diff )
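Note that apply returns a matrix with one row fewer than df, so the result cannot be assigned straight back as a new column. If you want the differences to sit next to Score, a minimal base R sketch could look like the following. The column name Diff and the choice to pad the first row with NA are my assumptions, not something the question asks for:

```r
df <- data.frame(ID = 1:10, Score = 4 * 10:1)

# diff() returns n - 1 values, so pad the front with NA
# to line the differences up with the original rows
df$Diff <- c(NA, diff(df$Score))

head(df, 3)
```

Padding at the front means each row holds the change from the previous row. Pad at the end instead if you would rather store the change to the next row.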
Or you could use data.table (not that it adds anything here, I just really want to start using it!). I am again assuming that you do not want to apply diff to the ID column:
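library(data.table)
DT <- data.table(df)
# diff() returns one fewer value than its input; pad with NA so Diff has one value per row
DT[ , list(ID,Score,Diff=c(NA,diff(Score))) ]
ID Score Diff
1: 1 40 NA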
2: 2 36 -4
3: 3 32 -4
4: 4 28 -4
5: 5 24 -4
6: 6 20 -4
7: 7 16 -4
8: 8 12 -4
9: 9 8 -4
10: 10 4 -4
Thanks to @AnandaMahto, here is an alternative syntax that gives more flexibility in choosing which columns to run it on:
DT[, lapply(.SD, diff), .SDcols = 1:2]
Here .SDcols = 1:2 means you want to apply the diff function to columns 1 and 2. If you have 20 columns and don't want to apply it to ID, you could use .SDcols = 2:20, for example.
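Positions can shift if columns get reordered, so it may be safer to pick the columns by name. The sketch below assumes the data.table package is installed and that ID is the only column you want to skip. The variable names cols and res are just for illustration:

```r
library(data.table)

DT <- data.table(ID = 1:10, Score = 4 * 10:1)

# Every column except ID, selected by name rather than position
cols <- setdiff(names(DT), "ID")

# Apply diff to each selected column; the result has one row fewer than DT
res <- DT[, lapply(.SD, diff), .SDcols = cols]
res
```

With more columns in DT, the same call would apply diff to all of them except ID without needing to know how many there are.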