R: How can I calculate the difference between rows in a data frame?

挽巷 2020-12-03 01:13

Here is a simple example of my problem:

> df <- data.frame(ID=1:10,Score=4*10:1)
> df
       ID Score
    1   1    40
    2   2    36
    3   3    32


        
6 Answers
  •  一生所求
    2020-12-03 01:42

    That is because diff works on a vector or a matrix. You can use apply to run the function over each column, like so:

     apply( df , 2 , diff )
       ID Score
    2   1    -4
    3   1    -4
    4   1    -4
    5   1    -4
    6   1    -4
    7   1    -4
    8   1    -4
    9   1    -4
    10  1    -4
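
    To make the vector and matrix behaviour concrete, here is a minimal sketch using the same data frame as the question. The names score_diff and m_diff are only illustrative and do not come from the original answer.

    ```r
    # Same data frame as in the question
    df <- data.frame(ID = 1:10, Score = 4 * 10:1)

    # On a vector, diff() returns x[i + 1] - x[i],
    # so the result is one element shorter than the input
    score_diff <- diff(df$Score)
    print(score_diff)
    # [1] -4 -4 -4 -4 -4 -4 -4 -4 -4

    # On a matrix, diff() works down each column
    m_diff <- diff(as.matrix(df))
    print(dim(m_diff))
    # [1] 9 2
    ```

    Ten rows give nine differences, which matters as soon as you try to put the result back next to the original rows.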
    

    It seems unlikely that you want to calculate the difference in sequential IDs, so you could choose to apply it on all columns except the first like so:

    apply( df[-1] , 2 , diff )
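
    If the goal is to keep the differences next to the original rows, one possible base R approach (a sketch, not something the question asked for explicitly) is to pad the result with NA and store it as a new column. The column name Diff is just an illustrative choice.

    ```r
    df <- data.frame(ID = 1:10, Score = 4 * 10:1)

    # diff() gives nrow(df) - 1 values, so pad the first row with NA
    # (the first row has no previous row to compare against)
    df$Diff <- c(NA, diff(df$Score))
    head(df, 3)
    #   ID Score Diff
    # 1  1    40   NA
    # 2  2    36   -4
    # 3  3    32   -4
    ```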
    

    Or you could use data.table (not that it adds anything here; I just really want to start using it!). Again, I am assuming that you do not want to apply diff to the ID column:

    library(data.table)
    DT <- data.table(df)
    # diff() returns one value fewer than there are rows, so pad with NA
    DT[ , list(ID,Score,Diff=c(NA,diff(Score)))  ]
        ID Score Diff
     1:  1    40   NA
     2:  2    36   -4
     3:  3    32   -4
     4:  4    28   -4
     5:  5    24   -4
     6:  6    20   -4
     7:  7    16   -4
     8:  8    12   -4
     9:  9     8   -4
    10: 10     4   -4
    

    And thanks to @AnandaMahto, an alternative syntax that gives more flexibility in choosing which columns to run it on could be:

    DT[, lapply(.SD, diff), .SDcols = 1:2]
    

    Here .SDcols = 1:2 means you want to apply the diff function to columns 1 and 2. If you have 20 columns and didn't want to apply it to ID you could use .SDcols=2:20 as an example.
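
    .SDcols also accepts column names, which can be easier to read than positions when the column order may change. The sketch below assumes the data.table package is installed; cols and res are illustrative names.

    ```r
    library(data.table)

    DT <- data.table(ID = 1:10, Score = 4 * 10:1)

    # Select columns by name instead of position: everything except ID
    cols <- setdiff(names(DT), "ID")

    # Apply diff() to each selected column; the result has nrow(DT) - 1 rows
    res <- DT[, lapply(.SD, diff), .SDcols = cols]
    print(res)
    ```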
