Multiple pairwise differences based on column name patterns

悲哀的现实 2021-01-15 02:31

I have a data.table, dt:

dt

Id  v1 v2 v3 x1 x2 x3
1   7  1  3  5  6  8
2   1  3  5  6  8  5
3   3  5  6  8  5  1

v1, v2, v3 an

3 Answers
  •  [愿得一人]
    2021-01-15 02:41

    Your data looks like it belongs in a long format, for which the calculation you're after would become trivial:

    # reshape to long format (DT here is the question's data.table, dt)
    DT_long = melt(DT, id.vars='Id', measure.vars = patterns(v = '^v', x = '^x'))
    DT_long
    #       Id variable     v     x
    # 1:     1        1     7     5
    # 2:     2        1     1     6
    # 3:     3        1     3     8
    # 4:     1        2     1     6
    # 5:     2        2     3     8
    # 6:     3        2     5     5
    # 7:     1        3     3     8
    # 8:     2        3     5     5
    # 9:     3        3     6     1
    

    Now the pairwise difference is a single column operation:

    DT_long[ , diff := v - x][]
    #       Id variable     v     x  diff
    # 1:     1        1     7     5     2
    # 2:     2        1     1     6    -5
    # 3:     3        1     3     8    -5
    # 4:     1        2     1     6    -5
    # 5:     2        2     3     8    -5
    # 6:     3        2     5     5     0
    # 7:     1        3     3     8    -5
    # 8:     2        3     5     5     0
    # 9:     3        3     6     1     5
    

    You can then use dcast to reshape back to wide, but it's usually worth considering whether keeping the dataset in this long form is better for the whole analysis.
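
    For completeness, here is a minimal sketch of that round trip. It rebuilds the example table from the question, so the literal values are copied from there, and the `diff1`..`diff3` column names are just an illustrative choice. It also passes `value.name` to `melt` rather than naming the patterns, which should give the same `v` and `x` columns:

    ```r
    library(data.table)

    # Rebuild the question's table (called DT to match the code above)
    DT <- data.table(
      Id = 1:3,
      v1 = c(7, 1, 3), v2 = c(1, 3, 5), v3 = c(3, 5, 6),
      x1 = c(5, 6, 8), x2 = c(6, 8, 5), x3 = c(8, 5, 1)
    )

    # Wide -> long: one row per (Id, pair index), with columns v and x
    DT_long <- melt(DT, id.vars = "Id",
                    measure.vars = patterns("^v", "^x"),
                    value.name = c("v", "x"))
    DT_long[, diff := v - x]

    # Long -> wide: one column of differences per pair index
    DT_wide <- dcast(DT_long, Id ~ variable, value.var = "diff")

    # dcast names the new columns after the levels of `variable` ("1", "2", "3");
    # the diff1..diff3 names are an assumption made for readability
    setnames(DT_wide, c("1", "2", "3"), paste0("diff", 1:3))
    DT_wide
    ```

    If you need the original columns next to the differences, something like `merge(DT, DT_wide, by = 'Id')` should bring them back together.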
