How do I use plyr to number rows?

被刻印的时光 ゝ 提交于 2019-12-17 17:52:36

问题


Basically I want an autoincremented id column based on my cohorts - in this case .(kmer, cvCut)

    > myDataFrame
       size kmer cvCut   cumsum
1      8132   23    10     8132
10000   778   23    10 13789274
30000   324   23    10 23658740
50000   182   23    10 28534840
100000   65   23    10 33943283
200000   25   23    10 37954383
250000  584   23    12 16546507
300000  110   23    12 29435303
400000   28   23    12 34697860
600000  127   23     2 47124443
600001  127   23     2 47124570

I want a column added that has new row names based on the kmer/cvCut group

    > myDataFrame
       size kmer cvCut   cumsum  newID
1      8132   23    10     8132      1
10000   778   23    10 13789274      2
30000   324   23    10 23658740      3
50000   182   23    10 28534840      4
100000   65   23    10 33943283      5 
200000   25   23    10 37954383      6
250000  584   23    12 16546507      1
300000  110   23    12 29435303      2
400000   28   23    12 34697860      3
600000  127   23     2 47124443      1
600001  127   23     2 47124570      2

回答1:


I'd do it like this:

library(plyr)
ddply(df, c("kmer", "cvCut"), transform, newID = seq_along(kmer))



回答2:


Just add a new column each time plyr calls you:

R> DF <- data.frame(kmer=sample(1:3, 50, replace=TRUE), \
                    cvCut=sample(LETTERS[1:3], 50, replace=TRUE))
R> library(plyr)
R> ddply(DF, .(kmer, cvCut), function(X) data.frame(X, newId=1:nrow(X)))
   kmer cvCut newId
1     1     A     1
2     1     A     2
3     1     A     3
4     1     A     4
5     1     A     5
6     1     A     6
7     1     A     7
8     1     A     8
9     1     A     9
10    1     A    10
11    1     A    11
12    1     B     1
13    1     B     2
14    1     B     3
15    1     B     4
16    1     B     5
17    1     B     6
18    1     C     1
19    1     C     2
20    1     C     3
21    2     A     1
22    2     A     2
23    2     A     3
24    2     A     4
25    2     A     5
26    2     B     1
27    2     B     2
28    2     B     3
29    2     B     4
30    2     B     5
31    2     B     6
32    2     B     7
33    2     C     1
34    2     C     2
35    2     C     3
36    2     C     4
37    3     A     1
38    3     A     2
39    3     A     3
40    3     A     4
41    3     B     1
42    3     B     2
43    3     B     3
44    3     B     4
45    3     C     1
46    3     C     2
47    3     C     3
48    3     C     4
49    3     C     5
50    3     C     6
R> 



回答3:


I think that this is what you want:

Load the data:

x <- read.table(textConnection(
"id      size kmer cvCut   cumsum
1      8132   23    10     8132
10000   778   23    10 13789274
30000   324   23    10 23658740
50000   182   23    10 28534840
100000   65   23    10 33943283
200000   25   23    10 37954383
250000  584   23    12 16546507
300000  110   23    12 29435303
400000   28   23    12 34697860
600000  127   23     2 47124443
600001  127   23     2 47124570"), header=TRUE)

Use ddply:

library(plyr)
ddply(x, .(kmer, cvCut), function(x) cbind(x, 1:nrow(x)))


来源:https://stackoverflow.com/questions/2187448/how-do-i-use-plyr-to-number-rows

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!