Question
I would appreciate help writing a function that creates categories of one variable based on the order in which each combination of values of a set of other variables appears.
Specifically, I want a function that:
- creates category E1 of the variable "variable" the first time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E2 of the variable "variable" the second time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E3 of the variable "variable" the third time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category En of the variable "variable" the nth time that each combination of values of the variables A, B, and ID appears in the dataset.
#sample data:
library(data.table)  # provides melt() for data.table objects
rowdT <- structure(list(
  A = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
  B = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
  ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
  E = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
        0.40694392882818, 0.779607277916503, 0.550579323666347,
        0.352622183880119, 0.690660491345867, 0.23378944873769)),
  class = c("data.table", "data.frame"), row.names = c(NA, -9L))
# reshape to long format: one row per (A, B, ID) observation of E
sampleDT <- melt(rowdT, id.vars = c("A", "B", "ID"))
#input data:
    A  B ID variable     value
1: a1 b2  3        E 0.6211421
2: a2 b2  4        E 0.7421095
3: a1 b2  3        E 0.3943915
4: a1 b1  1        E 0.4069439
5: a2 b2  4        E 0.7796073
6: a1 b2  3        E 0.5505793
7: a1 b1  1        E 0.3526222
8: a2 b2  4        E 0.6906605
9: a1 b1  1        E 0.2337894
#expected output:
    A  B ID variable     value
4: a1 b1  1       E1 0.4069439
1: a1 b2  3       E1 0.6211421
2: a2 b2  4       E1 0.7421095
7: a1 b1  1       E2 0.3526222
3: a1 b2  3       E2 0.3943915
5: a2 b2  4       E2 0.7796073
9: a1 b1  1       E3 0.2337894
6: a1 b2  3       E3 0.5505793
8: a2 b2  4       E3 0.6906605
Thanks in advance for any help.
Answer 1:
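melt() returns variable as a factor by default, so first convert it to a character vector for proper coercion, and then use data.table to number the occurrences within each group:
# convert the factor column to character so new labels can be assigned
sampleDT[, variable := as.character(variable)]
# append the within-group occurrence number (1, 2, ..., .N) to the label
sampleDT[, variable := paste0(variable, seq_len(.N)), by = c("A", "B", "ID")]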
This numbers the rows within each observed combination of A, B, and ID, so the first occurrence is labelled E1, the second E2, and so on.
This gets the following output:
    A  B ID variable     value
1: a1 b2  3       E1 0.6211421
2: a2 b2  4       E1 0.7421095
3: a1 b2  3       E2 0.3943915
4: a1 b1  1       E1 0.4069439
5: a2 b2  4       E2 0.7796073
6: a1 b2  3       E3 0.5505793
7: a1 b1  1       E2 0.3526222
8: a2 b2  4       E3 0.6906605
9: a1 b1  1       E3 0.2337894
which you can reorder if necessary.
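Since the question asks for a function and for rows grouped by category, here is a minimal sketch that wraps the labelling step and then reorders the result. The helper name label_occurrences, its group_cols argument, and the temporary occ column are assumptions made for illustration; they are not part of the original answer. It sorts on the numeric occurrence counter rather than on the label text, so that E10 would sort after E9.

```r
library(data.table)

# Hypothetical helper (not from the original answer): label the n-th
# occurrence of each group as <variable><n>, then sort by occurrence.
label_occurrences <- function(dt, group_cols = c("A", "B", "ID")) {
  out <- copy(dt)                              # leave the input untouched
  out[, variable := as.character(variable)]    # melt() returns a factor
  out[, occ := seq_len(.N), by = group_cols]   # 1, 2, ... within each group
  out[, variable := paste0(variable, occ)]     # E -> E1, E2, ...
  setorderv(out, c("occ", group_cols))         # numeric sort on the counter
  out[, occ := NULL]                           # drop the temporary column
  out[]
}

# Same sample data as in the question
rowdT <- data.table(
  A  = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
  B  = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
  ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
  E  = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
         0.40694392882818, 0.779607277916503, 0.550579323666347,
         0.352622183880119, 0.690660491345867, 0.23378944873769)
)
sampleDT <- melt(rowdT, id.vars = c("A", "B", "ID"))

result <- label_occurrences(sampleDT)
print(result)
```

With the sample data this should give the E1 rows first, then E2, then E3, each block sorted by A, B, and ID, which is the row order shown in the expected output. The helper works on a copy, so sampleDT itself is not modified.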
Source: https://stackoverflow.com/questions/54488761/how-to-create-categories-conditionally-using-other-variables-values-and-sequence