Question
I need to calculate the mode of an identity number for each age group. Suppose we have the following table:
library(data.table)
DT = data.table(age=c(12,12,3,3,12),v=rnorm(5), number=c("122","125","5","5","122"))
So I created a function:
g <- function(number) {
ux <- unique(number)
ux[which.max(tabulate(match(number, ux)))]
}
H<-function(tabla){data.frame(MODA=g, count=nrow(tabla))}
clasif_edad1<-ddply(DF,.(age), H)
View(clasif_edad1)
But I get the following error:
Error: arguments imply differing number of rows: 0, 1
The output should be:
age  v            number  moda
12   0,631152199  122     122
12   0,736648714  125     122
3    0,545921527  5       5
3    0,59336284   5       5
12   0,836685437  122     122
I don't know what the problem is.
Thanks
Answer 1:
One approach:
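> # Take the name of the most frequent entry; table() sorts its levels, so its positions need not line up with unique(x)
> myfun <- function(x) names(which.max(table(x)))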
> DT[ , moda := myfun(number), by = age]
> DT
age v number moda
1: 12 -0.9740026 122 122
2: 12 0.6893727 125 122
3: 3 -0.9558391 5 5
4: 3 -1.2317071 5 5
5: 12 -0.9568919 122 122
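Note that a `which.max`-based helper returns a single value, so when two values are tied for the top count only the first maximum is reported. The sketch below illustrates this in base R; `all_modes` and the vector `x` are hypothetical names made up for the illustration, not part of the original answer.

```r
# Hypothetical helper (not from the original thread): return every value
# that is tied for the highest count, in order of first appearance.
all_modes <- function(x) {
  ux <- unique(x)
  counts <- tabulate(match(x, ux))
  ux[counts == max(counts)]
}

# Made-up example where "125" and "122" both appear twice.
x <- c("125", "122", "125", "122", "5")
ux <- unique(x)

# Single-mode approach (same logic as g in the question): first maximum wins.
first_mode <- ux[which.max(tabulate(match(x, ux)))]
first_mode

# All tied modes.
all_modes(x)
```

Since `all_modes` can return more than one value, it does not drop straight into a one-value-per-group column; some further choice (for example, collapsing with `paste`) would be needed.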
Answer 2:
You can use dplyr for this:
library(dplyr)
# g() is the mode function defined in the question
modes_by_age <- summarise(group_by(DT, age), group_mode = g(number))
inner_join(DT, modes_by_age, by = "age")
This gives your desired output:
Source: local data table [5 x 4]
age v number group_mode
1 3 0.5524352 5 5
2 3 0.2869912 5 5
3 12 0.8987475 122 122
4 12 0.9740715 125 122
5 12 2.5058450 122 122
Answer 3:
Here's a base R solution. You could compute the mode for each group and then merge with your original data:
merge(DT, setNames(aggregate(number~age, data=DT, g), c("age", "moda")), by="age")
# age v number moda
# 1: 3 1.7148357 5 5
# 2: 3 0.9504811 5 5
# 3: 12 -0.7648237 122 122
# 4: 12 0.9011115 125 122
# 5: 12 -0.8718779 122 122
There may be a data.table-specific approach, but this would work even if DT were a data frame.
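If the merge step is not needed, base R's `ave()` may be enough, since it applies a function per group and writes the result back in the original row order. A minimal sketch, assuming `g` is the mode function from the question and `DF` is a hypothetical plain data frame with the same columns (the `v` values are regenerated here, so they will differ from the output above):

```r
# Mode function as defined in the question.
g <- function(number) {
  ux <- unique(number)
  ux[which.max(tabulate(match(number, ux)))]
}

# Hypothetical plain data frame mirroring the question's table.
set.seed(1)
DF <- data.frame(age = c(12, 12, 3, 3, 12),
                 v = rnorm(5),
                 number = c("122", "125", "5", "5", "122"),
                 stringsAsFactors = FALSE)

# ave() splits number by age, applies g to each group, and recycles the
# single result over the rows of that group. FUN must be passed by name.
DF$moda <- ave(DF$number, DF$age, FUN = g)
DF
```

This keeps the rows in their original order, whereas `merge` sorts by the key.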
Answer 4:
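# Mode by scanning runs of equal values.
# V must be sorted so that equal values sit next to each other.
modef <- function(V)
{
  if (length(V) == 0) return(NULL)
  best_value <- V[1]  # value of the longest run seen so far
  best_count <- 0     # length of that run
  k <- 0              # length of the current run
  prev <- V[1]
  for (i in V)
  {
    if (i == prev)
    {
      k <- k + 1
    }
    else
    {
      k <- 1
    }
    # Compare after every element so the final run is counted too
    if (k > best_count)
    {
      best_count <- k
      best_value <- i
    }
    prev <- i
  }
  best_value
}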
V = c(11, 11, 11, 11, 12, 12, 2, 2, 2, 2, 2, 2, 14, 14, 14, 15, 16, 17, 17, 17 ,17 ,
17, 18, 19)
modef(sort(V))
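The same run-counting idea can probably be written without an explicit loop using base R's `rle()`, which returns the lengths and values of consecutive runs. A minimal sketch; `modef_rle` is a hypothetical name, and on ties it returns the smallest value because `which.max` picks the first maximum of the sorted runs:

```r
# Hypothetical loop-free variant of the run-counting approach.
modef_rle <- function(V) {
  r <- rle(sort(V))               # runs of equal values in the sorted input
  r$values[which.max(r$lengths)]  # value belonging to the longest run
}

V <- c(11, 11, 11, 11, 12, 12, 2, 2, 2, 2, 2, 2, 14, 14, 14, 15, 16, 17, 17, 17, 17,
       17, 18, 19)
modef_rle(V)
```

Sorting happens inside the function here, so the caller does not need to pre-sort.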
Source: https://stackoverflow.com/questions/25791018/mode-in-r-by-groups