Question
I have a problem in R, which I can't seem to solve.
I have the following dataframe:
Image 1
I would like to:
- Find the unique combinations of the columns 'Species' and 'Effects'
- Report the concentration belonging to this unique combination
- If this unique combination is present more than one time, calculate the mean concentration
And would like to get the following dataframe:
Image 2
I tried the following script to get the unique combinations:
UniqueCombinations <- Data[!duplicated(Data[,1:2]),]
but I don't know how to proceed from there.
Thanks in advance for your answers!
Tina
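The `duplicated()` attempt above keeps only the first row of each combination, so the reported concentration is that of the first occurrence, not the mean. One possible way to carry that approach through in base R is to overwrite each concentration with its group mean via `ave()` before dropping duplicates. This is a minimal sketch: the original data frame was only posted as an image, so `Data` below is a hypothetical stand-in with assumed values, and the column names `Species`, `Effect` and `Concentration` are assumptions.

```r
# Hypothetical stand-in for the data frame shown in the question (values assumed)
Data <- data.frame(
  Species = c(rep("A", 4), "B", rep("C", 3), "D", "D"),
  Effect = c(rep("Reproduction", 3), rep("Growth", 2),
             "Reproduction", rep("Mortality", 2), rep("Growth", 2)),
  Concentration = c(1.2, 1.4, 1.3, 1.5, 1.6, 1.2, 1.1, 1, 1.3, 1.4)
)

# Replace each concentration by the mean of its Species/Effect group
# (ave() uses FUN = mean by default)
Data$Concentration <- ave(Data$Concentration, Data$Species, Data$Effect)

# Every duplicate row now carries the group mean, so keeping the first is enough
UniqueCombinations <- Data[!duplicated(Data[, 1:2]), ]
UniqueCombinations
```

Note that this overwrites the `Concentration` column in `Data`; work on a copy if the raw values are still needed.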
Answer 1:
Try the following (thanks to Brandon Bertelsen for the helpful comment):
Creating your data:
foo = data.frame(Species=c(rep("A",4),"B",rep("C",3),"D","D"),
Effect=c(rep("Reproduction",3), rep("Growth",2),
"Reproduction", rep("Mortality",2), rep("Growth",2)),
Concentration=c(1.2,1.4,1.3,1.5,1.6,1.2,1.1,1,1.3,1.4))
Using the great plyr package for a bit of magic :)
library(plyr)
ddply(foo, .(Species,Effect), function(x) mean(x[,"Concentration"]))
And here is a slightly more complicated but cleaner version (thanks again to Brandon Bertelsen):
ddply(foo, .(Species,Effect), summarize, mean=mean(Concentration))
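Conceptually, `ddply` splits the data frame by the grouping columns, applies a function to each piece, and combines the results. The following base R sketch illustrates that split-apply-combine idea on the same data. It is only an illustration, not how plyr is implemented, and the names `pieces`, `rows` and `result` are made up for this example.

```r
foo <- data.frame(
  Species = c(rep("A", 4), "B", rep("C", 3), "D", "D"),
  Effect = c(rep("Reproduction", 3), rep("Growth", 2),
             "Reproduction", rep("Mortality", 2), rep("Growth", 2)),
  Concentration = c(1.2, 1.4, 1.3, 1.5, 1.6, 1.2, 1.1, 1, 1.3, 1.4)
)

# Split: one piece per Species/Effect combination that actually occurs
pieces <- split(foo, list(foo$Species, foo$Effect), drop = TRUE)

# Apply: reduce each piece to a one-row data frame holding the group mean
rows <- lapply(pieces, function(x) {
  data.frame(Species = x$Species[1],
             Effect = x$Effect[1],
             mean = mean(x$Concentration))
})

# Combine: stack the one-row results back into a single data frame
result <- do.call(rbind, rows)
rownames(result) <- NULL
result
```

The row order of `result` follows the order in which `split` enumerates the groups, so it may differ from the order `ddply` returns.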
Answer 2:
Create some example data:
dat <- data.frame(Species = rep.int(LETTERS[1:4], c(4, 1, 3, 2)),
Effect = c(rep("Reproduction", 3), "Growth", "Growth",
"Reproduction", "Mortality", "Mortality",
"Growth", "Growth"),
Concentration = rnorm(10))
You can use the function aggregate:
aggregate(Concentration ~ Species + Effect, dat, mean)
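Because the example data above draws its concentrations with `rnorm(10)`, the output changes on every run. For a reproducible check, here is a small sketch of the same `aggregate` call on fixed values (the concentrations used in the first answer). The name `dat_fixed` is introduced here purely for illustration.

```r
# Same layout as above, but with fixed concentrations instead of rnorm(10)
dat_fixed <- data.frame(
  Species = rep.int(LETTERS[1:4], c(4, 1, 3, 2)),
  Effect = c(rep("Reproduction", 3), "Growth", "Growth",
             "Reproduction", "Mortality", "Mortality",
             "Growth", "Growth"),
  Concentration = c(1.2, 1.4, 1.3, 1.5, 1.6, 1.2, 1.1, 1, 1.3, 1.4)
)

# Left of ~ is the value to summarise, right of ~ are the grouping columns
res <- aggregate(Concentration ~ Species + Effect, data = dat_fixed, FUN = mean)

# Sort by Species, then Effect, for easier reading
res <- res[order(res$Species, res$Effect), ]
res
```

The formula interface returns one row per combination that occurs in the data, so no empty Species/Effect pairs appear in `res`.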
Answer 3:
Just for fun before I call it a night.... Assuming your data.frame is called "dat", here are two more options:
A data.table solution.
library(data.table)
datDT <- data.table(dat, key="Species,Effect")
datDT[, list(Concentration = mean(Concentration)), by = key(datDT)]
#    Species       Effect Concentration
# 1:       A       Growth          1.50
# 2:       A Reproduction          1.30
# 3:       B       Growth          1.60
# 4:       C    Mortality          1.05
# 5:       C Reproduction          1.20
# 6:       D       Growth          1.35
An sqldf solution.
library(sqldf)
sqldf("select Species, Effect, avg(Concentration) `Concentration`
       from dat group by Species, Effect")
#   Species       Effect Concentration
# 1       A       Growth          1.50
# 2       A Reproduction          1.30
# 3       B       Growth          1.60
# 4       C    Mortality          1.05
# 5       C Reproduction          1.20
# 6       D       Growth          1.35
Source: https://stackoverflow.com/questions/13017511/find-unique-combinations-based-on-two-columns-and-calculate-the-mean