Multiply various subsets of a data frame by different vectors

Submitted by 送分小仙女 on 2020-01-12 08:43:13

Question


I would like to multiply several columns in my data frame by a vector of values. The specific vector of values changes depending on the value in another column.

--EDIT--

What if I make the data set more complicated, i.e., more than 2 conditions and the conditions are randomly shuffled around the data set?

Here is an example of my data set:

df=data.frame(
  Treatment=(rep(LETTERS[1:4],each=2)),
  Species=rep(1:4,each=2),
  Value1=c(0,0,1,3,4,2,0,0),
  Value2=c(0,0,3,4,2,1,4,5),
  Value3=c(0,2,4,5,2,1,4,5),
  Condition=c("A","B","A","C","B","A","B","C")
  )

Which looks like:

 Treatment Species Value1 Value2 Value3 Condition
     A       1      0      0      0         A
     A       1      0      0      2         B 
     B       2      1      3      4         A
     B       2      3      4      5         C
     C       3      4      2      2         B
     C       3      2      1      1         A
     D       4      0      4      4         B
     D       4      0      5      5         C

If Condition=="A", I would like to multiply columns 3-5 by the vector c(1,2,3). If Condition=="B", I would like to multiply columns 3-5 by the vector c(4,5,6). If Condition=="C", I would like to multiply columns 3-5 by the vector c(0,1,0). The resulting data frame would therefore look like this:

 Treatment Species Value1 Value2 Value3 Condition
     A       1      0      0      0         A
     A       1      0      0     12         B 
     B       2      1      6     12         A
     B       2      0      4      0         C
     C       3     16     10     12         B
     C       3      2      2      3         A
     D       4      0     20     24         B
     D       4      0      5      0         C

I have tried subsetting the data frame and multiplying by the vector:

t(t(subset(df[,3:5],df[,6]=="A")) * c(1,2,3))

But I can't return the subsetted data frame to the original. Is there any way to perform this operation without subsetting the data frame, so that other columns (e.g., Treatment, Species) are preserved?


Answer 1:


Here's a fairly general solution that you should be able to adapt to fit your needs.

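Note that the first argument in each outer call is a logical vector and the second is numeric, so TRUE and FALSE are converted to 1 and 0, respectively, before multiplication. Each outer call therefore yields a matrix with one row per row of df: the multiplier vector where the condition holds, and zeros elsewhere. We can add the outer results because the conditions are non-overlapping, so every row receives the vector from exactly one term.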

multiples <-
  outer(df$Condition=="A",c(1,2,3)) +
  outer(df$Condition=="B",c(4,5,6)) +
  outer(df$Condition=="C",c(0,1,0))

df[,3:5] <- df[,3:5] * multiples
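
With more conditions, the outer terms can be built from a lookup list instead of being written out by hand. The following is a minimal sketch rather than part of the original answer: the name vecs is an assumption introduced for illustration, and df is the example data from the question.

```r
# Example data from the question
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

# Hypothetical lookup list: one multiplier vector per condition
vecs <- list(A = c(1, 2, 3), B = c(4, 5, 6), C = c(0, 1, 0))

# One outer() term per condition, summed into a single nrow(df) x 3 matrix
multiples <- Reduce(`+`, lapply(names(vecs), function(k) {
  outer(df$Condition == k, vecs[[k]])
}))

df[, 3:5] <- df[, 3:5] * multiples
df
```

A row whose Condition does not appear in vecs would get an all-zero row in multiples, so its values would be zeroed rather than left unchanged.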



Answer 2:


Here's a non-vectorized, but easy to understand solution:

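 # v is one row of df, passed by apply() as a character vector
 replaceFunction <- function(v){
   m <- as.numeric(v[3:5])
   if (v[6]=="A")
     out <- m * c(1,2,3)
   else if (v[6]=="B")
     out <- m * c(4,5,6)
   else if (v[6]=="C")
     out <- m * c(0,1,0)
   else
     out <- m   # any other condition: leave the values unchanged
   return(out)
 }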

 g <- apply(df, 1, replaceFunction)
 df[3:5] <- t(g)
 df



Answer 3:


Edited to reflect some notes from the comments

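Assuming that Condition is a factor (since R 4.0.0, data.frame() no longer converts strings to factors by default, so you may need df$Condition <- factor(df$Condition) first), you could do this: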

#Modified to reflect OP's edit - the same solution works just fine
m <- matrix(c(1:6,0,1,0),3,3,byrow = TRUE)
df[,3:5] <- with(df,df[,3:5] * m[Condition,])

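which makes use of fairly quick vectorized multiplication: indexing m by the factor uses its integer codes, so m[Condition,] has one row of multipliers for each row of df. Wrapping this in with isn't strictly necessary (you could write m[df$Condition,] instead), it's just what popped out of my brain. Also note the subsetting comment below by Backlin.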
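
If Condition is stored as a character column instead of a factor, one possible variation is to give the multiplier matrix row names and index it by name. This is a sketch under that assumption, not the answerer's code; m_named is a hypothetical name, and df is the example data from the question.

```r
# Example data from the question (Condition is character in R >= 4.0.0)
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

# Hypothetical row-named multiplier matrix: one row per condition
m_named <- rbind(A = c(1, 2, 3), B = c(4, 5, 6), C = c(0, 1, 0))

# as.character() makes the lookup go by name even if Condition is a factor
df[, 3:5] <- df[, 3:5] * m_named[as.character(df$Condition), ]
df
```

Looking rows up by name does not depend on the order of factor levels, but a Condition value with no matching row name raises a "subscript out of bounds" error.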

More generally, remember that any subsetting you can do with subset you can also do with [, and crucially, [ supports assignment via [<-. So if you want to alter a portion of a data frame or matrix, you can always use this type of idiom:

df[rowCondition,colCondition] <- <replacement values>

assuming of course that <replacement values> has the same dimensions as your subset of df. It may work otherwise, but you will run afoul of R's recycling rules and R may kick back a warning.
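
Applied to the question, the idiom might look like the sketch below, which updates the rows for one condition at a time. It is an illustration rather than code from the original answer: vecs is a hypothetical lookup list, sweep() is one of several ways to multiply each column by its own value, and df is the example data from the question.

```r
# Example data from the question
df <- data.frame(
  Treatment = rep(LETTERS[1:4], each = 2),
  Species   = rep(1:4, each = 2),
  Value1    = c(0, 0, 1, 3, 4, 2, 0, 0),
  Value2    = c(0, 0, 3, 4, 2, 1, 4, 5),
  Value3    = c(0, 2, 4, 5, 2, 1, 4, 5),
  Condition = c("A", "B", "A", "C", "B", "A", "B", "C")
)

# Hypothetical lookup list: one multiplier vector per condition
vecs <- list(A = c(1, 2, 3), B = c(4, 5, 6), C = c(0, 1, 0))

for (k in names(vecs)) {
  rows <- df$Condition == k   # the rowCondition
  # Multiply columns 3-5 of the matching rows column-wise by vecs[[k]]
  # and assign the result back into the same block of df
  df[rows, 3:5] <- sweep(df[rows, 3:5], 2, vecs[[k]], `*`)
}
df
```

Each row belongs to exactly one condition, so no row is multiplied twice, and the Treatment and Species columns are never touched.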




Answer 4:


df[3:5] <- df[3:5] * t(sapply(df$Condition, function(x) if(x=="B") 4:6 else 1:3))

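Or with matrix multiplication (like the line above, this covers only the original two-condition case, where every row that is not "B" is multiplied by 1:3):

df[3:5] <- df[3:5] * (3*(df$Condition == "B") %*% matrix(1, 1, 3)
                      + matrix(1:3, nrow(df), 3, byrow=TRUE))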


Source: https://stackoverflow.com/questions/6878899/multiply-various-subsets-of-a-data-frame-by-different-vectors
