Multiply rows (with row names) in one data frame with matching column names in another

Posted by 天大地大妈咪最大 on 2019-12-14 03:56:38

Question


I have two data frames:

df1 <- data.frame(Values=c(0.01,0.05), row.names=c("X", "Y"))
df1
  Values
X   0.01
Y   0.05

df2 <-data.frame(c(0,1,1), c(1,0,0), c(1,1,1))
colnames(df2) <- c("X","Y","Z")

df2
  X Y Z
1 0 1 1
2 1 0 1
3 1 0 1

I wish to perform a rowwise operation on df2, where I multiply every column in df2 with its corresponding row in df1 and then perform a summation.

For example, for row 1 of df2, I wish to calculate:

df2 %>% rowwise %>% mutate(newVAL=(df1["X",]*df2[1,"X"])+(df1["Y",]*df2[1,"Y"]))

while excluding columns that don't match (rows in df1) or have NAs.
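
For reference, the row-1 calculation described above can be written out by hand in base R. This is only a sketch of the arithmetic on the example data, not a general solution:

```r
# Example data from the question
df1 <- data.frame(Values = c(0.01, 0.05), row.names = c("X", "Y"))
df2 <- data.frame(X = c(0, 1, 1), Y = c(1, 0, 0), Z = c(1, 1, 1))

# Weighted sum for row 1: only X and Y have a weight in df1, so Z is ignored
row1 <- df1["X", "Values"] * df2[1, "X"] + df1["Y", "Values"] * df2[1, "Y"]
row1
# 0.05
```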

I have several thousands of rows in df1 and several thousands of rows and columns in df2.

Any help is much appreciated!!

PS. I have implemented this in Perl using hashes and was using the system() call to perform these calculations within an Rmarkdown document. In order to keep it completely reproducible, I am trying to redo it in R. Happy to share the Perl code if necessary.

Thanks.


Answer 1:


If I understand correctly, it looks like you need sweep.

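# Select the df2 columns named after the rows of df1, then scale each column by its weight
df3 <- sweep(df2[, rownames(df1)], 2, t(df1), '*')
# Sum the weighted columns row by row
df3$total <- rowSums(df3)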
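
Note that `df2[, rownames(df1)]` assumes every row name of df1 exists as a column in df2. If that may not hold, one possible guard is to restrict both sides to the shared names first. The sketch below is an assumption about how to do that, not part of the original answer; `common`, `weighted`, and `total` are illustrative names:

```r
# Example data from the question
df1 <- data.frame(Values = c(0.01, 0.05), row.names = c("X", "Y"))
df2 <- data.frame(X = c(0, 1, 1), Y = c(1, 0, 0), Z = c(1, 1, 1))

# Names present both as rows of df1 and as columns of df2
common <- intersect(rownames(df1), colnames(df2))

# Scale each shared column by its weight; drop = FALSE keeps a data frame
# even when only one name is shared
weighted <- sweep(df2[, common, drop = FALSE], 2, df1[common, "Values"], `*`)

# Row totals, ignoring any NA cells
total <- rowSums(weighted, na.rm = TRUE)
total
```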



Answer 2:


Here's an attempt in base R matching the rows to the columns between the two sets:

rowSums(
  sweep(df2,
        MARGIN=2,
        STATS=df1$Values[match(colnames(df2), rownames(df1))],
        FUN=`*`),
  na.rm=TRUE
)
#[1] 0.05 0.01 0.01
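
Since the result is a weighted sum per row, the same calculation can also be expressed as a matrix product. The following is a sketch of that alternative, not something the answers above propose. It assumes that NA cells should count as zero, which mirrors `na.rm=TRUE`; the names `common`, `m`, `w`, and `total_mm` are illustrative:

```r
# Example data from the question
df1 <- data.frame(Values = c(0.01, 0.05), row.names = c("X", "Y"))
df2 <- data.frame(X = c(0, 1, 1), Y = c(1, 0, 0), Z = c(1, 1, 1))

# Restrict to names that appear in both data frames
common <- intersect(colnames(df2), rownames(df1))

# Numeric matrix of the shared columns; NA cells contribute nothing
m <- as.matrix(df2[, common, drop = FALSE])
m[is.na(m)] <- 0

# Weights in the same order as the columns of m; NA weights contribute nothing
w <- df1[common, "Values"]
w[is.na(w)] <- 0

# (rows x k) %*% (k) gives one weighted sum per row; drop() returns a plain vector
total_mm <- drop(m %*% w)
total_mm
```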



Answer 3:


We can also use rep to repeat each value in df1 once per row of df2, so that the weights line up with the selected columns, then multiply and take the rowSums. This avoids the overhead of sweep and is likely to be faster.

rowSums(df2[rownames(df1)] * rep(df1$Values, each = nrow(df2)))
#[1] 0.05 0.01 0.01
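
Whether rep is actually faster depends on the data, so it may be worth checking on something closer to the real size. The sketch below uses simulated data (the dimensions and the names `big1`, `big2`, `res_sweep`, `res_rep` are made up for illustration) to confirm that both approaches agree and to compare elapsed times. No timings are shown because they vary by machine:

```r
set.seed(1)

# Simulated stand-ins for the real data; sizes are arbitrary assumptions
n_rows <- 2000
n_cols <- 500
big2 <- as.data.frame(matrix(rbinom(n_rows * n_cols, 1, 0.5), nrow = n_rows))
colnames(big2) <- paste0("V", seq_len(n_cols))
big1 <- data.frame(Values = runif(n_cols), row.names = colnames(big2))

# sweep-based version
t_sweep <- system.time(
  res_sweep <- rowSums(sweep(big2[rownames(big1)], 2, big1$Values, `*`))
)

# rep-based version
t_rep <- system.time(
  res_rep <- rowSums(big2[rownames(big1)] * rep(big1$Values, each = nrow(big2)))
)

# Both approaches should give the same totals
all.equal(res_sweep, res_rep)

# Elapsed seconds for each approach
c(sweep = t_sweep[["elapsed"]], rep = t_rep[["elapsed"]])
```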

Or using the tidyverse packages

library(dplyr)
library(purrr)
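df2 %>% 
     select(all_of(rownames(df1))) %>%   # select_() is deprecated; keep only the columns named in df1
     map2(df1$Values, `*`) %>%           # multiply each column by its weight
     reduce(`+`)                         # add the weighted columns element-wise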
#[1] 0.05 0.01 0.01

Update

If we need it as a column,

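df2 %>% 
    select(all_of(rownames(df1))) %>%   # select_() is deprecated; keep only the columns named in df1
    map2(df1$Values, `*`) %>%
    reduce(`+`) %>%
    mutate(df2, total = .)              # attach the totals to df2 as a new column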
#  X Y Z total
#1 0 1 1  0.05
#2 1 0 1  0.01
#3 1 0 1  0.01


Source: https://stackoverflow.com/questions/41708426/multiply-rows-with-row-names-in-one-data-frame-with-matching-column-names-in-a
