R: How to calculate mean for each row with missing values using dplyr

前端未结


I want to calculate means over several columns for each row in my dataframe containing missing values, and place results in a new column called 'means.' Here's my datafra


3 Answers

孤独总比滥情好

2021-01-06 20:11
```
df %>% 
  mutate(means=rowMeans(., na.rm=TRUE))
```
The . is a "pronoun" that references the data frame df that was piped into mutate.
```
  A B  C    means
1 3 0  9 4.000000
2 4 6 NA 5.000000
3 5 8  1 4.666667
```
You can also select only specific columns to include, using all the usual methods (column names, indices, grep, etc.).
```
df %>% 
  mutate(means=rowMeans(.[ , c("A","C")], na.rm=TRUE))
```
```
  A B  C means
1 3 0  9     6
2 4 6 NA     4
3 5 8  1     3
```
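As a rough sketch of the "usual methods" mentioned above, the same two columns can be picked by name, by position, or by a pattern match on the column names. The data frame here is reconstructed from the printed output in this answer (the question's original data is cut off), so treat it as an assumption.

```r
# Data frame rebuilt from the output shown above (an assumption,
# since the question's own data is truncated).
df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# Select columns A and C by name
by_name <- rowMeans(df[, c("A", "C")], na.rm = TRUE)

# Select the same columns by position
by_index <- rowMeans(df[, c(1, 3)], na.rm = TRUE)

# Select the same columns by matching their names with a regular expression
by_grep <- rowMeans(df[, grep("^[AC]$", names(df))], na.rm = TRUE)

print(by_name)
```

Note that selecting a single column this way returns a plain vector, which rowMeans rejects; adding drop = FALSE inside the brackets keeps it a data frame.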

刺人心

2021-01-06 20:18
It is simple to accomplish in base R as well:
```
cbind(df, "means"=rowMeans(df, na.rm=TRUE))
  A B  C    means
1 3 0  9 4.000000
2 4 6 NA 5.000000
3 5 8  1 4.666667
```
rowMeans performs the calculation and accepts the na.rm argument to skip missing values, while cbind binds the resulting means, under whatever name you choose, to the data frame df.
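To make the role of na.rm concrete, here is a minimal sketch comparing the default behaviour with na.rm = TRUE. The data frame is reconstructed from the output printed above, and the variable names means_default, means_na_rm, and result are only illustrative.

```r
# Data frame rebuilt from the output shown above (an assumption,
# since the question's own data is truncated).
df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# Default (na.rm = FALSE): a row containing any NA gets NA as its mean
means_default <- rowMeans(df)

# na.rm = TRUE: NA values are dropped and the mean uses the remaining values
means_na_rm <- rowMeans(df, na.rm = TRUE)

# cbind() returns a new data frame; df is unchanged unless you reassign it
result <- cbind(df, means = means_na_rm)
print(result)
```

If the new column should live in df itself, assign the result back, for example df <- cbind(df, means = means_na_rm).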


死守一世寂寞

2021-01-06 20:22

Regarding the error in the OP's code: mean takes only a single data argument, so we can use the concatenate function c to combine those elements into a single vector and then take the mean of that vector.

```
df %>%
    rowwise() %>% 
    mutate(means = mean(c(A, B, C), na.rm = TRUE))
#     A     B     C    means 
#  <dbl> <dbl> <dbl>    <dbl>
#1     3     0     9 4.000000
#2     4     6    NA 5.000000
#3     5     8     1 4.666667
```
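The point about mean taking a single data argument can be seen in isolation. The OP's code is not shown in this copy, so the sketch below assumes the error came from a call shaped like mean(A, B, C); in that form only the first value is averaged, and the remaining positional arguments are matched to trim and na.rm.

```r
# mean() averages only its first argument; later positional arguments
# are matched to trim and na.rm instead of being treated as data.
wrong <- mean(3, 0, 9)      # x = 3, trim = 0, na.rm = 9

# Wrapping the values in c() passes them as one vector
right <- mean(c(3, 0, 9))

# With a missing value, na.rm = TRUE drops it before averaging
with_na <- mean(c(4, 6, NA), na.rm = TRUE)

print(c(wrong = wrong, right = right, with_na = with_na))
```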

Also, we can use rowMeans with transform:

```
transform(df, means = rowMeans(df, na.rm = TRUE))
#  A B  C    means
#1 3 0  9 4.000000
#2 4 6 NA 5.000000
#3 5 8  1 4.666667
```

Or using data.table:

```
library(data.table)
setDT(df)[, means := rowMeans(.SD, na.rm = TRUE)]
```
