R: split string into numeric and return the mean as a new column in a data frame


I have a large data frame with columns that are a character string of numbers such as "1, 2, 3, 4". I wish to add a new column that is the average of these numbers. I have

3 Answers

南笙

2020-12-11 09:15

Try:

library(dplyr)
library(splitstackshape)

df %>%
  mutate(index = row_number()) %>%
  cSplit("a", direction = "long") %>%
  group_by(index) %>%
  summarise(mean = mean(a))

Which gives:

#Source: local data table [3 x 2]
#
#  index mean
#1     1  2.5
#2     2  5.0
#3     3  7.5

Or as per @Ananda's suggestion:

rowMeans(cSplit(df, "a"), na.rm = TRUE)
# [1] 2.5 5.0 7.5

If you want to keep the result in a data frame you could do:

df %>% mutate(mean = rowMeans(cSplit(., "a"), na.rm = TRUE))

Which gives:

#            a mean
#1  1, 2, 3, 4  2.5
#2  2, 4, 6, 8  5.0
#3 3, 6, 9, 12  7.5


小蘑菇

2020-12-11 09:16
You could use sapply to loop through the list returned by strsplit, handling each of the list elements:
```
sapply(strsplit(df$a, split = ", "), function(x) mean(as.numeric(x)))
# [1] 2.5 5.0 7.5
```
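
To store the result as a new column, as the question asks, the same idea can be assigned back to the data frame. This is a minimal base R sketch, not the answerer's own code: the sample `df` is an assumption reconstructed from the outputs shown in this thread, and `vapply` with a whitespace-tolerant split pattern is swapped in for `sapply` so the result is always a numeric vector.

```r
# Hypothetical sample data, reconstructed from the outputs shown in the answers
df <- data.frame(a = c("1, 2, 3, 4", "2, 4, 6, 8", "3, 6, 9, 12"),
                 stringsAsFactors = FALSE)

# Split each string on commas (allowing optional surrounding spaces),
# convert the pieces to numeric, and take the mean of each row
df$mean <- vapply(
  strsplit(df$a, "\\s*,\\s*"),
  function(x) mean(as.numeric(x), na.rm = TRUE),
  numeric(1)
)

df$mean
# [1] 2.5 5.0 7.5
```

If the column is stored as a factor rather than character, wrapping it as `as.character(df$a)` before splitting should be needed, since `strsplit` only accepts character input.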

粉色の甜心

2020-12-11 09:32

library(data.table)
cols <- paste0("a",1:4)
setDT(df)[, (cols) := tstrsplit(a, ",", fixed=TRUE, type.convert=TRUE)
        ][, .(Mean = rowMeans(.SD)), .SDcols = cols]
#    Mean
# 1:  2.5
# 2:  5.0
# 3:  7.5

Alternatively,

rowMeans(setDT(tstrsplit(df$a, ",", fixed=TRUE, type.convert=TRUE)))
# [1] 2.5 5.0 7.5
