Compute the mean of two columns in a dataframe

那年仲夏 提交于 2019-12-05 22:21:16

问题


I have a dataframe storing different values. Sample:

a$open  a$high  a$low   a$close

1.08648 1.08707 1.08476 1.08551
1.08552 1.08623 1.08426 1.08542
1.08542 1.08572 1.08453 1.08465
1.08468 1.08566 1.08402 1.08554
1.08552 1.08565 1.08436 1.08464
1.08463 1.08543 1.08452 1.08475
1.08475 1.08504 1.08427 1.08436
1.08433 1.08438 1.08275 1.08285
1.08275 1.08353 1.08275 1.08325
1.08325 1.08431 1.08315 1.08378
1.08379 1.08383 1.08275 1.08294
1.08292 1.08338 1.08271 1.08325

What I want to do, is creating a new column a$mean storing the mean of a$high and a$low for each row.

Here is how I achieved that:

highlowmean <- function(highs, lows){
  m <- vector(mode="numeric", length=0)
  for (i in 1:length(highs)){
    m[i] <- mean(highs[i], lows[i])
  }
  return(m)
}

a$mean <- highlowmean(a$high, a$low)

However I'm a bit new into R and in functionnal languages in general, so I'm pretty sure that there is a more efficient/simple way to achieve that.

How to achieve that the smartest way?


回答1:


We can use rowMeans

 a$mean <- rowMeans(a[c('high', 'low')], na.rm=TRUE)

NOTE: If there are NA values, it is better to use rowMeans

For example

 a <- data.frame(High= c(NA, 3, 2), low= c(3, NA, 0))
 rowMeans(a, na.rm=TRUE)    
 #[1] 3 3 1

and using +

 a1 <- replace(a, is.na(a), 0)
 (a1[1] + a1[2])/2
#  High
#1  1.5
#2  1.5
#3  1.0

NOTE: This is no way trying to tarnish the other answer. It works in most cases and is fast as well.




回答2:


For the mean of two numbers you don't really need any special functions:

a$mean = (a$high + a$low) / 2

For such an easy case, this avoids any conversions to matrix to use apply or rowMeans.



来源:https://stackoverflow.com/questions/33981527/compute-the-mean-of-two-columns-in-a-dataframe

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!