use multiple columns as variables with sapply

假如想象 提交于 2019-11-27 07:26:15

Try mapply():

qq <- mapply(minimum_distance, df$a, df$b, df$c)

try this:

do.call("mapply", c(list(minimum_distance), df))

but you can write vectorized version:

pminimum_distance <- function(a,b,c)
{
 dist1 <- abs(a-b)
 dist2 <- abs(a-c)
 dist3 <- abs(b-c)
 return(pmin(dist1,dist2,dist3))
}
pminimum_distance(df$a, df$b, df$c)

# or
do.call("pminimum_distance", df)

I know this has been answered but I'd actually take a different approach that takes any number of columns and is more generalizable using an outer approach:

vdiff <- function(x){
    y <- outer(x, x, "-")
    min(abs(y[lower.tri(y)]))
}

apply(df, 1, vdiff)

I think this is a little cleaner and flexible.

EDIT: Per zach's comments I propose this more formalized function that works on data frames with non numeric columns as well by removing them and acting only on the numeric columns.

cdif <- function(dataframe){
    df <- dataframe[, sapply(dataframe, is.numeric)]
    vdiff <- function(x){
        y <- outer(x, x, "-")
        min(abs(y[lower.tri(y)]))
    }
    return(apply(df, 1, vdiff))
}

#TEST it out
set.seed(10)
(df <- data.frame(a = sample(1:100, 10), b = sample(1:100, 10), 
    c = sample(1:100, 10), d =  LETTERS[1:10]))

cdif(df)

Its better to write a function and then use mapply on the vectors:

 f1 <- function(a,b,c){
 d =abs(a-b)
 e =abs(b-c)
 f= abs(c-a)
 return(pmin(d,e,f))
 }

 qq <- mapply(f1, df$a, df$b, df$c)
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!