Mahalanobis distance in R

灰色年华 2020-12-13 22:37

I have found the mahalanobis.dist function in the StatMatch package (http://cran.r-project.org/web/packages/StatMatch/StatMatch.pdf), but it isn't doing exactly what I want. It

6 answers
  •  刺人心 (original poster)
    2020-12-13 23:17

    You can wrap stats::mahalanobis as below to output a Mahalanobis distance matrix (pairwise Mahalanobis distances):

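    # x  - data frame (or numeric matrix), one observation per row
    # cx - covariance matrix; if not provided,
    #      it will be estimated from the data
    mah <- function(x, cx = NULL) {
      if (is.null(cx)) cx <- cov(x)
      out <- lapply(seq_len(nrow(x)), function(i) {
        # stats::mahalanobis returns *squared* distances, hence sqrt()
        sqrt(mahalanobis(x = x,
                         center = unlist(x[i, ]),
                         cov = cx))
      })
      # Stack the rows into a square matrix and convert it to a "dist" object
      as.dist(do.call("rbind", out))
    }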
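
An equivalent route, offered here as a minimal sketch and not as part of the original answer, is to whiten the data with the Cholesky factor of the covariance matrix and then take ordinary Euclidean distances. Keep in mind that stats::mahalanobis returns squared distances, whereas this version returns the distances themselves. The name mah_whiten is hypothetical and chosen only for illustration.

```r
# Hypothetical helper: pairwise Mahalanobis distances via whitening.
# x  - data frame or numeric matrix, one observation per row
# cx - covariance matrix; estimated from x when NULL
mah_whiten <- function(x, cx = NULL) {
  x <- as.matrix(x)
  if (is.null(cx)) cx <- cov(x)
  # chol() returns the upper-triangular R with t(R) %*% R == cx
  R <- chol(cx)
  # After z = x %*% solve(R), Euclidean distances between rows of z
  # equal Mahalanobis distances between the matching rows of x
  z <- x %*% solve(R)
  dist(z)
}
```

The result is a "dist" object, so it can be passed to hclust directly.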
    

    Then, you can cluster your data and plot it, for example:

    # Dummy data
    x <- data.frame(X = c(rnorm(10, 0), rnorm(10, 5)), 
                    Y = c(rnorm(10, 0), rnorm(10, 7)), 
                    Z = c(rnorm(10, 0), rnorm(10, 12)))
    rownames(x) <- LETTERS[1:20]
    plot(x, pch = LETTERS[1:20])
    

    # Compute the Mahalanobis distance matrix
    d <- mah(x)
    d
    
    # Cluster and plot
    hc <- hclust(d)
    plot(hc)
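
To go from the dendrogram to explicit cluster labels, one option is stats::cutree. The sketch below is self-contained and makes several assumptions: the seed is arbitrary, k = 2 is chosen only because the dummy data are simulated from two groups, and the distances are computed inline by whitening the data with the Cholesky factor of the covariance matrix.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility

# Dummy data: two groups of ten observations each
x <- data.frame(X = c(rnorm(10, 0), rnorm(10, 5)),
                Y = c(rnorm(10, 0), rnorm(10, 7)),
                Z = c(rnorm(10, 0), rnorm(10, 12)))
rownames(x) <- LETTERS[1:20]

# Pairwise Mahalanobis distances: Euclidean distances on whitened data
d <- dist(as.matrix(x) %*% solve(chol(cov(x))))

# hclust uses complete linkage unless `method` says otherwise
hc <- hclust(d)

# Cut the tree into k groups (k = 2 is an assumption for this example)
groups <- cutree(hc, k = 2)
table(groups)
```

Here groups is a named integer vector, with one cluster label per row of x.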
    
