Find outlier using z score

Posted by 不想你离开。 on 2021-02-10 20:23:23

Question


I am trying to create a function in R that finds outliers in a matrix using z-scores. The function should take two arguments: x, a matrix, and zs, an integer. For each row of the matrix, it should calculate the z-score of each element and, if the z-score is greater than zs or smaller than -zs, print that element. I know that I can use:

z <- (x - mean(x)) / sd(x)   or   z <- scale(x)

to calculate the z-scores, but as I am a beginner in programming, I find it difficult to apply this to a matrix.
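The two expressions in the question do not do the same thing on a matrix: `(x - mean(x)) / sd(x)` uses the mean and standard deviation of all elements together, while `scale(x)` standardises each column. A minimal sketch of row-wise z-scores, using a made-up matrix `m` (not from the question):

```r
# Hypothetical 2 x 3 example matrix (an assumption for illustration)
m <- matrix(c(1, 2, 3,
              10, 20, 60), nrow = 2, byrow = TRUE)

# scale() works on columns, so transpose before and after to work on rows
z_rows <- t(scale(t(m)))

# The same z-scores computed by hand for the first row
z_row1 <- (m[1, ] - mean(m[1, ])) / sd(m[1, ])

z_rows[1, ]   # -1 0 1
```

Both `z_rows[1, ]` and `z_row1` hold the z-scores of the first row relative to that row's own mean and standard deviation.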


Answer 1:


How about this code:

set.seed(1)
mat <- matrix(rnorm(100), ncol=10)
# apply() over rows returns one column per row, so transpose back to match mat
temp <- t(abs(apply(mat, 1, scale)))
# values whose row-wise |z| exceeds 2
mat[temp > 2]

I took 2 standard deviations as the z limit. First I create a random matrix, then scale it row by row (the '1' argument of apply). Since apply returns each row's result as a column, t() transposes it back so that temp lines up with mat. I apply abs to avoid testing both sides (< and >), since the test is symmetric. The last line gives you the outlier values. But you might also want to see where they are; just do:

image(temp > 2)

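If exact positions are more useful than a plot, `which()` with `arr.ind = TRUE` returns the row and column of each flagged element. A small sketch, assuming the same random matrix as above; the names `z` and `pos` are introduced here for illustration only:

```r
set.seed(1)
mat <- matrix(rnorm(100), ncol = 10)

# Row-wise z-scores; t() restores the original orientation after apply()
z <- t(apply(mat, 1, scale))

# Row and column index of every element whose |z| exceeds 2
pos <- which(abs(z) > 2, arr.ind = TRUE)

# Show the positions next to the original values
cbind(pos, value = mat[pos])
```

Indexing `mat` with the two-column matrix `pos` picks out exactly the elements at those (row, column) pairs.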

EDIT: If you need it as a function taking x and zs, I wrapped it:

outliers <- function(x, zs) {
  # row-wise |z|; t() restores the orientation of x after apply()
  temp <- t(abs(apply(x, 1, scale)))
  return(x[temp > zs])
}

# Example call:
# outliers(matrix(rnorm(100), ncol=10), 2)



Answer 2:


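myfun <- function(x, zs) {
    # row-wise z-scores; transpose so x1 has the same shape as x
    x1 <- t(apply(x, 1, scale))
    # TRUE where |z| exceeds the threshold
    x2 <- (abs(x1) - abs(zs)) > 0
    # keep outlier values, zero out everything else
    return(x * x2)
}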
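One way to check a function like this is to run it on a small non-square matrix with a planted extreme value. The sketch below is only an illustration: the helper name `row_outliers`, its data-frame return format, and the test matrix are assumptions, not part of either answer.

```r
# Hypothetical helper (name and return format are assumptions)
row_outliers <- function(x, zs) {
  # Row-wise z-scores, same shape as x
  z <- t(apply(x, 1, function(r) (r - mean(r)) / sd(r)))
  # Positions where |z| exceeds the threshold
  pos <- which(abs(z) > zs, arr.ind = TRUE)
  data.frame(row = pos[, "row"], col = pos[, "col"],
             value = x[pos], z = z[pos])
}

# 3 x 6 matrix with one planted extreme value in row 2, column 4
x <- matrix(c(1, 2, 3, 4, 5, 6,
              5, 5, 6, 50, 5, 6,
              9, 8, 7, 6, 5, 4), nrow = 3, byrow = TRUE)

res <- row_outliers(x, 1.5)
res
```

With a threshold of 1.5, only the planted value 50 at row 2, column 4 is reported. Note that with n values per row, a sample z-score can never exceed (n - 1) / sqrt(n), so short rows need a lower threshold than long ones.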


Source: https://stackoverflow.com/questions/28866902/find-outlier-using-z-score
