R: How to detect and fix abnormal values on a plot?

Submitted by 亡梦爱人 on 2019-12-12 18:16:55

Question


I tried to use AnomalyDetectionTs() from library(AnomalyDetection) (see https://github.com/twitter/AnomalyDetection and https://www.r-bloggers.com/anomaly-detection-in-r/) on my data. My example data contain values that swing far more than the surrounding pattern suggests, instead of following the curve as it drops (or drops slowly). The function doesn't work for me: all of the points it flags as anomalies are actually correct, normal values.

This is the result from the function:

My example data: https://raw.githubusercontent.com/ieatbaozi/R-Practicing/master/example.csv

df <- read.csv(url("https://raw.githubusercontent.com/ieatbaozi/R-Practicing/master/example.csv"),header = TRUE,stringsAsFactors = FALSE)
df$DateTime <- as.POSIXct(df$DateTime)

library(AnomalyDetection)
ADtest <- AnomalyDetectionTs(df, max_anoms=0.1, direction='both', plot=TRUE)
ADtest$plot

Here is my expected result. How can I detect those abnormal data points?

How can I fix those values by filling in the most suitable replacements? I want to smooth them so that they plot close to the pattern around them, while the total of all the data stays the same after the fix.

My extra question is: do you have any idea how to find the pattern in the data? I can give you more information. Thank you so much for your help.


Answer 1:


Here is a possible solution.

  1. Compute the mean values for small windows around each point (rolling mean).
  2. Compute the difference between the actual value and the local mean.
  3. Compute the standard deviation for all of the differences from step 2.
  4. Flag as outliers those points that are more than X standard deviations from the local mean.

Using this method, I got the points that you are looking for, together with a few others - points that are in the transition from the very low values to the very high values. You may be able to filter those out.

Code

library(zoo)        ## For rolling mean function

WindowSize = 5
HalfWidth = (WindowSize-1)/2

## rollmean() drops HalfWidth points at each end, so trim df$Val to match
Trimmed = df$Val[-c(1:HalfWidth, (nrow(df)+1-(1:HalfWidth)))]
Diff = rollmean(df$Val, WindowSize) - Trimmed

## Spread of the differences (root mean square about zero)
SD = sqrt(mean(Diff^2))
## Shift by HalfWidth so the indices refer to rows of df
Out = which(abs(Diff) > 2.95*SD) + HalfWidth

plot(df, type="l")
points(df[Out,], pch=16, col="red")
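
The code above only detects the points. For the "fix" part of the question, here is a minimal sketch of one possible approach, not a definitive method: replace each flagged value by linear interpolation from its unflagged neighbours, then rescale the whole series so that its total is unchanged. The helper name fix_outliers() and the toy data are hypothetical, made up for illustration. Proportional rescaling is an assumption too: it shifts every value slightly and breaks down if the interpolated total is zero.

```r
## Hypothetical helper (not part of any package).
## val: numeric vector of values; out: integer indices of flagged points.
fix_outliers <- function(val, out) {
  fixed <- val
  keep <- setdiff(seq_along(val), out)
  ## Linear interpolation across the flagged points, using only kept points;
  ## rule = 2 carries the nearest kept value to flagged points at either end.
  fixed[out] <- approx(keep, val[keep], xout = out, rule = 2)$y
  ## Assumption: preserve the original total by rescaling all values
  fixed * sum(val) / sum(fixed)
}

## Toy data, made up for illustration: a smooth ramp with one spike at index 5
val <- c(10, 11, 12, 13, 40, 15, 16, 17, 18, 19)
fixed <- fix_outliers(val, out = 5)

all.equal(sum(fixed), sum(val))   ## total is preserved up to rounding
```

With the data from the question, the same helper could be applied as df$Val <- fix_outliers(df$Val, Out), after removing from Out any flagged transition points that should be left alone.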



Source: https://stackoverflow.com/questions/44713124/r-how-to-detect-and-fix-abnormal-values-on-plot
