outliers

How to eliminate outliers in Spotfire box plots

旧巷老猫 submitted on 2019-12-30 11:15:11
Question: Thanks for your help in advance. Regards, Raj

Answer 1: Adding the values to the MAX() values would skew the data, even if it were possible. There are two hacks to do this, though. (1) Right-click > Properties > Y-Axis, then set the MIN and MAX range values to something that would exclude all outliers. This is really only suitable for box plots whose values (all percentiles) are close to each other. (2) On your toolbar, click Insert > Calculated Column, choose the correct data table and paste in

How to use Isolation Forest

假如想象 submitted on 2019-12-30 00:53:10
Question: I am trying to detect the outliers in my dataset and I found sklearn's Isolation Forest. I can't understand how to work with it. I fit my training data to it and it gives me back a vector of -1 and 1 values. Can anyone explain to me how it works and provide an example? How can I know that the outliers are 'real' outliers? Tuning parameters? Here is my code:

    clf = IsolationForest(max_samples=10000, random_state=10)
    clf.fit(x_train)
    y_pred_train = clf.predict(x_train)
    y_pred_test = clf
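A minimal sketch of how the -1/1 vector from the question can be read, assuming scikit-learn and NumPy are installed. The data here is synthetic and made up for illustration (200 points around the origin plus one far-away test point); it is not the asker's dataset. `predict` returns 1 for inliers and -1 for outliers, while `decision_function` gives a continuous score where lower values mean "more anomalous".

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (an assumption for illustration only):
# 200 inliers drawn around the origin, and two test points,
# one near the centre and one far away from everything.
rng = np.random.RandomState(10)
x_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
x_test = np.array([[0.1, -0.2], [8.0, 8.0]])

clf = IsolationForest(n_estimators=100, random_state=10)
clf.fit(x_train)

labels = clf.predict(x_test)            # 1 = inlier, -1 = outlier
scores = clf.decision_function(x_test)  # lower = more anomalous, negative = outlier

for point, label, score in zip(x_test, labels, scores):
    print(point, label, round(float(score), 3))
```

Whether a flagged point is a "real" outlier cannot be decided by the model alone. Inspecting the continuous scores rather than only the labels, and adjusting the `contamination` parameter (which moves the cut-off between -1 and 1), are common starting points.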

Label or score outliers in R

孤街浪徒 submitted on 2019-12-26 09:26:44
Question: I'm looking for some easy-to-use algorithms in R to label (outlier or not) or score (say, 7.5) outliers row-wise. That is, I have a matrix m that contains several rows and I want to identify the rows that are outliers compared to the other rows.

    m <- matrix( data = c(1,1,1,0,0,0,1,0,1), ncol = 3 )

To illustrate some more, I want to compare all the (complete) rows in the matrix with each other to spot outliers.

Answer 1: Here's some really simple outlier detection (using either the boxplot
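One possible way to score rows, sketched in Python with only the standard library. This is not the method from the truncated answer above. It is a swapped-in technique: each row is scored by its Euclidean distance from the column-wise median row. The matrix below is the question's `m` written out row by row (R fills matrices column-wise), and the cut-off of 0.5 is an arbitrary, hypothetical choice.

```python
from math import dist
from statistics import median

# The question's matrix m, written row by row.
m = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
]

# "Typical" row: the median of each column.
center = [median(col) for col in zip(*m)]

# Score: Euclidean distance of each row from the typical row.
scores = [dist(row, center) for row in m]

# Label: hypothetical cut-off, chosen only for this illustration.
cutoff = 0.5
labels = [score > cutoff for score in scores]

for row, score, is_outlier in zip(m, scores, labels):
    print(row, score, is_outlier)
```

The score gives a ranking and the cut-off turns it into a label, so the same sketch covers both the "score" and the "label" variants of the question.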

Removing extreme values from a Matrix in MATLAB

会有一股神秘感。 submitted on 2019-12-25 03:18:19
Question: I have a matrix with x-y data points:

    A = [x1 , y1; x2 , y2; x3 , y3]

and I want to remove selected points (rows) whose y value deviates from the average by more than some amount. How can I do this? Thank you, Ron

Answer 1: Here is what you seem to need:

    A(abs(A(:,2)-mean(A(:,2)))>threshold,:) = []

If you want, you can let the threshold be something like 1.234*std(A(:,2))

Answer 2:

    A(A(:,2) > mean(A(:,2)) + ScaleFactor*std(A(:,2)),:) = [];

ScaleFactor will depend on what your criteria are.

Source: https:/
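The same idea as Answer 1, sketched in Python with the standard library for readers who want to try it outside MATLAB. The function name `drop_y_outliers`, the sample points and the scale factor of 1.5 are all made up for illustration. `statistics.stdev` is the sample standard deviation, which is also what MATLAB's `std` computes by default.

```python
from statistics import mean, stdev

def drop_y_outliers(points, scale_factor):
    """Keep only the (x, y) rows whose y lies within
    scale_factor standard deviations of the mean y."""
    ys = [y for _, y in points]
    mu = mean(ys)
    threshold = scale_factor * stdev(ys)
    return [(x, y) for x, y in points if abs(y - mu) <= threshold]

# Hypothetical data: the last point has an extreme y value.
A = [(1, 10), (2, 11), (3, 9), (4, 10), (5, 50)]
cleaned = drop_y_outliers(A, scale_factor=1.5)
print(cleaned)
```

Note that the extreme value itself inflates both the mean and the standard deviation, so a large scale factor can end up keeping the very point it was meant to remove.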

R: iterative outliers detection

妖精的绣舞 submitted on 2019-12-25 02:58:44
Question: I have a data frame (df) as follows:

    V  V1  V2  V3
    1  A   B   32
    1  A   C   33
    1  A   E   43
    1  A   F   22
    1  A   T   53
    1  A   N   54
    1  C   T   44
    1  C   G   11
    1  C   N   31
    1  C   D   53
    1  C   U   75
    1  A   T   53
    1  A   N   54
    2  C   T   42
    2  C   G   14
    2  C   N   35
    2  C   D   23
    2  C   U   56

What I want to do is get the outliers for each combination of (V, V1), and this is easy to achieve with the code I have.

    d <- as.data.table(df)
    # Add a column to keep track of row numbers
    d[, c('row'):= list(seq_len(nrow(d)))]
    # For each group (combination of V and V1), perform the

Identifying Outliers using Quartile Range

。_饼干妹妹 submitted on 2019-12-25 01:13:10
Question: I have a dataframe of numerical values with 22 columns. When I run summary(df) on it, I get the details (min, max, mean, median, 1st and 3rd quartiles). Now I want to get the 1st and 3rd quartiles for each column. Anything above or below them would be an outlier, and I would like to replace the outliers with NA.

Summary:

    Var 1            Var2             Var 3             Var 4
    Min.   : 0       Min.   : 0       Min.   : 0        Min.   : -127.00
    1st Qu.: 1208    1st Qu.: 1150    1st Qu.: 135000   1st Qu.: 98
    Median : 1400    Median : 1300    Median : 180000
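A minimal sketch of the per-column replacement described in the question, in Python with the standard library, using `None` in place of R's `NA`. The helper name `mask_outliers`, the column values and the multiplier `k` are assumptions for illustration. With `k=0`, anything outside the 1st to 3rd quartile range is masked, which is the literal reading of the question. The default `k=1.5` is the usual box plot fence and is a swapped-in convention, not something the question specifies.

```python
from statistics import quantiles

def mask_outliers(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with None."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1  # interquartile range
    low, high = q1 - k * spread, q3 + k * spread
    return [v if low <= v <= high else None for v in values]

# Hypothetical columns, not the asker's data.
columns = {
    "Var 1": [1, 2, 3, 4, 100],
    "Var 2": [10, 11, 12, 13, 14],
}

# Apply the mask independently to every column.
cleaned = {name: mask_outliers(col) for name, col in columns.items()}
print(cleaned)
```

Each column gets its own quartiles, so a value that is extreme in one column does not affect the others.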

Removing matrix rows if values of a column are outliers

空扰寡人 submitted on 2019-12-24 19:27:59
Question: There is a really cool and easy function by @aL3xa here, but that is for a vector. I have a matrix and, say, column 2 is a variable from which I want to chop off the outliers, along with the associated rows. There is a package, outliers, whose algorithms I would like to use, but they seem to be for vectors too. Any suggestions? Thanks

Answer 1: Taking some of the code from the question you linked:

    # @aL3xa's function
    remove_outliers <- function(x, na.rm = TRUE, ...) {
    qnt <- quantile(x, probs=c(.25, .75), na

Replacing values in df using index

匆匆过客 submitted on 2019-12-24 05:55:30
Question: I am trying to detect outliers in my dataframe and replace them with NAs. I have slightly modified the function provided here: How to repeat the Grubbs test and flag the outliers. When I try the function on a vector it works great, but my problem is when I use it on a dataframe. The function detects outliers, but I do not know how to get the results as a dataframe. What I want as a result is my original dataframe with the detected outliers replaced by NAs. This is what I