Question
I am trying to apply the Winsorize() function from the DescTools package to every column of a data.frame using lapply. What I currently have is:
data$col1 <- Winsorize(data$col1)
This replaces the extreme values with values based on quantiles. For example (the affected values are marked with **):
> data$col1
[1] -0.06775798 **-0.55213508** -0.12338265
[4] 0.04928349 **0.47524313** 0.04782829
[7] -0.05070639 **-112.67126382** 0.12657896
[10] -0.12886632
> Winsorize(data$col1)
[1] -0.06775798 **-0.37884540** -0.12338265 0.04928349
[5] **0.26038103** 0.04782829 -0.05070639 **-0.37884540**
[9] 0.12657896 -0.12886632
I have a for loop that does this across all columns of the data.frame (col1, col2, col3, col4). However, I understand that lapply is a better option, so I am trying to use it instead but cannot get it working. If anybody can point me in the right direction, it would be much appreciated.
The data:
data <- structure(list(EQ.TA = c(-0.0677579847115102, -0.552135083517749,
-0.123382654164705, 0.0492834931482554, 0.475243125304193, 0.0478282913638668,
-0.050706389027946, -112.671263815473, 0.126578956975704, -0.128866322940619
), NI.EQ = c(3.64670235329765, 1.66115713369585, 0.209424623633739,
0.340430636358184, -0.248411254566261, -12.1709277350516, 1.06888235737433,
0.0515582237132515, 0.177323118521857, 0.419879195374698), NI.TA = c(-0.24709320230217,
-0.917183132749265, -0.0258393659113752, 0.0167776109344148,
-0.118055740980805, -0.582114677880617, -0.0541991646381309,
-5.80913022585296, 0.0224453753901758, -0.0541082879872031),
TL.TA = c(1.06775798471151, 1.55213508351775, 1.12338265416471,
0.950716506851745, 0.524756874695807, 0.952171708636133,
1.05070638902795, 113.671263815473, 0.873421043024296, 1.12886632294062
)), .Names = c("EQ.TA", "NI.EQ", "NI.TA", "TL.TA"), row.names = c(NA,
10L), class = "data.frame")
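Answer 1:
You can lapply over the whole data.frame and assign the result back with data[], which keeps the object a data.frame instead of turning it into a plain list: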
library(DescTools)
data[] <- lapply(data, Winsorize)
data
# EQ.TA NI.EQ NI.TA TL.TA
#1 -0.06775798 2.75320700 -0.24709320 1.0677580
#2 -0.55213508 1.66115713 -0.91718313 1.5521351
#3 -0.12338265 0.20942462 -0.02583937 1.1233827
#4 0.04928349 0.34043064 0.01677761 0.9507165
#5 0.31834425 -0.24841125 -0.11805574 0.6816558
#6 0.04782829 -6.80579532 -0.58211468 0.9521717
#7 -0.05070639 1.06888236 -0.05419916 1.0507064
#8 -62.21765589 0.05155822 -3.60775403 63.2176559
#9 0.12657896 0.17732312 0.01989488 0.8734210
#10 -0.12886632 0.41987920 -0.05410829 1.1288663
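If it helps to see what is going on, the same pattern can be sketched in base R without DescTools. The helper winsorize_base below is hypothetical (it is not part of DescTools) and assumes the limits are the 5% and 95% quantiles computed with R's default quantile type; those assumptions appear to reproduce the limits in the output above, e.g. -62.21765589 for row 8 of EQ.TA. The toy data.frame df is made up for illustration.

```r
# Hypothetical helper, not part of DescTools: clamp a numeric vector
# to its own lower and upper quantiles (assumed here to be 5% and 95%).
winsorize_base <- function(x, probs = c(0.05, 0.95)) {
  lim <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  # pmax raises values below the lower limit, pmin lowers values above the upper limit
  pmin(pmax(x, lim[1]), lim[2])
}

# Toy data for illustration only
df <- data.frame(a = c(1, 2, 3, 4, 100), b = c(-50, 0, 1, 2, 3))

# df[] replaces the columns in place, so df stays a data.frame
df[] <- lapply(df, winsorize_base)
df
```

The key part is df[] on the left-hand side: lapply returns a list, and assigning into df[] fills the existing columns rather than overwriting df with that list.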
Source: https://stackoverflow.com/questions/50142807/winsorizing-across-all-columns-in-a-data-frame-r-using-lapply