How do I scale a series so that the first number in the series is 0 and the last number is 1? I looked into 'approx' and 'scale', but they do not achieve this objective.
# generate series from exponential distr
s = sort(rexp(100))
# scale/interpolate 's' such that it starts at 0 and ends at 1?
# approx(s)
# scale(s)
It's straightforward to create a small function to do this using basic arithmetic:
s = sort(rexp(100))
range01 <- function(x){(x-min(x))/(max(x)-min(x))}
range01(s)
[1] 0.000000000 0.003338782 0.007572326 0.012192201 0.016055006 0.017161145
[7] 0.019949532 0.023839810 0.024421602 0.027197168 0.029889484 0.033039408
[13] 0.033783376 0.038051265 0.045183382 0.049560233 0.056941611 0.057552543
[19] 0.062674982 0.066001242 0.066420884 0.067689067 0.069247825 0.069432174
[25] 0.070136067 0.076340460 0.078709590 0.080393512 0.085591881 0.087540132
[31] 0.090517295 0.091026499 0.091251213 0.099218526 0.103236344 0.105724733
[37] 0.107495340 0.113332392 0.116103438 0.124050331 0.125596034 0.126599323
[43] 0.127154661 0.133392300 0.134258532 0.138253452 0.141933433 0.146748798
[49] 0.147490227 0.149960293 0.153126478 0.154275371 0.167701855 0.170160948
[55] 0.180313542 0.181834891 0.182554291 0.189188137 0.193807559 0.195903010
[61] 0.208902645 0.211308713 0.232942314 0.236135220 0.251950116 0.260816843
[67] 0.284090255 0.284150541 0.288498370 0.295515143 0.299408623 0.301264703
[73] 0.306817872 0.307853369 0.324882091 0.353241217 0.366800517 0.389474449
[79] 0.398838576 0.404266315 0.408936260 0.409198619 0.415165553 0.433960390
[85] 0.440690262 0.458692639 0.464027428 0.474214070 0.517224262 0.538532221
[91] 0.544911543 0.559945121 0.585390414 0.647030109 0.694095422 0.708385079
[97] 0.736486707 0.787250428 0.870874773 1.000000000
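The arithmetic above divides by zero when all values are equal and returns all NA when the input contains a missing value. A minimal sketch of a guarded variant follows. The name range01_safe and the choice to return 0 for a constant series are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical NA- and constant-safe variant of range01 (name is illustrative).
range01_safe <- function(x) {
  rng <- range(x, na.rm = TRUE)   # c(min, max), ignoring NA
  span <- rng[2] - rng[1]
  if (span == 0) {
    # All non-missing values are equal: avoid 0/0 and return zeros,
    # keeping NA where the input was NA (an arbitrary convention).
    return(ifelse(is.na(x), NA_real_, 0))
  }
  (x - rng[1]) / span
}

set.seed(1)
s <- sort(rexp(100))
r <- range01_safe(c(s, NA))   # the trailing NA stays NA
range(r, na.rm = TRUE)        # smallest value maps to 0, largest to 1
```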
The scales package has a function that will do this for you: rescale.
library("scales")
rescale(s)
By default, this scales the given range of s onto 0 to 1, but either or both of those can be adjusted. For example, if you wanted it scaled from 0 to 10,
rescale(s, to=c(0,10))
or if you wanted the largest value of s scaled to 1, but 0 (instead of the smallest value of s) scaled to 0, you could use
rescale(s, from=c(0, max(s)))
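To make the roles of the to and from arguments concrete, here is a base-R sketch of the same linear mapping. The helper rescale_base is a hypothetical name used for illustration. It mirrors the behaviour described above and is not claimed to be the scales package's actual implementation.

```r
# Illustrative base-R helper (hypothetical name): maps the interval `from`
# linearly onto the interval `to`.
rescale_base <- function(x, to = c(0, 1), from = range(x, na.rm = TRUE)) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

set.seed(1)
s <- sort(rexp(100))
range(rescale_base(s))                      # smallest -> 0, largest -> 1
range(rescale_base(s, to = c(0, 10)))       # smallest -> 0, largest -> 10
max(rescale_base(s, from = c(0, max(s))))   # largest -> 1; a value of 0 would map to 0
```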
Alternatively:
scale(x,center=min(x),scale=diff(range(x)))
(untested)
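A quick check of the scale() call above, as a sketch: as far as I can tell it does map the minimum to 0 and the maximum to 1, but note that scale() returns a one-column matrix with extra attributes rather than a plain vector. The as.vector() step is an optional addition.

```r
set.seed(1)
x <- sort(rexp(100))

# Subtract the minimum, then divide by the width of the range.
m <- scale(x, center = min(x), scale = diff(range(x)))
is.matrix(m)        # scale() returns a one-column matrix

v <- as.vector(m)   # drop the matrix structure and the scaling attributes
range(v)            # minimum maps to 0, maximum to 1
```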
This should do it:
reshape::rescaler.default(s, type = "range")
EDIT
I was curious about the performance of the two methods
> system.time(replicate(100, range01(s)))
user system elapsed
0.56 0.12 0.69
> system.time(replicate(100, reshape::rescaler.default(s, type = "range")))
user system elapsed
0.53 0.18 0.70
Extracting the raw code from reshape::rescaler.default:
range02 <- function(x) {
(x - min(x, na.rm=TRUE)) / diff(range(x, na.rm=TRUE))
}
> system.time(replicate(100, range02(s)))
user system elapsed
0.56 0.12 0.68
This yields a similar result.
You can also use the caret package, whose preProcess function makes this simple:
preProcValues <- preProcess(yourData, method = "range")
dataScaled <- predict(preProcValues, yourData)
See the package help for more details.
I created the following function in R:
ReScale <- function(x,first,last){(last-first)/(max(x)-min(x))*(x-min(x))+first}
Here, first is the start point and last is the end point of the target range.
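A short usage sketch for the function above, assuming a sorted exponential sample like the one in the question. The seed and the second target range are arbitrary choices for illustration.

```r
ReScale <- function(x, first, last) {
  # Stretch x to width (last - first), then shift so min(x) lands on first.
  (last - first) / (max(x) - min(x)) * (x - min(x)) + first
}

set.seed(1)
s <- sort(rexp(100))
range(ReScale(s, 0, 1))    # series now starts at 0 and ends at 1
range(ReScale(s, -1, 1))   # same series mapped onto -1 to 1
```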
Source: https://stackoverflow.com/questions/5468280/scale-a-series-between-two-points