Finding number of elements in one vector that are less than an element in another vector

Deadly posted on 2019-11-28 13:53:39

findInterval requires a to be sorted in non-decreasing order, so sort it first and then use findInterval:

a <- sort(a)
## gives points less than or equal to b[i]
findInterval(b, a)
# [1] 1 3 3 4 5
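## for strictly less than, subtract a small tolerance from b
## the tolerance here is sqrt(.Machine$double.eps); .Machine$double.eps itself
## is the smallest x for which 1 + x != 1
## caveat: values of a lying within that tolerance below b[i] are not counted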
findInterval(b - sqrt(.Machine$double.eps), a)
# [1] 0 1 3 4 4
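
As an alternative to subtracting a tolerance, findInterval has a left.open argument (in R 3.3.0 or later, as far as I know) that changes the intervals from [a[i], a[i+1]) to (a[i], a[i+1]], which gives a strict "less than" count. The following is only a sketch: the vectors a and b are hypothetical, chosen so that the results match the outputs shown above, since the original data is not given here.

```r
# Hypothetical data (not from the original question), picked so the
# results reproduce the outputs quoted above.
a <- sort(c(1, 2, 2, 3, 4))
b <- c(1, 2, 2.5, 3.5, 4)

# Default intervals are closed on the left: counts elements of a <= b[i]
n_le <- findInterval(b, a)

# left.open = TRUE closes the intervals on the right instead:
# counts elements of a strictly less than b[i], with no tolerance needed
n_lt <- findInterval(b, a, left.open = TRUE)

print(n_le)
print(n_lt)
```

This avoids the tolerance caveat, because no value of b is shifted before the lookup.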

If you're really optimising this process for large N, then you may want to remove duplicate values in b at least initially, and then you can sort and match:

bu <- sort(unique(b))
ab <- sort(c(a, bu))
ind <- match(bu, ab)
nbelow <- ind - seq_along(bu)  # seq_along is also safe when bu is empty

Because the a and bu values are merged into ab, the position of each bu value in ab counts every a below it plus every bu value up to and including itself. That is why the final line subtracts the cumulative count of bu values. I suspect this may be faster for large sets; it should be if match is internally optimised for sorted input, which one would hope is the case. It should then be a trivial matter to map nbelow back to your original set of b values.
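
One way to do that mapping back is to look up each original b in bu and index nbelow with the result. This is a sketch under assumptions: the data are hypothetical, and nbelow_b is a name introduced here for illustration.

```r
# Hypothetical data: b is unsorted and contains a duplicate
a <- c(1, 2, 2, 3, 4)
b <- c(4, 2, 2.5, 2, 1)

bu <- sort(unique(b))            # distinct b values, ascending
ab <- sort(c(a, bu))             # merged, sorted values
ind <- match(bu, ab)             # first position of each bu value in ab
nbelow <- ind - seq_along(bu)    # number of a strictly below each bu value

# Map the per-unique-value counts back onto the original order of b
nbelow_b <- nbelow[match(b, bu)]
print(nbelow_b)
```

Because match returns the first position of a value, elements of a that tie with a b value are not counted, so the result is a strict "less than" count.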

I don't claim this is "the best way", but it's a way. sapply applies the (anonymous) function to each element of b.

 sapply(b, function(x) sum(a < x))
 # [1] 0 1 3 4 4
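
Two variants of the same brute-force count can be sketched as follows, using hypothetical data chosen to reproduce the output above. vapply is like sapply but declares the result type, and outer builds the full comparison matrix in one call, at the cost of length(a) * length(b) memory.

```r
# Hypothetical data (not from the original question)
a <- c(1, 2, 2, 3, 4)
b <- c(1, 2, 2.5, 3.5, 4)

# Same idea as sapply, but the result is guaranteed to be one integer per b
n_vapply <- vapply(b, function(x) sum(a < x), integer(1))

# outer(a, b, "<")[i, j] is TRUE when a[i] < b[j]; column sums give the counts
n_outer <- colSums(outer(a, b, "<"))

print(n_vapply)
print(n_outer)
```

Both do work proportional to length(a) * length(b), so they suit small inputs, while findInterval on a sorted a is likely the better choice for large ones.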