Redis or Mongo for determining if a number falls within ranges?

前端 未结 3 1848
余生分开走
余生分开走 2020-12-15 01:26

I need a way to quickly check if an IP address falls into one of many forbidden IP ranges.

I currently use iptables to check if the IP falls into a specified range.

相关标签:
3条回答
  • 2020-12-15 02:11

    Contrary to the previous poster, I don't think you can get O(log n) complexity by using naive indexing. Let's consider mongodb for instance. You could define two indexes (for start and end properties of the ranges), but mongodb will only use one to solve a given query. So it will not work. Now if you use a single compound index involving both start and end properties of the ranges, the complexity will be logarithmic to find the first range to check, but then it will get linear to find the last range matching the query. The worst case complexity is O(n), and you have it when all the stored ranges overlap your input.

    On a side note, using Redis sorted set you can emulate a sorted index (with O(log n) complexity) if you know what you do. Redis is a bit more than a simple key-value store. Redis sorted sets are implemented using a skip list, and both the score and value are used to compare items.

    To solve this kind of problem, a dedicated indexing structure is needed. You may want to have a look at:

    http://en.wikipedia.org/wiki/Segment_tree or http://en.wikipedia.org/wiki/Interval_tree

    If the concern is speed over space, then it may be interesting to flatten the index. For instance, let's consider the following ranges (using only integers to simplify the discussion):

    A 2-8
    B 4-6
    C 2-9
    D 7-10
    

    A sparse structure indexing non overlapping segments can be built.

    0  []
    2  [A C]
    4  [A C B]
    7  [A C D]
    9  [C D]
    10 [D]
    11 []
    

    Each entry contains the lower bound of a non overlapping segment as the key, and the list or set of matching ranges as a value. Entries should be indexed using a sorted container (tree, skip list, btree, etc ...)

    To find the ranges matching 5, we look for the first entry which is lower or equal to 5 (it will be 4 in this example) and the list of ranges is provided ( [A C B] )

    With this data structure, the complexity of the queries is really O(log n). However it is not trivial (and expensive) to build and maintain it. It can be implemented with both mongodb and Redis.

    Here is an example with Redis:

    > rpush range:2 2-8 2-9
    (integer) 2
    > rpush range:4 2-8 2-9 4-6
    (integer) 3
    > rpush range:7 2-8 2-9 7-10
    (integer) 3
    > rpush range:9 2-9 7-10
    (integer) 2
    > rpush range:10 7-10
    (integer) 1
    > zadd range_index 0 range:0 2 range:2 4 range:4 7 range:7 9 range:9 10 range:10
    (integer) 6
    > zrevrangebyscore range_index 5 0 LIMIT 0 1
    1) "range:4"
    > lrange range:4 0 -1
    1) "2-8"
    2) "2-9"
    3) "4-6"
    
    0 讨论(0)
  • 2020-12-15 02:13

    If you are dealing with ranges, mongodb will be better than redis for that. If you are dealing with specific IP addresses, redis will be the best choice.

    Why?

    Mongodb can build indexes on the start and end ranges of ip addresses which you can query in O(log n) time whereas redis is simply a key value store.

    You could use a redis hash if every single ip that you wanted to check against was in a hash table and look them up in O(1) time but because you are using ranges I would say mongo is your best bet. I don't think you are going to get better than log n time with any of the data structures in the redis toolkit. O(n) with sorted set or list maybe but not O(log n).

    0 讨论(0)
  • 2020-12-15 02:23

    Today we can use a Bloom filter in redis.

    0 讨论(0)
提交回复
热议问题