Is {Filter}ing faster than {Query}ing in Lucene?

囚心锁ツ 2020-12-13 15:15

While reading "Lucene in Action, 2nd edition" I came across the description of the Filter classes, which can be used for result filtering in Lucene. Lucene ha

4 answers
  •  谎友^ (original poster)
    2020-12-13 16:09

    In contrast to Dennis' answer: no, you probably don't want to use a filter unless you're going to reuse the same query multiple times.

    A NumericRangeFilter is just a subclass of MultiTermQueryWrapperFilter, which means that essentially it does something like this:

    for each document i in index:
       if document i matches query:
          match[i] = 1
       else:
          match[i] = 0
    

    So it will run in time linear in the number of documents in your index, instead of logarithmic time like a normal query.

    Additionally, the filter will take up more memory (one bit for every doc in your index).
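    To make the cost concrete, here is a minimal sketch of the loop described above in plain Java. It is not Lucene's actual code: the class FilterScanSketch, the method buildFilterBits, and the toy "index" (document i holds the value i * 10) are all assumptions made for illustration. It only shows why building the bit set touches every document and needs about one bit per document.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class FilterScanSketch {
    // Hypothetical helper: visits every document id once and records a match bit.
    public static BitSet buildFilterBits(int maxDoc, IntPredicate matches) {
        BitSet bits = new BitSet(maxDoc); // roughly one bit per document
        for (int docId = 0; docId < maxDoc; docId++) { // linear in index size
            if (matches.test(docId)) {
                bits.set(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Stand-in "index": document i holds the numeric value i * 10.
        int maxDoc = 8;
        // Stand-in range query: 20 <= value <= 50.
        BitSet bits = buildFilterBits(maxDoc, docId -> docId * 10 >= 20 && docId * 10 <= 50);
        System.out.println(bits); // prints {2, 3, 4, 5}
    }
}
```

    The work done by the loop depends on maxDoc, not on how many documents match, which is the point being made about one-off queries.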

    If you're going to be using the same query over and over again, then it's probably worth it to you to pay the performance/memory hit once and have later usages be faster. But if it's a one-off query, it's almost certainly not worth it.

    (Also, if you're going to reuse it, use a CachingWrapperFilter so that the filter is cached.)
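    The pay-once, reuse-later idea can be sketched in plain Java as well. This is only an illustration of the caching trade-off, not how CachingWrapperFilter is implemented: the class CachedFilterSketch, its string cache key, and the scan counter are hypothetical names introduced here.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

public class CachedFilterSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    private int scans = 0; // number of full index scans actually performed

    // Returns the cached bit set for this key, scanning the index only on a miss.
    public BitSet bitsFor(String key, int maxDoc, IntPredicate matches) {
        return cache.computeIfAbsent(key, k -> {
            scans++;
            BitSet bits = new BitSet(maxDoc);
            for (int docId = 0; docId < maxDoc; docId++) {
                if (matches.test(docId)) {
                    bits.set(docId);
                }
            }
            return bits;
        });
    }

    public int scanCount() {
        return scans;
    }

    public static void main(String[] args) {
        CachedFilterSketch filters = new CachedFilterSketch();
        IntPredicate evenDocs = docId -> docId % 2 == 0;
        filters.bitsFor("even", 6, evenDocs); // first use pays for the scan
        filters.bitsFor("even", 6, evenDocs); // second use is a cache hit
        System.out.println(filters.scanCount()); // prints 1
    }
}
```

    The first call pays the linear scan and keeps the bits in memory; every later call with the same key skips the scan, which is why caching only helps when the same filter is actually reused.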
