Oracle: Full text search with condition

栀梦 2020-12-13 16:27

I've created an Oracle Text index like the following:

create index my_idx on my_table (text) indextype is ctxsys.context; 

And I can then

4 Answers
  •  不知归路
    2020-12-13 16:42

    Oracle Text

    1 - You can improve performance by creating the CONTEXT index with FILTER BY:

    create index my_idx on my_table(text) indextype is ctxsys.context filter by group_id;
    

    In my tests, FILTER BY definitely improved performance, but it was still slightly faster to use a plain btree index on group_id.
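
    As a rough sketch of how the two variants would be queried (the search term 'blah' and the value 43 are reused from the CATSEARCH example further down; the btree index name my_table_group_idx is made up for illustration):

    ```sql
    -- With the FILTER BY index in place, the group_id predicate may be
    -- evaluated inside the text index instead of as a separate step.
    select *
      from my_table
     where contains(text, 'blah') > 0
       and group_id = 43;

    -- The alternative compared above: a plain CONTEXT index plus a btree
    -- index on the filter column (index name is hypothetical).
    create index my_table_group_idx on my_table (group_id);
    ```

    Whether the optimizer actually pushes the predicate into the text index depends on the version and statistics, so it is worth checking the plan on your own data.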

    2 - CTXCAT indexes use "sub-indexes", and seem to work similarly to a multi-column index. This seems to be the option (4) you're looking for:

    begin
      ctx_ddl.create_index_set('my_table_index_set');
      ctx_ddl.add_index('my_table_index_set', 'group_id');
    end;
    /
    
    create index my_idx2 on my_table(text) indextype is ctxsys.ctxcat
        parameters('index set my_table_index_set');
    
    select * from my_table where catsearch(text, 'blah', 'group_id = 43') > 0;
    

    This is likely the fastest approach. Using the above query against 120MB of random text similar to your A and B scenario required only 18 consistent gets. But on the downside, creating the CTXCAT index took almost 11 minutes and used 1.8GB of space.
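
    If you want to reproduce this kind of measurement yourself, something along these lines should work in SQL*Plus. This is only a sketch: it assumes your user is allowed to use autotrace, and that the index's internal segments follow the DR$<index name>$... naming pattern.

    ```sql
    -- Show execution statistics (including consistent gets) without
    -- printing the result rows.
    set autotrace traceonly statistics

    select * from my_table where catsearch(text, 'blah', 'group_id = 43') > 0;

    set autotrace off

    -- Approximate space used by the CTXCAT index, in MB.
    -- Assumption: its segments are named DR$MY_IDX2$...
    select sum(bytes) / 1024 / 1024 as size_mb
      from user_segments
     where segment_name like 'DR$MY_IDX2$%';
    ```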

    (Note: Oracle Text seems to work correctly here, but I'm not familiar with Text and I can't guarantee this isn't an inappropriate use of these indexes, as @NullUserException said.)

    Multi-column indexes vs. index joins

    For the situation you describe in your edit, normally there would not be a significant difference between using an index on (A,B) and joining separate indexes on A and B. I built some tests with data similar to what you described and an index join required only 7 consistent gets versus 2 consistent gets for the multi-column index.

    The reason is that Oracle retrieves data in blocks. A block is usually 8K, and an index block is already sorted, so the 500 to 2000 values will probably fit in a few blocks. If you're worried about performance, the IO to read and write blocks is usually the only thing that matters. Joining a few thousand rows costs an inconsequential amount of CPU time.
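
    A comparison like this can be set up roughly as follows. The table t, its columns, and the index names are hypothetical stand-ins for your A and B columns, and the hints are only requests that the optimizer may ignore:

    ```sql
    -- Hypothetical table and index names, for illustration only.
    create table t (a number, b number, payload varchar2(100));

    create index t_ab_idx on t (a, b);   -- multi-column index
    create index t_a_idx  on t (a);      -- separate single-column indexes
    create index t_b_idx  on t (b);

    -- Ask for the multi-column index.
    select /*+ index(t t_ab_idx) */ a, b
      from t
     where a = 1 and b = 2;

    -- Ask for an index join of the two single-column indexes.
    select /*+ index_join(t t_a_idx t_b_idx) */ a, b
      from t
     where a = 1 and b = 2;
    ```

    Running each query with autotrace enabled shows the consistent gets for both access paths.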

    However, this doesn't apply to Oracle Text indexes. You can join a CONTEXT index with a btree index (a "bitmap and"?), but the performance is poor.
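
    To see how Oracle combines the two indexes on your system, you can look at the execution plan. A minimal sketch, assuming the plain CONTEXT index from the question plus a btree index on group_id:

    ```sql
    explain plan for
    select *
      from my_table
     where contains(text, 'blah') > 0
       and group_id = 43;

    -- Display the plan that was just explained.
    select * from table(dbms_xplan.display);
    ```

    The exact operations shown will depend on the Oracle version and the statistics, so treat the plan as something to verify rather than assume.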
