Need algorithm for fast storage and retrieval (search) of sets and subsets

遇见更好的自我 2021-02-06 11:57

I need a way of storing sets of arbitrary size for fast querying later on. I'll need to query the resulting data structure for subsets, or for sets that are already stored.

5 answers
  •  Happy的楠姐
    2021-02-06 12:17

    If I understand your needs correctly, you need a data structure that stores multiple states and supports retrieval by combinations of those states.

    If the states are binary (as in your examples: has milk/doesn't have milk, has sugar/doesn't have sugar), or can be converted to binary (possibly by adding more states), then there is a very fast technique for your purpose: bitmap indices.

    Bitmap indices can do such comparisons in memory, and very little can match them on speed, since ANDing bits is among the fastest things a computer can do.
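
    To make the idea concrete, here is a minimal sketch of a bitmap index over stored sets. It is only an illustration under some assumptions: the class name `BitmapIndex` and its methods are made up for this example, Python integers stand in for the bit vectors, and no compression is applied. Each element gets one bitmap in which bit i is set when stored set number i contains that element, so "which stored sets contain all of these elements" becomes a chain of AND operations.

```python
class BitmapIndex:
    """Toy bitmap index (hypothetical helper, not a library API).

    One Python int per element acts as a bit vector over row ids:
    bit i is set when stored set number i contains that element.
    """

    def __init__(self):
        self.sets = []      # stored sets; list position is the row id
        self.bitmaps = {}   # element -> int used as a bit vector

    def add(self, items):
        """Store a set and return its row id."""
        row = len(self.sets)
        self.sets.append(frozenset(items))
        for item in self.sets[row]:
            self.bitmaps[item] = self.bitmaps.get(item, 0) | (1 << row)
        return row

    def supersets_of(self, query):
        """Row ids of stored sets that contain every element of query."""
        result = (1 << len(self.sets)) - 1        # start with all rows set
        for item in query:
            result &= self.bitmaps.get(item, 0)   # unknown element: no rows
        return [row for row in range(len(self.sets)) if (result >> row) & 1]

    def contains(self, query):
        """True if exactly this set has been stored."""
        query = frozenset(query)
        return any(self.sets[row] == query for row in self.supersets_of(query))


index = BitmapIndex()
index.add({"milk", "sugar"})    # row 0
index.add({"milk"})             # row 1
index.add({"sugar", "cream"})   # row 2

print(index.supersets_of({"milk"}))   # rows of stored sets containing "milk"
print(index.contains({"sugar"}))      # whether {"sugar"} itself was stored
```

    The exact-match check here simply filters the superset candidates; a real system would more likely keep a separate hash lookup for that and use the bitmaps only for the combination queries.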

    http://en.wikipedia.org/wiki/Bitmap_index

    Here's the link to the original work on this simple but amazing data structure: http://www.sciencedirect.com/science/article/pii/0306457385901086

    Several SQL databases support bitmap indexing, and there are a number of possible optimizations for it as well (compression, etc.):

    MS SQL: http://technet.microsoft.com/en-us/library/bb522541(v=sql.105).aspx

    Oracle: http://www.orafaq.com/wiki/Bitmap_index

    Edit: Apparently the original research work on bitmap indices is no longer available for free public access.
    Links to recent literature on this subject:

    • Bitmap Index Design Choices and Their Performance Implications
    • Bitmap Index Design and Evaluation
    • Compressing Bitmap Indexes for Faster Search Operations
