boost multi_index_container search for records that fall within intervals defined by two fields

问题

Consider the following table:

Search for records that have x, where F1 <= x && x <= F2. For example, searching for records with x = 10 would yield records with id's 0,1,3,7.

How do you implement this in C++ using boost multi_index_container to avoid linear searching?

回答1:

You're trying to intersect intervals. Why don't you use a representation that fits the domain? See e.g. http://www.boost.org/doc/libs/1_60_0/libs/icl/doc/html/index.html

One other way to do it would seem to use a geometrical query. You could represent the intervals as line segments (or boxes, for more dimensions) and query them. See e.g. http://www.boost.org/doc/libs/1_60_0/libs/geometry/doc/html/geometry/reference/spatial_indexes.html

Live On Coliru

@cv_and_he: Yes, that is what I meant with the rtree approach:

#include <iostream>
#include <initializer_list>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/geometries.hpp>

#include <boost/range/algorithm.hpp>
#include <boost/range/adaptor/map.hpp>

struct record {
    int id;
    int f1, f2;
};

namespace bg  = boost::geometry;
namespace bgi = boost::geometry::index;
typedef bg::model::point<int, 2, bg::cs::cartesian> point;
typedef bg::model::segment<point> segment;

struct interval_finder
{
    typedef std::pair<segment, int> value;
    bgi::rtree<value, bgi::quadratic<16> > rt;

    interval_finder(std::initializer_list<record> l) {
        for(const record& rec : l)
            rt.insert(value { {point(rec.f1), point(rec.f2)}, rec.id});
    }

    auto find(int val) const {
        return boost::copy_range<std::vector<int> >(rt 
                | bgi::adaptors::queried(bgi::intersects(point(val))) 
                | boost::adaptors::map_values
            );
    }
};

int main() {
    interval_finder finder{
        { 0, 0,  10 },
        { 1, 5,  20 },
        { 2, 20, 30 },
        { 3, 8,  13 },
        { 4, 13, 17 },
        { 5, 50, 65 },
        { 6, 15, 26 },
        { 7, 8,  15 }
    };

    for (auto& p : finder.rt)
        std::cout << p.second << "\t" << bg::wkt(p.first) << "\n";

    for(auto range : finder.find(10))
        std::cout << range << " ";
}

For demonstration purposes, I print the elements of the index, so you can understand how it represents the intervals as segments:

0   LINESTRING(0 0,10 0)
1   LINESTRING(5 0,20 0)
2   LINESTRING(20 0,30 0)
3   LINESTRING(8 0,13 0)
4   LINESTRING(13 0,17 0)
5   LINESTRING(50 0,65 0)
6   LINESTRING(15 0,26 0)
7   LINESTRING(8 0,15 0)
0 1 3 7

回答2:

You're going to need a lot of data to make this worthwhile, but I think this is what you're asking for:

#include <utility>
#include <vector>
#include <iostream>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/indexed_by.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/tag.hpp>
#include <boost/multi_index/member.hpp>

struct record
{
    int id;
    int f1, f2;
};

struct record_fs_extractor
{
    using result_type = std::pair<int, int>;

    constexpr result_type operator ()(record const& r) const noexcept
    {
        return {r.f1, r.f2};
    }
};

namespace bmi = boost::multi_index;
using records_t = bmi::multi_index_container<
    record,
    bmi::indexed_by<
        bmi::ordered_unique<bmi::member<record, int, &record::id>>,
        bmi::ordered_non_unique<bmi::tag<struct record_fs_tag>, record_fs_extractor>
    >
>;

std::vector<int> find_ids(records_t const& records, int const x)
{
    // second index's interface is like std::map<std::pair<int, int>, record>,
    // except that duplicate keys are allowed           f1--^    ^--f2
    auto const& f_idx = records.get<record_fs_tag>();
    auto it = f_idx.lower_bound(std::make_pair(f_idx.cbegin()->f1, x));
    auto const it_end = f_idx.cend();

    std::vector<int> ret;
    while (it != it_end && it->f1 <= x)
    {
        if (x <= it->f2)
            ret.push_back(it++->id);
        else
            it = f_idx.lower_bound(std::make_pair(it->f1, x));
    }
    return ret;
}

int main()
{
    records_t const records
    {
        { 0, 0,  10 },
        { 1, 5,  20 },
        { 2, 20, 30 },
        { 3, 8,  13 },
        { 4, 13, 17 },
        { 5, 50, 65 },
        { 6, 15, 26 },
        { 7, 8,  15 }
    };

    // default index's interface is like std::map<int, record>
    std::cout << "all, ordered by id:\n"; //   id--^
    for (auto const& r : records)
        std::cout << '\t' << r.id << ": " << r.f1 << ", " << r.f2 << '\n';

    std::cout << "\nreturned by find_ids(10):";
    for (auto const& id : find_ids(records, 10))
        std::cout << ' ' << id;
    std::cout << '\n';
}

Online Demo

Output:

all, ordered by id:
        0: 0, 10
        1: 5, 20
        2: 20, 30
        3: 8, 13
        4: 13, 17
        5: 50, 65
        6: 15, 26
        7: 8, 15

returned by find_ids(10): 0 1 3 7

IFF finding the first record happens to be so much of a bottleneck that it's worth the memory-cost of a third index, then here is an alternative implementation to address that.

来源：https://stackoverflow.com/questions/34678113/boost-multi-index-container-search-for-records-that-fall-within-intervals-define

标签

c++

boost

boost-multi-index