Using boost multi index like relational DB

邮差的信 提交于 2019-11-29 12:47:30
sehe

I've simplified your (ridiculously complicated¹) model to:

enum TimePoints { // Lets assume t1 > t2 > t3 > t4
    t1 = 100,
    t2 = 80,
    t3 = 70,
    t4 = 20,
};

using IdType = std::string;
using Symbol = std::string;
using TimeT  = unsigned int;

struct tickerUpdateInfo {
    IdType m_id;
    Symbol m_symbol;
    TimeT  m_last_update_time;

    friend std::ostream& operator<<(std::ostream& os, tickerUpdateInfo const& tui) {
        return os << "T[" << tui.m_id << ",\t" << tui.m_symbol << ",\t" << tui.m_last_update_time << "]";
    }
} static const data[] = {
    { "CBT.151.5.T.FEED", "S1", t1 },
    { "CBT.151.5.T.FEED", "s2", t2 },
    { "CBT.151.5.T.FEED", "s3", t3 },
    { "CBT.151.5.T.FEED", "s4", t4 },
    { "CBT.151.5.T.FEED", "s5", t1 },
    { "CBT.151.8.T.FEED", "s7", t1 },
    { "CBT.151.5.Q.FEED", "s8", t3 },
};

There. We can work with that. You want an index that's primarily time based, yet you can refine for symbol/id later:

typedef bmi::multi_index_container<tickerUpdateInfo,
    bmi::indexed_by<
        bmi::ordered_non_unique<bmi::tag<struct most_active_index>,
            bmi::composite_key<tickerUpdateInfo,
                BOOST_MULTI_INDEX_MEMBER(tickerUpdateInfo, TimeT,  m_last_update_time),
                BOOST_MULTI_INDEX_MEMBER(tickerUpdateInfo, Symbol, m_symbol),
                BOOST_MULTI_INDEX_MEMBER(tickerUpdateInfo, IdType, m_id)
        > > >
    > ticker_update_info_set;

For our implementation, we don't even need to use the secondary key components, we can just write

std::map<Symbol, size_t> activity_histo(ticker_update_info_set const& tuis, TimeT since)
{
    std::map<Symbol, size_t> histo;
    auto const& index = tuis.get<most_active_index>();

    auto lb = index.upper_bound(since); // for greater-than-inclusive use lower_bound
    for (auto& rec : boost::make_iterator_range(lb, index.end()))
        histo[rec.m_symbol]++;

    return histo;
}

See it Live On Coliru.

Now if volumes get large, you could be tempted to optimize a bit using the secondary index component:

std::map<Symbol, size_t> activity_histo_ex(ticker_update_info_set const& tuis, TimeT since)
{
    std::map<Symbol, size_t> histo;
    auto const& index = tuis.get<most_active_index>();

    for (auto lb = index.upper_bound(since), end = tuis.end(); lb != end;) // for greater-than-inclusive use lower_bound
    {
        auto ub = index.upper_bound(boost::make_tuple(lb->m_last_update_time, lb->m_symbol));
        histo[lb->m_symbol] += std::distance(lb, ub);

        lb = ub;
    }

    return histo;
}

I'm not sure this would become the quicker approach (your profiler would know). See it Live On Coliru too.

Rethink the design?

TBH this whole multi index thing is likely to slow you down due to suboptimal insertion times and lack of locality-of-reference when iterating records.

I'd suggest looking at

  • a single flat_multimap ordered by update-time
  • or even a (fixed size) linear ring-buffer order by time. This would make a lot of sense since you are most likely receiving the events in increasing time order anyways, so you can just keep appending at the end (and wrap around when the history window is full). This all at once removes all need for reallocation (given that you choose an appropriate maximum capacity for the ringbuffer) as well as give you optimal cache prefetch performance traversing the list for stats.

The second approach should really get some merit once you implement the ringbuffer using Boost Lockfree's spsc_queue offering. Why? Because you can host it in shared memory:

Shared-memory IPC synchronization (lock-free)


¹ the complexity would be warranted iff your code would have been selfcontained. Sadly, it was not (at all). I had to prune it in order to get something to work. This was, obviously, after removing all line numbers :)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!