Sum aggregation for each column in Cassandra

野性不改 2020-12-22 06:44

I have a data model like the one below:

CREATE TABLE appstat.nodedata (
    nodeip text,
    timestamp timestamp,
    flashmode text,
    physicalusage int,
    read         


        
2 answers
  • 2020-12-22 07:24

    Disclaimer: your question should state how fast the query needs to be. Readers can't tell whether you want to show this in real time or use it for analytical purposes. It's also unclear how much data you're working with, and the answer may depend on that.

    First, decide whether you want to aggregate on read or on write. This largely depends on your read/write patterns.

    1) First question (aggregation on read): The short answer is no, it's not possible. If you want to use Cassandra for this, the best approach is to aggregate in your application, reading each nodeip with a timestamp restriction. That will be slow, but Cassandra's built-in aggregations are potentially slow too. This warning exists for a reason:

    Warnings :
    Aggregation query used without partition key
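To illustrate the application-side aggregation described above, here is a minimal sketch in Python. It assumes the rows have already been fetched (for example, one query per nodeip with a timestamp range) and arrive as dict-like objects. Only `nodeip`, `timestamp` and `physicalusage` come from the table in the question; the function name `sum_columns` and the sample rows are made up for illustration, and the driver call itself is deliberately left out.

```python
from collections import defaultdict
from datetime import datetime


def sum_columns(rows, columns, start, end):
    """Sum the given numeric columns per nodeip for rows with start <= timestamp < end."""
    # One zero-initialised totals dict per nodeip, created on first use.
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for row in rows:
        if start <= row["timestamp"] < end:
            node_totals = totals[row["nodeip"]]
            for col in columns:
                value = row.get(col)
                if value is not None:  # skip null cells
                    node_totals[col] += value
    return dict(totals)


# Hypothetical stand-in for rows returned by the driver.
rows = [
    {"nodeip": "10.0.0.1", "timestamp": datetime(2020, 12, 22, 6, 0), "physicalusage": 10},
    {"nodeip": "10.0.0.1", "timestamp": datetime(2020, 12, 22, 6, 5), "physicalusage": 15},
    {"nodeip": "10.0.0.2", "timestamp": datetime(2020, 12, 22, 6, 0), "physicalusage": 7},
    {"nodeip": "10.0.0.2", "timestamp": datetime(2020, 12, 22, 9, 0), "physicalusage": 100},
]

totals = sum_columns(
    rows,
    ["physicalusage"],
    start=datetime(2020, 12, 22, 6, 0),
    end=datetime(2020, 12, 22, 7, 0),
)
print(totals)
```

In a real setup the timestamp filter would normally be pushed into the query itself so that less data crosses the network; it is applied in Python here only to keep the sketch self-contained.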
    

    I found the C++ Cassandra driver to be the fastest option, if you're into that.

    If your data size allows, I'd look into other databases. Regular old MySQL or Postgres will do the job just fine unless you have terabytes of data. There's also InfluxDB if you want something more exotic. But I'm getting off-topic here.

    2) Second question (aggregation on write): That's the approach I've been using for a while. Whenever I need aggregations, I compute them in memory (Redis) and then flush them to Cassandra. Remember, Cassandra is very efficient at writing data, so don't be afraid to create some extra tables for your aggregations. I can't say exactly how to do this for your data, as it all depends on your requirements. Note that it doesn't seem feasible to provide results for arbitrary timestamp intervals when aggregating on write.
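The aggregate-on-write idea can be sketched roughly as follows. This is not the answerer's actual setup: a plain Python dict stands in for Redis, the hourly bucket size is an assumption, and `RollupBuffer`, `bucket_start` and the `write_row` callback are hypothetical names. In practice `write_row` would issue an insert into a separate rollup table.

```python
from collections import defaultdict
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # assumption: hourly rollups


def bucket_start(ts, bucket_seconds=BUCKET_SECONDS):
    """Round a timezone-aware timestamp down to the start of its bucket (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bucket_seconds, tz=timezone.utc)


class RollupBuffer:
    """In-memory running sums keyed by (nodeip, bucket); a stand-in for Redis."""

    def __init__(self):
        self._sums = defaultdict(int)

    def record(self, nodeip, ts, physicalusage):
        # Called on every incoming data point, alongside the raw write.
        self._sums[(nodeip, bucket_start(ts))] += physicalusage

    def flush(self, write_row):
        """Hand every pending sum to write_row(nodeip, bucket, total) and reset."""
        pending, self._sums = self._sums, defaultdict(int)
        for (nodeip, bucket), total in sorted(pending.items()):
            write_row(nodeip, bucket, total)
        return len(pending)


buf = RollupBuffer()
buf.record("10.0.0.1", datetime(2020, 12, 22, 6, 10, tzinfo=timezone.utc), 10)
buf.record("10.0.0.1", datetime(2020, 12, 22, 6, 50, tzinfo=timezone.utc), 15)
buf.record("10.0.0.1", datetime(2020, 12, 22, 7, 5, tzinfo=timezone.utc), 4)

written = []  # collects what would be written to the rollup table
flushed = buf.flush(lambda ip, bucket, total: written.append((ip, bucket, total)))
print(flushed, written)
```

The fixed bucket size is what makes arbitrary intervals awkward: a query can only be answered exactly when its range lines up with bucket boundaries. Also note that if the same bucket can be flushed more than once, the write has to add to the stored total rather than overwrite it.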

    Just don't try to put large sets of data into a single partition. If you need that, you're better off with a traditional SQL database.

  • 2020-12-22 07:37

    If you are using DSE (DataStax Enterprise) Cassandra, you can enable Spark and write the aggregation queries there.
