Distinct count in Cassandra

问题

I have a table chat_matches in "cassandra". And want to get the count of all messages from_id to distinct to_id with count of rows(group by to_id).

CREATE TABLE chat_matches(
    chat_id uuid,
    from_id bigint,
    to_id bigint,
    message  text,
    message_send_time timestamp,
    PRIMARY KEY ((chat_id,from_id),message_send_time)
);

回答1:

In cassandra count(*) is a very costly operation, need to scan all the row from all the node just to give you the count and can generate timeout exception.

So Instead of using count(*) maintain a counter table like the below one :

CREATE TABLE message_counter (
    from_id bigint,
    to_id bigint,
    count counter,
    primary key((from_id, to_id ))
);

When a new message appears just increment the value of count by one.

 UPDATE message_counter SET count = count + 1 WHERE from_id = 1 AND to_id = 2;

Now you can select the message count group by from_id to to_id very efficiently

SELECT * FROM message_counter WHERE from_id = 1 AND to_id = 2;

来源：https://stackoverflow.com/questions/43183915/distinct-count-in-cassandra

标签

cassandra

count

distinct

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!