Cassandra sorting results by count

Submitted by 和自甴很熟 on 2019-12-04 03:58:16

Question


I am recording data on users searching for various keywords. I'd like to produce a report of all the unique keywords that users have searched for, sorted in ascending or descending order by how many times each has been searched for.

Is this something that can be modeled using Cassandra, and if so what would the model look like?

Thanks!


Answer 1:


According to the eBay tech blog, it's not unusual to store your counter values in the key itself. So to store the number of times Bob, Ken, and Jimmy have logged into a website, a single row would look as follows:

logins: [(0001_Bob,''), (0002_Bob, ''), ..., (0010_Ken, ''), (0012_Jimmy, ''), ...]

Notice that your column keys will automatically sort themselves, so the highest count sits at the tail end of the row and reading it is close to a constant-time look-up.

Note that every time a user logs in, a new column key is created. You'd have to keep track of the number of log-ins in another row, so that you have a fast look-up for how many log-ins have occurred so far and what integer value your next key should have:

login_count: [(Bob, 2), (Ken, 10), (Jimmy, 12), ...]
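
As a rough illustration of the two rows described here, the following is a minimal in-memory sketch in Python, not actual Cassandra client code. A sorted list stands in for the `logins` row (a real wide row would keep its column keys sorted for you) and a dict stands in for `login_count`. The names `record_login`, `top_users`, and `KEY_WIDTH` are hypothetical, and the four-digit zero padding is assumed from the `0001_Bob` example.

```python
from bisect import insort

KEY_WIDTH = 4  # zero-padding width, assumed from the 0001_Bob example

# In-memory stand-ins for the two rows (assumption: no real cluster involved).
logins = []        # sorted column keys such as "0002_Bob"
login_count = {}   # user -> number of log-ins recorded so far


def record_login(user):
    """Bump the user's count and add a new count-prefixed column key."""
    count = login_count.get(user, 0) + 1
    login_count[user] = count
    key = f"{count:0{KEY_WIDTH}d}_{user}"
    insort(logins, key)  # keep keys ordered, as the column comparator would
    return key


def top_users(n):
    """Walk the sorted keys from the tail; return the n highest (user, count) pairs."""
    seen, result = set(), []
    for key in reversed(logins):
        count, user = key.split("_", 1)
        if user not in seen:  # older, lower-count keys for the same user are skipped
            seen.add(user)
            result.append((user, int(count)))
            if len(result) == n:
                break
    return result
```

Because the zero-padded prefix makes string order match numeric order, the highest counts end up at the tail of `logins`. The padding width also caps the largest count that still sorts correctly, so it would need to be chosen with the expected volume in mind.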




Answer 2:


You could use each keyword as a row key, and use a counter column in each row to track the number of searches. You could then produce a report by scanning over every row and reading the counters. Cassandra won't return the rows in sorted order (assuming you use the default RandomPartitioner rather than an OrderPreservingPartitioner), but given that there will presumably only be a few tens of thousands of keywords, you can easily sort them at the client.
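
The client-side sort could look something like the sketch below. It assumes the row scan has already been collected into a plain dict mapping keyword to counter value; the `scanned` data and the `keyword_report` helper are hypothetical, and no Cassandra driver is involved.

```python
def keyword_report(counts, descending=True):
    """Return (keyword, count) pairs sorted by count; ties are broken alphabetically."""
    sign = -1 if descending else 1
    return sorted(counts.items(), key=lambda kv: (sign * kv[1], kv[0]))


# Hypothetical counter values, standing in for the result of a full row scan.
scanned = {"cassandra": 42, "sorting": 7, "counters": 19}

most_searched_first = keyword_report(scanned)
least_searched_first = keyword_report(scanned, descending=False)
```

With only a few tens of thousands of keywords, an in-memory sort like this should be cheap compared with the scan itself.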



Source: https://stackoverflow.com/questions/8864050/cassandra-sorting-results-by-count
