Cassandra CQL method for paging through all rows

Submitted by 笑着哭i on 2019-12-09 01:50:34

Question


I want to programmatically examine all the rows in a large Cassandra table, and was hoping to use CQL. I know I could do this with Thrift, getting 10,000 (or so) rows at a time with multiget and handing the last retrieved key in to the next multiget call. But I have looked through all the documentation on CQL SELECT, and there doesn't seem to be a way to do this. I have resorted to setting the select limit higher and higher, and setting the timeout higher and higher to match it.

Is there an undocumented way to hand a starting point to CQL SELECT, or do I just need to break down and rewrite my code using the Thrift API?


Answer 1:


Turns out greater-than and less-than have a very non-intuitive, but useful, behavior (at least in CQL2; I haven't checked CQL3 yet). They actually compare the tokens, not the key values. Here is an example:

> create table users (KEY varchar PRIMARY KEY, data varchar);
> insert into users (KEY, 'data') values ('1', 'one');
> insert into users (KEY, 'data') values ('2', 'two');
> insert into users (KEY, 'data') values ('3', 'three');
> insert into users (KEY, 'data') values ('4', 'four');
> select * from users;
   3 | three
   2 |   two
   1 |   one
   4 |  four
> select * from users LIMIT 1;
   3 | three
> select * from users WHERE KEY > '3' LIMIT 1;
   2 |  two
> select * from users WHERE KEY > '2' LIMIT 1;
   1 |  one
> select * from users WHERE KEY > '1' LIMIT 1;
   4 | four
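
The paging loop implied by the transcript above can be sketched in Python without a live cluster. This is only a simulation under stated assumptions: the `token` function below uses MD5 as a stand-in for the partitioner's hash, and `select` and `scan_all` are hypothetical helpers, not part of any Cassandra driver. The row order it produces is therefore not guaranteed to match the cqlsh output above. In CQL3 the same comparison is normally written explicitly, along the lines of `WHERE token(key) > token('3')`.

```python
import hashlib

# The four rows inserted in the transcript above.
ROWS = {"1": "one", "2": "two", "3": "three", "4": "four"}


def token(key: str) -> int:
    """Stand-in for the partitioner's hash (assumption: MD5, for illustration only)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def select(rows, after_key=None, limit=1):
    """Mimic `SELECT * FROM users WHERE KEY > after_key LIMIT limit`,
    comparing tokens rather than the key values themselves."""
    ordered = sorted(rows, key=token)
    if after_key is not None:
        ordered = [k for k in ordered if token(k) > token(after_key)]
    return [(k, rows[k]) for k in ordered[:limit]]


def scan_all(rows, page_size=1):
    """Visit every row once by feeding the last key of each page into the next query."""
    last_key = None
    while True:
        page = select(rows, last_key, page_size)
        if not page:
            return
        yield from page
        last_key = page[-1][0]


visited = list(scan_all(ROWS, page_size=2))
print(visited)
```

Because every page starts strictly after the token of the last key seen, each row is visited exactly once, whatever order the hash happens to impose.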



Answer 2:


Check this one: http://wiki.apache.org/cassandra/FAQ#iter_world

You would need to program it manually: each subsequent query has to provide a starting point, which is the last result from the previous query. This starting point lets you create a slice query, which returns a limited number of results.

For example, suppose you have a row with the following column names:

A1,A2,A3,B1,B2,B3,B4,B5,B6,C4,C5,D1,D2,D4,E2,E23,E4,E5,E6,E7

Now you would like to iterate over it, with each response returning 3 results:

Slice 1) Start: "", End: "", Limit: 3 -> A1,A2,A3
Slice 2) Start: "A3", End: "", Limit: 3 -> B1,B2,B3
Slice 3) Start: "B3", End: "", Limit: 3 -> B4,B5,B6
Slice 4) Start: "B6", End: "", Limit: 3 -> C4,C5,D1
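
The slices listed above can be reproduced with a small Python sketch. It assumes the start value is exclusive, as the listing suggests, and `get_slice` and `iterate_all` are hypothetical helpers operating on an in-memory sorted list, not Thrift calls. If the underlying API treats the start as inclusive, the first result of each later page would repeat the previous page's last item and would need to be dropped.

```python
from bisect import bisect_right

# Column names from the example row, already in sorted order.
COLUMNS = ["A1", "A2", "A3", "B1", "B2", "B3", "B4", "B5", "B6", "C4",
           "C5", "D1", "D2", "D4", "E2", "E23", "E4", "E5", "E6", "E7"]


def get_slice(columns, start, limit):
    """Return up to `limit` names strictly after `start` ("" means from the beginning)."""
    begin = bisect_right(columns, start)
    return columns[begin:begin + limit]


def iterate_all(columns, limit):
    """Yield successive pages, using the last name of each page as the next start."""
    start = ""
    while True:
        page = get_slice(columns, start, limit)
        if not page:
            break
        yield page
        start = page[-1]


pages = list(iterate_all(COLUMNS, 3))
for number, page in enumerate(pages, start=1):
    print(f"Slice {number}) -> {','.join(page)}")
```

The loop stops when a slice comes back empty, so the final page may hold fewer than 3 names.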



Source: https://stackoverflow.com/questions/11832886/cassandra-cql-method-for-paging-through-all-rows
