Cassandra SELECT DISTINCT and timeout issue

白昼怎懂夜的黑 submitted on 2020-02-08 06:18:04

Question


When running the following CQL query:

SELECT DISTINCT partition_key FROM table_name;

This is supposedly meant to return the list of partition keys that are in use for the given table. However, with the default timeout settings of 10s, it always times out:

ReadTimeout: Error from server: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

Changing the timeout settings to:

read_request_timeout_in_ms: 60000
range_request_timeout_in_ms: 60000
request_timeout_in_ms: 60000

Running the query again with these settings makes several Cassandra nodes crash, including the coordinator node. The table has more than 100M rows with about 5000 unique partition keys.

Is there a workaround to find the unique list of partition keys?


Answer 1:


This query should work fine on modern versions of Cassandra (2.1 and newer), assuming you're using a client that supports paging/fetch size and you use a sufficiently low fetch size (the actual limit depends on your server load).

If you are using a third-party driver, look for an option to lower the page/fetch size. Set it to 100 and see whether it behaves better.

In cqlsh, if you have Cassandra 3.0 or newer, try PAGING 100;
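
As a rough illustration of the fetch-size advice, here is a minimal sketch that assumes the DataStax Python driver (cassandra-driver). `Session.default_fetch_size` is that driver's per-session page size. The contact point and the keyspace name `my_keyspace` are placeholders, not something from the question, and `distinct_partition_keys` and `main` are hypothetical helper names.

```python
def distinct_partition_keys(session, table, key_column, fetch_size=100):
    """Yield each distinct partition key, fetching `fetch_size` keys per page."""
    # Lower the page size for this session (the driver's default is 5000).
    session.default_fetch_size = fetch_size
    query = f"SELECT DISTINCT {key_column} FROM {table}"
    # Iterating the result set pulls the next page from the server on demand,
    # so only one page of keys is held in memory at a time.
    for row in session.execute(query):
        yield row[0]


def main():
    # Requires `pip install cassandra-driver` and a reachable cluster.
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])          # assumed contact point
    session = cluster.connect("my_keyspace")  # hypothetical keyspace name
    try:
        for key in distinct_partition_keys(session, "table_name", "partition_key"):
            print(key)
    finally:
        cluster.shutdown()
```

Call `main()` against a running cluster to print the keys. If the query still times out, a smaller `fetch_size` is the first thing to try.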




Answer 2:


There is another way of getting the list of keys using either of the following utilities:

sstabledump -e 
     OR
$ bin/sstablekeys <sstable_name>

But you need to run them against the data directories of all nodes and manually filter for distinct keys. Not straightforward, but doable!

Here is the reference for the utilities Cassandra SSTabledump and Cassandra SSTablekeys
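
The manual filtering step could be sketched roughly as follows. It assumes the key listing from each SSTable on each node has already been redirected into a text file with one key per line. The function name `merge_distinct_keys` and the file names in the usage comment are hypothetical.

```python
def merge_distinct_keys(dump_paths):
    """Merge per-SSTable key listings and return the sorted distinct keys."""
    distinct = set()
    for path in dump_paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                key = line.strip()
                if key:  # skip blank lines
                    distinct.add(key)
    # The same key appears in several SSTables and on several replicas,
    # so the set is what collapses the duplicates.
    return sorted(distinct)


# Example usage with hypothetical dump files collected from two nodes:
# keys = merge_distinct_keys(["node1_keys.txt", "node2_keys.txt"])
# print(len(keys))
```

With about 5000 unique partition keys, the resulting set fits comfortably in memory.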

The reasons for the query timeout are:

  1. There is no WHERE clause in the query.
  2. There are too many rows to scan through (more than 100M).
  3. The coordinator has to keep the query open until it gets responses from every node in the cluster, and then filter for distinct keys.
  4. The DISTINCT operation is simply too costly for this use case.
  5. The nodes crash because they essentially fill up the heap with the entire rows being selected, which causes OutOfMemory (OOM) errors.
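
One way to address points 1 and 3, which neither answer proposes, is to split the scan into token sub-ranges so that each query touches only a slice of the ring. The sketch below only generates the CQL text. It assumes the cluster uses Murmur3Partitioner (token range -2^63 to 2^63-1), and the helper names are hypothetical. A key whose token equals the exact minimum would fall outside these half-open ranges, which the sketch ignores.

```python
MIN_TOKEN = -(2 ** 63)    # lowest Murmur3Partitioner token (assumed partitioner)
MAX_TOKEN = 2 ** 63 - 1   # highest Murmur3Partitioner token


def split_token_ring(num_ranges):
    """Return `num_ranges` contiguous (start, end] token bounds over the ring."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // num_ranges for i in range(num_ranges + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(table, key_column, num_ranges):
    """Build one SELECT DISTINCT statement per token sub-range."""
    template = ("SELECT DISTINCT {k} FROM {t} "
                "WHERE token({k}) > {lo} AND token({k}) <= {hi}")
    return [template.format(k=key_column, t=table, lo=lo, hi=hi)
            for lo, hi in split_token_ring(num_ranges)]


# Example usage: print the statements for a scan split into 4 slices.
for cql in range_queries("table_name", "partition_key", 4):
    print(cql)
```

Each statement can then be run on its own and the results combined on the client. More slices mean less work per request, at the cost of more round trips.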


Source: https://stackoverflow.com/questions/44910671/cassandra-select-distinct-and-timeout-issue
