Select a specific record in Cassandra using CQL

Posted by 拜拜、爱过 on 2020-01-05 07:31:24

Question


This is the schema I use:

CREATE TABLE playerInfo (
key text,
column1 bigint,
column2 bigint,
column3 bigint,
column4 bigint,
column5 text,
value bigint,
PRIMARY KEY (key, column1, column2, column3, column4, column5)
)   
WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.010000 AND
caching='KEYS_ONLY' AND
comment='' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
replicate_on_write='true' AND
populate_io_cache_on_flush='false' AND
compaction={'class': 'SizeTieredCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};

Note that I use a composite primary key. The table contains a record like this:

 key   | column1 | column2 | column3 | column4 | column5  | value
-------+---------+---------+---------+---------+----------+-------
 Kitty |    1411 |       3 |     713 |       4 | American |     1

How can I select it in cqlsh? I tried:

cqlsh:game> SELECT * FROM playerInfo WHERE KEY = 'Kitty' AND column5 = 'American';

but the output is:

Bad Request: PRIMARY KEY part column5 cannot be restricted (preceding part column4 is either not restricted or by a non-EQ relation)

How can I select this cell, then?


Answer 1:


You have chosen PRIMARY KEY (key, column1, column2, column3, column4, column5) as the primary key, so to put a WHERE condition on column5 you must also specify conditions on key, column1, column2, column3 and column4. For example:

SELECT * FROM playerInfo WHERE KEY = 'Kitty' AND column1 = 1411 AND column2 = 3 AND column3 = 713 AND column4 = 4 AND column5 = 'American';

Likewise, to put a WHERE condition on column2 you must also specify conditions on key and column1. For example:

SELECT * FROM playerInfo WHERE KEY = 'Kitty' AND column1 = 1411 AND column2 = 3;

In general, to restrict a particular column of the primary key in the WHERE clause, you must also restrict every column that precedes it in the key. You therefore have to design your Cassandra data model carefully, so that it gives good read and write performance and still meets your business needs. In practice there is a trade-off: a model that suits the business logic often performs poorly, and a model that performs well often does not suit the business logic. That is the beauty of Cassandra, though it certainly has room to improve.
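The prefix rule described above can be modelled in a few lines of Python. This is only a sketch of the rule as stated in this answer, not Cassandra's actual validation code: the function name first_blocked_column is hypothetical, and it assumes every restriction is an equality (=) condition.

```python
from typing import Optional, Sequence, Set

# Primary key column order taken from the question's schema.
PRIMARY_KEY = ["key", "column1", "column2", "column3", "column4", "column5"]


def first_blocked_column(primary_key: Sequence[str], restricted: Set[str]) -> Optional[str]:
    """Return the first restricted key column that follows an unrestricted one.

    Hypothetical helper: models equality restrictions only. Returns None
    when the restricted columns form an unbroken prefix of the primary key.
    """
    gap_seen = False
    for column in primary_key:
        if column not in restricted:
            gap_seen = True      # any later restriction is now invalid
        elif gap_seen:
            return column        # restricted, but an earlier column is not
    return None


if __name__ == "__main__":
    # The query from the question restricts key and column5 only.
    print(first_blocked_column(PRIMARY_KEY, {"key", "column5"}))
    # An unbroken prefix: key, column1, column2.
    print(first_blocked_column(PRIMARY_KEY, {"key", "column1", "column2"}))
```

For the question's query the sketch reports column5, the same column named in the Bad Request message; for the key, column1, column2 query it reports nothing.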




Answer 2:


There is a way to select rows based on columns that are not part of the primary key: create a secondary index. Let me explain this with an example.

In this schema:

CREATE TABLE playerInfo (
    player_id int,
    name varchar,
    country varchar,
    age int,
    performance int,
    PRIMARY KEY ((player_id, name), country)
);

The first part of the primary key, (player_id, name), is the partition key. Its hash value determines which node in the Cassandra cluster the row is written to.

Hence we need to specify both of these values in the WHERE clause to fetch a record. For example:

SELECT * FROM playerinfo WHERE player_id = 1000 and name = 'Mark B';

 player_id | name   | country | age | performance
-----------+--------+---------+-----+-------------
      1000 | Mark B |     USA |  26 |           8
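As a rough illustration of why both partition key columns are required, here is a toy Python sketch of hash-based placement. It is not Cassandra's real partitioner or token ring: the node names, the toy_node_for function and the use of MD5 with a simple modulo are all assumptions made for the example.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def toy_node_for(player_id: int, name: str, nodes=NODES) -> str:
    """Pick a node from a hash of the full partition key (toy placement only)."""
    # Both partition key columns feed the hash; neither alone is enough
    # to work out where the row lives.
    raw = f"{player_id}:{name}".encode("utf-8")
    token = int.from_bytes(hashlib.md5(raw).digest()[:8], "big")
    return nodes[token % len(nodes)]


if __name__ == "__main__":
    # The same (player_id, name) pair always maps to the same node.
    print(toy_node_for(1000, "Mark B"))
```

Without a value for name (or for player_id) the hash input is incomplete, so no node can be chosen; this mirrors why the query above supplies both columns.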

If the second part of your primary key (the clustering columns) contains more than one column, then to filter on one of them you have to specify values for all the key columns to its left as well as for that column itself.

In this example:

PRIMARY KEY ((key, column1), column2, column3, column4, column5)

To filter on column3 you have to specify values for key, column1, column2 and column3. To filter on column5 you need to specify values for key, column1, column2, column3, column4 and column5.

But if your application needs to filter on particular columns that are not part of the partition key, you can create secondary indexes on those columns.

To create an index on a column, use the following command:

CREATE INDEX player_age on playerinfo (age) ;

Now you can filter rows based on age:

SELECT * FROM playerinfo where age = 26;

 player_id | name    | country | age | performance
-----------+---------+---------+-----+-------------
      2000 | Sarah L |      UK |  26 |          24
      1000 |  Mark B |     USA |  26 |           8
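Conceptually, a secondary index maps a column value to the rows that hold it. The Python sketch below shows only that idea, using the two rows from the output above; it says nothing about how Cassandra stores or distributes its indexes, and build_index is a hypothetical helper.

```python
from collections import defaultdict

# The two rows shown in the query output above.
ROWS = [
    {"player_id": 2000, "name": "Sarah L", "country": "UK", "age": 26, "performance": 24},
    {"player_id": 1000, "name": "Mark B", "country": "USA", "age": 26, "performance": 8},
]


def build_index(rows, column):
    """Map each value of `column` to the primary keys of the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        primary_key = (row["player_id"], row["name"], row["country"])
        index[row[column]].append(primary_key)
    return index


age_index = build_index(ROWS, "age")

if __name__ == "__main__":
    print(age_index.get(26, []))  # primary keys of both rows
    print(age_index.get(30, []))  # no rows with this age
```

Looking up 26 yields the primary keys of both rows without the caller knowing player_id or name in advance, which is what the indexed query does.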

Be very careful about using indexes in Cassandra. Use one only if the table has few records or, more precisely, if the indexed column has few distinct values.

You can also drop an index using:

DROP INDEX player_age ;

See http://wiki.apache.org/cassandra/SecondaryIndexes and http://www.datastax.com/docs/1.1/ddl/indexes for more details.



Source: https://stackoverflow.com/questions/26901429/select-a-specific-record-in-cassandra-using-cql
