Cassandra - WHERE clause with non primary key disadvantages

孤街醉人 提交于 2019-12-10 02:27:59

问题


I am new to cassandra and I am using it for analytics tasks (good indexing needed ).

I read in this post (and others): cassandra, select via a non primary key that I can't query my DB with a non-primary key columns with WHERE clause.

To do so, it seems that there is 3 possibilities (ALL with major disadvantages):

  • Create a secondary index (not recommended for performance issues).
  • Create a new table (I don't want redundant data even if it's ok with cassandra).
  • Put the column I want to query by within the primary key and in this case I need to define all the parts of the primary key in my WHERE clause and I can't uses other operator than IN or =.

Is there an other way to to what I am trying to do (WHERE clause with non-primary key column) without having the 3 constraints above?


回答1:


From within Cassandra itself you are limited to the options that you have specified above. If you want to know why take a look here:

A Deep Look to the CQL Where Clause

However if you are trying to run analytics on information stored within Cassandra then have you looked at using Spark. Spark is built for large scale data processing on distributed systems. In fact if you are looking at using Datastax (see here) which has some nice integration features between Spark and Cassandra specifically for loading and saving data. It has both a free (Community) and paid (Enterprise) editions.




回答2:


I assume that the table is designed for a different purpose given that the fields you want to query by are not part of the partitioning key. My suggestion would be to duplicate the table and key it by the fields you want to query it by. I would recommend designing a new table for the exact purpose you will use it for as per Data modeling concepts.

Cassandra offers several advantages such as linear scaling etc by imposing certain restrictions with respect to what you can do with CQL.




回答3:


I had a similar issue while using cassandra 2.x version, upgrade your version to cassandra 3.0 and above. This was the only solution for me.




回答4:


Please, try to use IF in your query:

UPDATE [keyspace_name.] table_name
[USING TTL time_value | USING TIMESTAMP timestamp_value]
SET assignment [, assignment] . . . 
WHERE row_specification
[IF EXISTS | IF condition [AND condition] . . .] ;

see https://docs.datastax.com/en/archived/cql/3.3/cql/cql_reference/cqlUpdate.html



来源:https://stackoverflow.com/questions/35524516/cassandra-where-clause-with-non-primary-key-disadvantages

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!