Cassandra

Do I need clock synchronisation for cassandra if only one client writes to cluster?

只愿长相守 submitted on 2020-01-24 10:41:05
Question: From Cassandra's documentation I learned that Cassandra uses query timestamps to resolve conflicts between two writes, and hence the clocks on all nodes of the cluster need to be synchronized. In my use case we have only one client writing to the cluster and multiple clients reading from it. So, if I use a client-side timestamp generator (which I believe is the default for driver versions > 3), do I still need the cluster node clocks synchronized with each other? Answer 1: In the context of
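For context, the client-side generator in the DataStax drivers is essentially a monotonically increasing microsecond clock. A minimal sketch of the idea (illustration only, with made-up names, not the driver's actual code):

```python
import threading
import time

class MonotonicTimestampGenerator:
    """Sketch of a client-side write-timestamp generator:
    microseconds since the epoch, forced to be strictly increasing so that
    successive writes from one client never tie or go backwards, even if
    the local wall clock stalls or steps back."""

    def __init__(self):
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now_us = int(time.time() * 1_000_000)
            # Never reuse or regress a timestamp: take the wall clock if it
            # moved forward, otherwise just count up from the last value.
            self._last = max(now_us, self._last + 1)
            return self._last
```

Because every write timestamp then comes from this single writer, server clock skew no longer decides which write wins. Node clocks still matter for things like TTL expiry, hints, and operational tooling, so keeping NTP running on the nodes remains good practice.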

Cassandra could not create Java Virtual Machine

﹥>﹥吖頭↗ submitted on 2020-01-24 04:41:05
Question: I am on macOS and I run cassandra -f, and immediately this happens: [0.002s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/usr/local/apache-cassandra-3.0.10/logs/gc.log instead. Unrecognized VM option 'UseParNewGC' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. I have no idea why this is happening. I did the proper export CASSANDRA_HOME=/usr/local/apache-cassandra-3.0.10 export PATH=$PATH:$CASSANDRA_HOME/bin But still
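Context for this error (my reading, not from the truncated answer): Cassandra 3.x ships JVM flags such as UseParNewGC that only exist on Java 8, and a newer JDK refuses to start when it sees them, so the usual fix is to point JAVA_HOME at a Java 8 install (or move to Cassandra 4.x, which supports Java 11). As an illustration of why startup fails, this sketch flags jvm.options lines a modern JDK would reject; the removed-flag list here is partial and assumed:

```python
# Options that existed in JDK 8 but were removed in later JDKs, so a newer
# `java` binary exits with "Unrecognized VM option". Partial, illustrative list.
REMOVED_IN_MODERN_JDKS = ("-XX:+UseParNewGC", "-XX:+PrintGCDateStamps", "-Xloggc")

def incompatible_options(jvm_options_text):
    """Return the non-comment lines of a jvm.options file that a
    JDK 9+ JVM would reject at startup."""
    bad = []
    for line in jvm_options_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments, as the JVM options parser does
        if any(line.startswith(flag) for flag in REMOVED_IN_MODERN_JDKS):
            bad.append(line)
    return bad
```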

How to get tombstone count for a cql query?

◇◆丶佛笑我妖孽 submitted on 2020-01-24 02:29:13
Question: I am trying to evaluate the number of tombstones being created in one of the tables in our application. For that I am trying to use nodetool cfstats. Here is how I am doing it: create table demo.test(a int, b int, c int, primary key (a)); insert into demo.test(a, b, c) values(1,2,3); Now I make the same insert again, so I expect 3 tombstones to be created. But on running cfstats for this column family, I still see that no tombstones were created. nodetool cfstats demo.test Average live
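The likely explanation: a second INSERT with the same primary key is an overwrite, not a delete, so it produces a newer cell version rather than a tombstone; tombstones come from DELETE, TTL expiry, or writing null. A toy model of last-write-wins cell reconciliation (an illustration of the rule, not Cassandra's storage engine):

```python
def resolve(cells):
    """Toy last-write-wins reconciliation for one cell.
    Each entry is (timestamp, value); value None marks a tombstone.
    The version with the highest timestamp wins, tombstone or not."""
    return max(cells, key=lambda c: c[0])

def count_tombstones(cells):
    """Count the tombstone markers among a cell's stored versions."""
    return sum(1 for _, value in cells if value is None)

# Two inserts of the same row: two live cell versions, zero tombstones.
overwrites = [(100, 3), (200, 3)]
# An insert followed by a DELETE: the delete writes a tombstone cell.
deleted = [(100, 3), (200, None)]
```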

What is best approach to join data in spark streaming application?

我怕爱的太早我们不能终老 submitted on 2020-01-23 17:19:37
Question: Essentially it means: rather than running a join against the C* table for each streaming record, is there any way to run a join for each micro-batch of records in Spark streaming? We have almost finalized on spark-sql 2.4.x and the datastax-spark-cassandra-connector for Cassandra 3.x, but we have one fundamental question regarding efficiency in the scenario below. For the streaming data records (i.e. streamingDataSet), I need to look up for existing
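One common pattern is exactly per-micro-batch batching: collect the keys of one micro-batch and issue a single batched read per batch instead of a point lookup per record (with the connector, joinWithCassandraTable does this per Spark partition). A plain-Python sketch of the shape of that pattern, with made-up record and function names standing in for the Cassandra read:

```python
def enrich_micro_batch(batch, lookup_fn):
    """Join one micro-batch against a reference table in one round trip.
    `lookup_fn(keys)` stands in for a batched Cassandra read
    (e.g. joinWithCassandraTable); it returns {key: reference_value}."""
    keys = {record["id"] for record in batch}
    reference = lookup_fn(keys)          # one lookup per batch, not per record
    return [
        {**record, "ref": reference.get(record["id"])}
        for record in batch
    ]

# Hypothetical in-memory stand-in for the C* reference table.
table = {"a": 1, "b": 2}
batch = [{"id": "a"}, {"id": "a"}, {"id": "c"}]
out = enrich_micro_batch(batch, lambda ks: {k: table[k] for k in ks if k in table})
```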

Cassandra tombstones count multiple queries vs single query

て烟熏妆下的殇ゞ submitted on 2020-01-23 13:18:07
Question: I have a Cassandra table defined as follows: CREATE TABLE mytable ( colA text, colB text, timeCol timestamp, colC text, PRIMARY KEY ((colA, colB, timeCol), colC) ) WITH.... I want to know whether the number of tombstones would vary between the following types of queries: 1. delete from mytable where colA = '...' AND colB = '...' AND timeCol = 111 (the above query affects multiple records, i.e. multiple values of colC) 2. delete from mytable where colA = '...' AND colB = '...' AND timeCol = 111 AND colC = '...'
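With PRIMARY KEY ((colA, colB, timeCol), colC), the first DELETE names the full partition key and drops the whole partition with a single partition tombstone, while the second deletes one row and writes one row tombstone per statement. A toy count contrasting the two ways of emptying a partition (my reading of the documented tombstone semantics, not measured output):

```python
def count_tombstones(num_rows, strategy):
    """Toy model of tombstones written when removing every row of a partition.
    'partition': one DELETE on the full partition key -> 1 partition tombstone,
                 no matter how many rows it covers.
    'per_row':   one DELETE per colC value -> 1 row tombstone each."""
    if strategy == "partition":
        return 1
    if strategy == "per_row":
        return num_rows
    raise ValueError(strategy)
```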

What is the difference between scylla read path and cassandra read path?

こ雲淡風輕ζ submitted on 2020-01-23 12:16:52
Question: What is the difference between the Scylla read path and the Cassandra read path? When I stress-test Cassandra and Scylla, Scylla's read performance is about 5 times worse than Cassandra's, using 16 cores and ordinary HDDs. I expected better read performance on Scylla compared to Cassandra on ordinary HDDs, because my company doesn't provide SSDs. Can someone please confirm whether it is possible to achieve better read performance on ordinary HDDs, and if so, what changes are required in the Scylla config? Please guide me! Answer 1:

Cell versioning with Cassandra

◇◆丶佛笑我妖孽 submitted on 2020-01-23 11:49:32
Question: My application uses an AbstractFactory for the DAO layer, so once the HBase DAO family has been implemented, it would be very useful for me to create the Cassandra DAO family and compare the two from several points of view. However, in trying to do that, I saw that Cassandra doesn't support cell versioning the way HBase does (and my application makes heavy use of it), so I was wondering whether there is some table design trick (or something else) to emulate this behaviour in Cassandra. Answer 1: One common
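The usual emulation is to add the write time as a clustering column in descending order, so every write becomes a new row and "latest N versions" is a simple slice of the partition. A plain-Python sketch of that table design, i.e. something like PRIMARY KEY (key, version_ts) WITH CLUSTERING ORDER BY (version_ts DESC), with hypothetical names:

```python
class VersionedCell:
    """Emulates HBase-style cell versioning with a (key, version_ts DESC)
    layout: each put appends a new version row; reads slice the newest rows."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.rows = {}  # key -> list of (version_ts, value), newest first

    def put(self, key, ts, value):
        versions = self.rows.setdefault(key, [])
        versions.append((ts, value))
        versions.sort(reverse=True)       # keep newest first, like DESC clustering
        del versions[self.max_versions:]  # prune old versions, as HBase would

    def get(self, key, n=1):
        """Return up to the n most recent versions of a cell."""
        return self.rows.get(key, [])[:n]
```

One caveat with the real table design: pruning old versions in Cassandra itself requires deletes or a TTL, which brings tombstones into play, so the retention policy deserves thought up front.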

Is there an alternative to joinWithCassandraTable for DataFrames in Spark (Scala) when retrieving data from only certain Cassandra partitions?

五迷三道 submitted on 2020-01-23 01:13:29
Question: When extracting a small number of partitions from a large C* table using RDDs, we can use this: val rdd = … // rdd including partition data val data = rdd.repartitionByCassandraReplica(keyspace, tableName) .joinWithCassandraTable(keyspace, tableName) Do we have an equally effective approach available for DataFrames? Update (Apr 26, 2017): To be more concrete, I prepared an example. I have 2 tables in Cassandra: CREATE TABLE ids ( id text, registered timestamp, PRIMARY KEY (id) ) CREATE TABLE
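For intuition about what repartitionByCassandraReplica buys: it groups lookup keys by the node that owns them, so each executor then reads its keys from a local replica. A toy model of that grouping in plain Python, with the ring's token-to-replica mapping faked by a hash (purely illustrative):

```python
import zlib

def group_keys_by_replica(keys, num_replicas):
    """Toy stand-in for repartitionByCassandraReplica: bucket lookup keys by
    the replica that owns them (here a CRC32 hash; in Cassandra, the ring's
    token ranges), so each bucket becomes one co-located batched read."""
    buckets = {r: [] for r in range(num_replicas)}
    for key in keys:
        buckets[zlib.crc32(key.encode()) % num_replicas].append(key)
    return buckets
```

On the DataFrame side, I believe newer Spark Cassandra Connector releases add a "direct join" optimization that applies this strategy to DataFrame joins automatically; it is worth checking the connector documentation for the version in use.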