cassandra | 易学教程

Cassandra - conflict resolution for mixed column updates with identical timestamp

Question: I would like to know which write wins when two updates carry the same client timestamp.

Initial data: KeyA: { col1:"val AA", col2:"val BB", col3:"val CC" }
Client 1 sends update: KeyA: { col1:"val C1", col2:"val B1" }
Client 2 sends update: KeyA: { col1:"val C2", col2:"val B2" }

Both updates have the same timestamp. What result will a row query on KeyA return?

{ col1:"val C1", col2:"val B1", col3:"val CC" } - Client 1 wins
{ col1:"val C2", col2:"val B2", col3:"val CC" } - Client 2 wins
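The scenario in this question can be modelled with a small sketch. It assumes the commonly described tie-break rule: cells are reconciled per column, the higher timestamp wins, and on an equal timestamp the cell whose value compares greater byte-wise is kept. That rule is an assumption here and is not stated in the excerpt. The names `reconcile` and `merge_row` are hypothetical helpers, and tombstones and other details are left out.

```python
from typing import Dict, Tuple

# A cell is modelled as (timestamp, value); this is an illustrative model,
# not Cassandra's actual storage format.
Cell = Tuple[int, str]


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winning cell: higher timestamp first, larger value on a tie."""
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    # Assumed tie-break: compare the values byte-wise and keep the greater one.
    return a if a[1].encode() >= b[1].encode() else b


def merge_row(row: Dict[str, Cell], update: Dict[str, Cell]) -> Dict[str, Cell]:
    """Apply an update column by column; untouched columns are kept as-is."""
    merged = dict(row)
    for col, cell in update.items():
        merged[col] = reconcile(merged[col], cell) if col in merged else cell
    return merged


initial = {"col1": (1, "val AA"), "col2": (1, "val BB"), "col3": (1, "val CC")}
client1 = {"col1": (2, "val C1"), "col2": (2, "val B1")}
client2 = {"col1": (2, "val C2"), "col2": (2, "val B2")}

result = merge_row(merge_row(initial, client1), client2)
values = {col: cell[1] for col, cell in result.items()}
print(values)
```

Under this model the outcome is decided independently for each column, so a row can in general end up with a mix of both clients' values. The result also does not depend on the order in which the two updates arrive.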

How to programmatically add index to Cassandra 0.7

Question: I tried to run the demo at http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes programmatically, but the results differ from running it in the CLI. It seems that Cassandra only indexes columns written after the index is added; all earlier data is left unindexed. The full source code is below: public static void main(String[] args) { try { try { transport.open(); } catch (TTransportException ex) { Logger.getLogger(IndexLaterTest.class.getName()).log(Level.SEVERE, null, ex); System

Column family ID mismatch during ALTER TABLE

Question: When adding a column to a table using cqlsh, I get the following error message: ALTER TABLE table ADD dataVersion text; ServerError: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found dbed2170-c53c-11e7-a6f8-6fd66506919d; expected db9404f0-c53c-11e7-8529-65b72ab1f7cf) What does it mean, and what should I do about it? Is it a bug? The column seems to be added successfully. Cassandra

NoHostAvailableException With Cassandra & DataStax Java Driver If Large ResultSet

Question: The setup:
- 2-node Cassandra 1.2.6 cluster
- replicas=2
- very large CQL3 table with no secondary index
- Rowkey is a UUID.randomUUID().toString()
- read consistency set to ONE
- Using DataStax Java driver 1.0

The request: Attempting to do a table scan with "SELECT some-col from schema.table LIMIT nnn;"

The failure: Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver. It reads like this: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)

Querying with “contains” on a list of user defined type (UDT)

Question: For a data model like: create type city ( name text, code int ); create table user ( id uuid, name text, cities list<FROZEN<city>>, primary key ( id ) ); create index user_city_index on user(cities); the query select id, cities from user where cities contains {name:'My City', code: 10}; works fine. But is it possible to query select id, cities from user where cities contains {name:'My City'}; and disregard the code attribute, i.e. code=<any>? Can this be achieved with the utilization of

A general and efficient data repair method: Row level repair

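Introduction: As big data continues to grow, NoSQL database systems have developed rapidly and are now widely used. Apache Cassandra is among the most widely used of them. Optimizing Cassandra is an active area of work, and ScyllaDB offers a new approach. ScyllaDB is an open-source, high-performance implementation of Cassandra written in C++, with a substantial performance improvement over Cassandra. Nodetool repair is an important part of routine Cassandra maintenance, and this talk focuses on ScyllaDB's optimizations in that area. The talk covers the following five points: ScyllaDB introduction; Row level repair introduction; Row level repair implementation; experimental results; summary.

ScyllaDB introduction

First, a brief introduction to ScyllaDB. Background of ScyllaDB: our company has considerable experience in low-level software development, and the team's founders are the authors of KVM and OSv. We made a series of attempts to optimize the Cassandra database. We started from the operating system side, trying to speed up Cassandra applications by improving operating system performance. This improved Cassandra's performance by about 20%, but no further gains could be obtained. To optimize Cassandra further, the team began to consider whether Cassandra could be reimplemented. We first developed a very high-performance C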

Spark Cassandra connector - where clause

Question: I am trying to do some analytics on time series data stored in Cassandra by using Spark and the new connector published by DataStax. In my schema the partition key is the meter ID, and I want to run Spark operations only on specific series, so I need to filter by meter ID. I would then like to run a query like: Select * from timeseries where series_id = X I have tried to achieve this by doing: JavaRDD<CassandraRow> rdd = sc.cassandraTable("test", "timeseries").select(columns).where(

Cassandra Custom Secondary Index

Question: This seems to be a mystery in Cassandra. According to the official documentation, one can create an index on a column by using a custom indexer class: CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass'; But I could not find any documentation on the interface/class to be implemented/extended to do this, or on how to configure Cassandra to find the class. I wanted to write a custom indexer that could skip indexing rows based on conditions/options.

Answer 1: Here is what I've found: https:/

How can I have null column value for a composite key column in CQL3

Question: This may sound silly, as there are no null values in a SQL composite primary key, but I just want to confirm whether we can have them in CQL3. We have a table like this to store wide rows: CREATE TABLE keyspace12.colFamily1 ( id text, colname text, colvalue blob, PRIMARY KEY (id,colname, colvalue) ) WITH COMPACT STORAGE And we have some cases where colname is null. Can I do that? If yes, then how? If no, then what are the ways to store wide column rows where we can have some null in first