cassandra-2.0

High and low cardinality in Cassandra

Submitted by 余生颓废 on 2019-12-05 07:08:04
I keep coming across these terms in Cassandra: high cardinality and low cardinality. I don't understand exactly what they mean, what effect they have on queries, and which is preferred. Please explain with an example, since that will be easy to follow.

The cardinality of X is nothing more than the number of elements that compose X. In Cassandra, the cardinality of the partition key is very important for partitioning data. Since the partition key is responsible for the distribution of data across the cluster, choosing a low-cardinality key might lead to a situation in which your data are not evenly distributed across the nodes.
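To make the distinction concrete, here is a minimal CQL sketch with hypothetical tables (users_by_country and users_by_id are not from the original question):

    -- Low-cardinality partition key: 'country' has only a couple of
    -- hundred possible values, so rows pile up on a handful of very
    -- wide partitions and a few nodes become hotspots.
    CREATE TABLE users_by_country (
        country text,
        user_id uuid,
        name text,
        PRIMARY KEY (country, user_id)
    );

    -- High-cardinality partition key: 'user_id' is effectively unique
    -- per row, so data spreads evenly across the cluster, at the cost
    -- of only supporting lookups by a single id.
    CREATE TABLE users_by_id (
        user_id uuid PRIMARY KEY,
        country text,
        name text
    );

The usual guideline is a partition key with enough distinct values to spread load evenly, while still grouping together the rows your queries need to read.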

How to construct a range query in Cassandra?

Submitted by 不打扰是莪最后的温柔 on 2019-12-05 05:34:09
    CREATE TABLE users (
        userID uuid,
        firstname text,
        lastname text,
        state text,
        zip int,
        age int,
        PRIMARY KEY (userID)
    );

I want to construct the following queries:

    select * from users where age between 30 and 40
    select * from users where state in "AZ" AND "WA"

I know I need two more tables to do this query, but I don't know how they should be defined.

EDIT: From Carlo's comments, I see this is the only possibility:

    CREATE TABLE users (
        userID uuid,
        firstname text,
        lastname text,
        state text,
        zip int,
        age int,
        PRIMARY KEY (age, zip, userID)
    );

Now to select users with age between 15 and 30, this is the only
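The text above is cut off, so what follows is only a rough sketch of the usual pattern, not the original answer. In CQL a range predicate such as "age between 30 and 40" is normally only allowed on a clustering column, with the partition key fixed by an equality (or IN) predicate. With PRIMARY KEY (age, zip, userID), age becomes the partition key, and under the default Murmur3 partitioner a partition key cannot be range-queried directly. A common workaround, with hypothetical names, is to move age into a clustering column behind a coarse, application-defined bucket:

    -- Hypothetical table: 'bucket' is an application-defined value
    -- (for example hash(userID) % 10) that keeps the data distributed
    -- while still allowing a range predicate on age.
    CREATE TABLE users_by_age (
        bucket int,
        age int,
        userID uuid,
        firstname text,
        lastname text,
        state text,
        zip int,
        PRIMARY KEY (bucket, age, userID)
    );

    -- Range query on the clustering column, one bucket at a time
    -- (the application merges the results from all buckets):
    SELECT * FROM users_by_age WHERE bucket = 0 AND age >= 30 AND age <= 40;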

What options are there to speed up a full repair in Cassandra?

Submitted by 走远了吗. on 2019-12-05 05:19:21
I have a Cassandra datacenter which I'd like to run a full repair on. The datacenter is used for analytics/batch processing and I'm willing to sacrifice latencies to speed up a full repair (nodetool repair). Writes to the datacenter are moderate. What are my options to make the full repair faster? Some ideas: increase streamthroughput? I guess I could disable autocompaction and decrease compactionthroughput temporarily; not sure I'd want to do that, though... Additional information: I'm running SSDs but haven't spent any time adjusting cassandra.yaml for this. Full repairs are run sequentially by

Cassandra Query Failures: All host(s) tried for query failed (no host was tried)

Submitted by 自闭症网瘾萝莉.ら on 2019-12-05 02:03:49
Question: I am not able to run queries against the Cassandra node. I am able to connect to the cluster, but when executing a query it fails:

    Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
        at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:217)
        at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:44)
        at com.datastax.driver.core.RequestHandler

Atomic Batches in Cassandra

Submitted by こ雲淡風輕ζ on 2019-12-04 23:09:17
Question: What does it mean that batch statements are atomic in Cassandra? The docs are, to be precise, a bit confusing. Does it mean that queries are atomic across the nodes in the cluster? Say, for example, I have a batch with 100 queries. If the 40th query in the batch fails, what happens to the 39 queries already executed in the batch? I understand that there is a batchlog created under the hood and that it will take care of the consistency of partial batches. Does it remove the rest of the 39 entries and provide the
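The question is cut off above, but the gist of logged ("atomic") batches can be shown with a small sketch; the tables here are hypothetical and not from the original post. "Atomic" means all-or-nothing eventually: once the coordinator has written the batch to the batchlog, every statement in it will eventually be applied, and statements that already succeeded are not rolled back. It does not mean isolation across partitions; other clients may observe some of the writes before the rest land.

    -- Hypothetical denormalized tables kept in sync with a logged batch.
    -- Either all of these mutations are eventually applied or none are;
    -- there is no rollback of statements that have already been applied.
    BEGIN BATCH
        INSERT INTO user_by_email (email, user_id) VALUES ('a@example.com', 123);
        INSERT INTO user_by_id (user_id, email) VALUES (123, 'a@example.com');
    APPLY BATCH;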

How to list column families in a keyspace?

Submitted by 女生的网名这么多〃 on 2019-12-04 22:27:09
How can I get a list of all column families in a keyspace in Cassandra using CQL 3?

    cqlsh> select columnfamily_name from system.schema_columnfamilies where keyspace_name = 'test';

     columnfamily_name
    -------------------
     commits
     foo
     has_all_types
     item_by_user
     test
     test2
     user_by_item

    (7 rows)

Or even more simply (if you are using cqlsh), switch over to your keyspace with use and then execute describe tables:

    cqlsh> use products;
    cqlsh:products> describe tables;

    itemmaster  itemhierarchy  companyitemfavorites  testtable

Note: the describe command is specific to cqlsh only.

prayagupd: CQL API supports both
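One caveat that may be worth adding: system.schema_columnfamilies exists on Cassandra 2.x, but from Cassandra 3.0 onward the schema tables moved to the system_schema keyspace, so on a newer cluster the equivalent query would be roughly:

    -- Assuming a Cassandra 3.0+ cluster; on 2.x use system.schema_columnfamilies as above.
    SELECT table_name FROM system_schema.tables WHERE keyspace_name = 'test';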

High number of tombstones with TTL columns in Cassandra

Submitted by 一曲冷凌霜 on 2019-12-04 20:15:23
I have a Cassandra column family, or CQL table, with the following schema:

    CREATE TABLE user_actions (
        company_id varchar,
        employee_id varchar,
        inserted_at timeuuid,
        action_type varchar,
        PRIMARY KEY ((company_id, employee_id), inserted_at)
    ) WITH CLUSTERING ORDER BY (inserted_at DESC);

Basically a composite partition key made up of a company ID and an employee ID, and a clustering column representing the insertion time, used to order the columns in reverse chronological order (the newest actions are at the beginning of the row). Here's what an insert looks like:

    INSERT INTO user
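The insert statement is truncated above; given the question's title it presumably uses a TTL. A hedged sketch of what such an insert typically looks like, with hypothetical values and a hypothetical 30-day TTL:

    -- Hypothetical values; USING TTL 2592000 expires the cells after 30 days.
    -- Each cell that reaches its TTL is treated as a tombstone until
    -- gc_grace_seconds has passed and compaction purges it, which is why
    -- heavy TTL use on wide partitions leads to high tombstone counts on
    -- reads that scan the partition.
    INSERT INTO user_actions (company_id, employee_id, inserted_at, action_type)
    VALUES ('acme', 'emp42', now(), 'login')
    USING TTL 2592000;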

Spring-Data-Cassandra causes XSD validation error using XML configuration

Submitted by 泪湿孤枕 on 2019-12-04 19:37:37
Hello, I have an error that doesn't affect my project's compilation, deployment, or running, but it puts a red mark on my Spring-Data-Cassandra configuration file and also shows up in the Problems view. Can anyone tell me what the issue is? I have seen the same question for spring-data-JPA and Spring-data-*, but the answers did not help, so I am posting this one. Here is the error message: The errors below were detected when validating the file "spring-tool.xsd" via the file "application-config.xml". In most cases these errors can be detected by validating "spring-tool.xsd"

Overwrite a row in Cassandra with INSERT: will it cause a tombstone?

Submitted by 狂风中的少年 on 2019-12-04 18:51:04
Question: Writing data to Cassandra without creating tombstones is vital in our case, due to the amount of data and the speed required. Currently we only write a row once and never need to update it again, only fetch the data. Now there is a case where we actually need to write data and then complete it with more data that is finished after a while. It can be done by either overwriting all of the data in the row again using INSERT (all data is available), or
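The question is cut off above, but the first option can be sketched in CQL; the table and values here are hypothetical and not from the original post. The key point is that re-inserting the same primary key just writes newer versions of the cells, which shadow the old ones and are merged away at compaction time; a tombstone is only created if a column is explicitly written as null (or deleted).

    -- Hypothetical table and values.
    CREATE TABLE measurements (
        id uuid PRIMARY KEY,
        partial_value text,
        final_value text
    );

    -- First write: only the data available so far. Omitting final_value
    -- writes nothing for that cell (no tombstone).
    INSERT INTO measurements (id, partial_value)
    VALUES (11111111-1111-1111-1111-111111111111, 'first pass');

    -- Later, overwrite/complete the row with the full data. As long as no
    -- column is explicitly set to null, this creates no tombstones; the
    -- older cells are simply superseded by the newer writes.
    INSERT INTO measurements (id, partial_value, final_value)
    VALUES (11111111-1111-1111-1111-111111111111, 'first pass', 'completed');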

R and Cassandra connection error

Submitted by 血红的双手。 on 2019-12-04 18:18:46
    library(RJDBC)
    cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
        list.files("/home/beyhan/Downloads/jars/", pattern="jar$", full.names=T))
    casscon <- dbConnect(cassdrv, "jdbc:cassandra://localhost:9042")

Output:

    > cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
    +     list.files("/home/beyhan/Downloads/jars/", pattern="jar$", full.names=T))
    > casscon <- dbConnect(cassdrv, "jdbc:cassandra://localhost:9042")
    Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  :
      java.lang.NoClassDefFoundError: org/apache/thrift/transport/TTransportException