cassandra-2.0 | 易学教程

copy one table to another in cassandra

I want to copy data from standardevents to standardeventstemp. These are the steps I am following:

COPY events.standardevents (uuid, data, name, time, tracker, type, userid) TO 'temp.csv';
truncate standardevents;
COPY event.standardeventstemp (uuid, data, name, time, tracker, type, userid) FROM 'temp.csv';

After the third step I get the following error:

Bad Request: Invalid STRING constant (3a1ccec0-ef77-11e3-9e56-22000ae3163a) for name of type uuid aborting import at column #0, previously inserted values are still present.

Can anybody explain the cause of this error and how I can resolve this? datatype of

Range query on secondary index in cassandra

Question: I am using Cassandra 2.1.10. First, let me make clear that I know secondary indexes are an anti-pattern in Cassandra, but for testing purposes I was trying the following: CREATE TABLE test_topology1.tt ( a text PRIMARY KEY, b timestamp ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io

How Cassandra select the node to send request?

Question: Imagine a Cassandra cluster that needs to be accessed by a client application. In the Java API we create a Cluster instance and send read or write requests via a Session. If we use read/write consistency ONE, how does the API select the actual node (the coordinator node) to forward the request to? Is it a random selection? Please help me figure this out.

Answer 1: Cassandra drivers use the "gossip" protocol (and a process called node discovery) to gain information about the cluster. If a node becomes

Cassandra Java driver: how many contact points is reasonable?

In Java I connect to a Cassandra cluster like this: Cluster cluster = Cluster.builder().addContactPoints("host-001","host-002").build(); Do I need to specify all hosts of the cluster there? What if I have a cluster of 1000 nodes? Do I randomly choose a few? How many, and do I really choose them randomly?

I would say that configuring your client to use the same list of nodes as the list of seed nodes you configured Cassandra to use will give you the best results. As you know, Cassandra nodes use the seed nodes to find each other and discover the topology of the ring. The driver will use only one of the

how do i know if nodetool repair is finished

I have a 2-node Apache Cassandra (2.0.3) cluster with a replication factor of 1. I changed the replication factor to 2 using the following command in cqlsh: ALTER KEYSPACE "mykeyspace" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 }; I then tried to run the recommended "nodetool repair" after this kind of alter. The problem is that this command sometimes finishes very quickly. When it finishes like that, it normally says 'Lost notification...' and the exit code is not zero. So I just repeat 'nodetool repair' until it finishes without error. I also check that 'nodetool status'

How does cassandra find the node that contains the data?

Question: I've read quite a few articles and a lot of questions and answers on SO about Cassandra, but I still can't figure out how Cassandra decides which node(s) to go to when it's reading the data.

First, some assumptions about an imaginary cluster:

Replication Strategy = simple
Using Random Partitioner
Cluster of 10 nodes
Replication Factor of 5

Here's my understanding of how writes work, based on various Datastax articles and other blog posts I've read:

Client sends the data to a random node
The "random"
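
For the imaginary cluster described in this question (simple replication strategy, 10 nodes, replication factor 5), replica lookup can be sketched roughly as follows. This is a minimal illustration, not Cassandra's actual code: the evenly spaced tokens, the node names, and the helper names `token_for`, `build_ring`, and `replicas_for` are all assumptions made for the example, and the MD5-based token only approximates what the Random Partitioner does.

```python
import hashlib
from bisect import bisect_left

# Assumption: tokens live in [0, 2**127), in the spirit of the Random Partitioner.
RING_SIZE = 2 ** 127


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (MD5-based approximation)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def build_ring(node_names):
    """Give each node one evenly spaced token; returns (token, node) pairs sorted by token."""
    step = RING_SIZE // len(node_names)
    return [(i * step, name) for i, name in enumerate(node_names)]


def replicas_for(partition_key, ring, replication_factor):
    """Simple-strategy placement: the first node whose token is >= the key's
    token holds the first replica, and the following nodes clockwise on the
    ring hold the remaining replicas."""
    tokens = [token for token, _ in ring]
    key_token = token_for(partition_key)
    start = bisect_left(tokens, key_token) % len(ring)  # wrap past the last token
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = build_ring([f"node{i}" for i in range(10)])  # hypothetical node names
    print(replicas_for("some-partition-key", ring, replication_factor=5))
```

Any node that receives the request can run the same calculation, which is why the node the client contacts does not have to own the data: it only has to work out which nodes do and forward the request to them.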

insert speed in mysql vs cassandra

I have a lot of structured data (about 1 million records per second) that must be inserted into a database. I have seen many benchmarks of SQL versus NoSQL and of the different types of NoSQL, and chose Cassandra as the database. However, I created a benchmark to test MySQL versus Cassandra on write/update/select speed, and MySQL performed better in my benchmark. I want to know what my mistake is. PHP is the programming language; YACassandraPDO and cataloniaframework are used as the PHP drivers, and PDO is used as the MySQL driver. My server runs CentOS 6.5 with a 2-core CPU and 2 GB of RAM; MySQL and Cassandra both use their default configuration. Details of the benchmark: cassandra

Use of Order by clause in cassandra

When creating a table in Cassandra, we can give clustering keys with an ordering, like below. Create table user(partitionkey int, id int, name varchar, age int, address text, insrt_ts timestamp, Primary key(partitionkey, name, insrt_ts, id)) with clustering order by (name asc, insrt_ts desc, id asc); When we insert data into that table, as per the Cassandra documentation, records are sorted based on the clustering keys. When I retrieve records with CQL1 and CQL2, I get them in the same sorted order. CQL1: Select * from user where partitionkey=101; CQL2: Select * from user where partitionkey=101 order by
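
The behaviour described here, where rows in a partition come back already sorted, can be modelled with a small sketch. It assumes integer timestamps and invented sample rows, and the names `Row`, `store_partition`, `select_all`, and `select_order_by_name_asc` are hypothetical helpers for illustration only. The second query stands in for any ORDER BY that matches the declared clustering order; it is not a reconstruction of the truncated CQL2 above.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    name: str
    insrt_ts: int  # assumption: timestamps modelled as plain integers
    id: int


def store_partition(rows: List[Row]) -> List[Row]:
    """Keep rows in the declared clustering order:
    name ASC, insrt_ts DESC, id ASC."""
    return sorted(rows, key=lambda r: (r.name, -r.insrt_ts, r.id))


def select_all(stored: List[Row]) -> List[Row]:
    """A read without ORDER BY returns rows in stored order."""
    return list(stored)


def select_order_by_name_asc(stored: List[Row]) -> List[Row]:
    """An ORDER BY on the leading clustering column in its declared
    direction; the stable sort leaves the stored order unchanged."""
    return sorted(stored, key=lambda r: r.name)


# Invented sample data for one partition.
stored = store_partition([
    Row("bob", 10, 1),
    Row("alice", 10, 2),
    Row("alice", 20, 1),
    Row("alice", 10, 1),
])

if __name__ == "__main__":
    for row in select_all(stored):
        print(row)
```

In this model both reads return identical results, because the sort already happened when the rows were written. An explicit ORDER BY only changes the outcome when it asks for the reverse of the declared order.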

How do atomic batches work in Cassandra?

How can atomic batches guarantee that either all statements in a single batch will be executed or none?

In order to understand how batches work under the hood, it's helpful to look at the individual stages of the batch execution.

The client

Batches are supported using CQL3 or modern Cassandra client APIs. In each case you'll be able to specify a list of statements you want to execute as part of the batch, a consistency level to be used for all statements, and an optional timestamp. You can batch-execute INSERT, DELETE and UPDATE statements. If you choose not to provide a timestamp, the

Querying Cassandra by a partial partition key

In Cassandra, I can create a composite partition key, separate from my clustering key: CREATE TABLE footable ( column1 text, column2 text, column3 text, column4 text, PRIMARY KEY ((column1, column2)) ) As I understand it, querying by partition key is an extremely efficient (the most efficient?) method for retrieving data. What I don't know, however, is whether it's also efficient to query by only part of a composite partition key. In MSSQL, this would be efficient, as long as components are included starting with the first (column1 instead of column2, in this example). Is this also the case in
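
One way to reason about this question: Cassandra derives a single token from the whole composite partition key, so knowing only column1 is not enough to compute where a row lives. The sketch below illustrates that idea under simplifying assumptions. `composite_token` is a hypothetical helper, the MD5 hash and the separator-based serialization are stand-ins for the real partitioner and key encoding, and the sample values are invented.

```python
import hashlib


def composite_token(partition_key_parts) -> int:
    """Hash ALL components of a composite partition key into one token.
    Assumption: a NUL-joined string stands in for the real serialization."""
    serialized = "\x00".join(partition_key_parts).encode("utf-8")
    return int.from_bytes(hashlib.md5(serialized).digest(), "big") % (2 ** 127)


# Two rows that share column1 but differ in column2.
token_a = composite_token(("shared-column1", "x"))
token_b = composite_token(("shared-column1", "y"))

if __name__ == "__main__":
    # The tokens are unrelated, so the rows may live on different nodes,
    # and column1 alone does not identify a token to look up.
    print(token_a)
    print(token_b)
```

Unlike a leading-column prefix in a B-tree index, the first component of a hashed key carries no locality, so a lookup on column1 alone would have to consult every token range rather than a single partition.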