cassandra-2.0

Copy one table to another in Cassandra

谁说胖子不能爱 posted on 2019-11-29 15:58:11
I want to copy data from standardevents to standardeventstemp. These are the steps I am following:

    COPY events.standardevents (uuid, data, name, time, tracker, type, userid) TO 'temp.csv';
    truncate standardevents;
    COPY event.standardeventstemp (uuid, data, name, time, tracker, type, userid) FROM 'temp.csv';

After the third step I get the following error:

    Bad Request: Invalid STRING constant (3a1ccec0-ef77-11e3-9e56-22000ae3163a) for name of type uuid aborting import at column #0, previously inserted values are still present.

Can anybody explain the cause of this error and how I can resolve this? Datatype of

Range query on secondary index in Cassandra

徘徊边缘 posted on 2019-11-29 15:01:02
Question: I am using Cassandra 2.1.10. First, let me be clear that I know secondary indexes are an anti-pattern in Cassandra, but for testing purposes I was trying the following: CREATE TABLE test_topology1.tt ( a text PRIMARY KEY, b timestamp ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io

How does Cassandra select the node to send a request to?

我是研究僧i posted on 2019-11-29 14:38:28
Question: Imagine a Cassandra cluster that needs to be accessed by a client application. In the Java API we create a Cluster instance and send read or write requests via a Session. If we use read/write consistency ONE, how does the API select the actual node (the coordinator node) to forward the request to? Is it a random selection? Please help me figure this out.

Answer 1: Cassandra drivers use the "gossip" protocol (and a process called node discovery) to gain information about the cluster. If a node becomes
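To make the "is it a random selection?" question concrete, here is a minimal Python sketch of a round-robin selection policy, one simple way a client could rotate the coordinator role across the hosts it has discovered. It is an illustration only: the class `RoundRobinPolicy`, its methods, and the host names are hypothetical and are not the driver's API. Real drivers ship configurable load-balancing policies and may also take token ownership and data-center locality into account.

```python
from itertools import count


class RoundRobinPolicy:
    """Hypothetical policy: rotate the first-choice coordinator across known hosts."""

    def __init__(self, hosts):
        self._hosts = list(hosts)
        self._counter = count()

    def query_plan(self):
        """Return the hosts to try, in order, for a single request."""
        if not self._hosts:
            return []
        start = next(self._counter) % len(self._hosts)
        # The first host in the plan acts as coordinator; the rest are fallbacks.
        return self._hosts[start:] + self._hosts[:start]

    def on_down(self, host):
        """Stop routing to a host that node discovery reports as down."""
        if host in self._hosts:
            self._hosts.remove(host)


policy = RoundRobinPolicy(["host-001", "host-002", "host-003"])
plans = [policy.query_plan() for _ in range(4)]
coordinators = [plan[0] for plan in plans]
```

Under this policy the choice is not random: successive requests simply cycle through the known hosts, and a host marked as down drops out of later query plans.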

Cassandra Java driver: how many contact points is reasonable?

爱⌒轻易说出口 posted on 2019-11-28 21:26:08
In Java I connect to a Cassandra cluster like this: Cluster cluster = Cluster.builder().addContactPoints("host-001","host-002").build(); Do I need to specify all hosts of the cluster there? What if I have a cluster of 1000 nodes? Do I randomly choose a few? How many, and should I really choose them at random?

I would say that configuring your client to use the same list of nodes as the list of seed nodes you configured Cassandra to use will give you the best results. As you know, Cassandra nodes use the seed nodes to find each other and discover the topology of the ring. The driver will use only one of the

How do I know if nodetool repair is finished?

此生再无相见时 posted on 2019-11-28 19:33:19
I have a 2-node Apache Cassandra (2.0.3) cluster with a replication factor of 1. I changed the replication factor to 2 using the following command in cqlsh: ALTER KEYSPACE "mykeyspace" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 }; I then tried to run the recommended "nodetool repair" after this type of alter. The problem is that this command sometimes finishes very quickly. When it finishes like that, it normally says 'Lost notification...' and the exit code is not zero. So I just repeat 'nodetool repair' until it finishes without error. I also check that 'nodetool status'

How does Cassandra find the node that contains the data?

北慕城南 posted on 2019-11-28 16:46:23
Question: I've read quite a few articles and a lot of questions and answers on SO about Cassandra, but I still can't figure out how Cassandra decides which node(s) to go to when it's reading the data. First, some assumptions about an imaginary cluster:

- Replication Strategy = simple
- Using Random Partitioner
- Cluster of 10 nodes
- Replication Factor of 5

Here's my understanding of how writes work based on various Datastax articles and other blog posts I've read:

- Client sends the data to a random node
- The "random"
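The imaginary cluster described in this question (simple replication strategy, Random Partitioner, 10 nodes, replication factor 5) can be sketched in a few lines of Python. This is a simplified model rather than Cassandra's code: `token_for`, `replicas_for`, the node names, the evenly spaced tokens, and the sample key are all assumptions made for illustration, and the MD5 value is reduced modulo the ring size as a stand-in for the partitioner's own token arithmetic. The idea it shows is that any node acting as coordinator can compute the replicas from the key alone, the same way for reads and writes.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(partition_key: str) -> int:
    """Map a partition key to a token via MD5 (simplified)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(partition_key, ring, replication_factor):
    """Pick replicas by walking the ring clockwise from the key's token.

    ring is a list of (token, node) pairs sorted by token; each node owns
    the range that ends at its own token.
    """
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect_left(tokens, token_for(partition_key)) % len(ring)
    wanted = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(wanted)]


# Hypothetical 10-node ring with evenly spaced tokens.
ring = [(i * RING_SIZE // 10, f"node-{i}") for i in range(10)]
replicas = replicas_for("user:42", ring, replication_factor=5)
```

Because the lookup is a pure function of the key and the ring layout, no directory of "where the data lives" is needed: a read hashes the key, finds the same five nodes a write would have used, and contacts as many of them as the consistency level requires.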

Insert speed in MySQL vs Cassandra

强颜欢笑 posted on 2019-11-28 14:19:17
I have a lot of structured data (about 1 million records per second) that must be inserted into a database. I have seen many benchmarks of SQL vs. NoSQL and of the different types of NoSQL, and I chose Cassandra as the database. However, I created a benchmark to test MySQL vs. Cassandra on write/update/select speed, and MySQL performs better in my benchmark. I want to know what my mistake is. PHP is used as the programming language, YACassandraPDO and cataloniaframework are used as the PHP driver, and PDO is used as the MySQL driver. My server is CentOS 6.5 with a 2-core CPU and 2 GB of RAM; MySQL and Cassandra have their default configurations. Details of the benchmark: cassandra

Use of ORDER BY clause in Cassandra

浪子不回头ぞ posted on 2019-11-28 13:41:42
When creating a table in Cassandra, we can give the clustering keys an ordering like below. Create table user(partitionkey int, id int, name varchar, age int, address text, insrt_ts timestamp, Primary key(partitionkey, name, insrt_ts, id)) with clustering order by (name asc, insrt_ts desc, id asc); When we insert data into that table, according to the Cassandra documentation, records are sorted based on the clustering keys. When I retrieve records with CQL1 and CQL2, I get them in the same sorted order. CQL1: Select * from user where partitionkey=101; CQL2: Select * from user where partitionkey=101 order by
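The clustering order declared above (name asc, insrt_ts desc, id asc) can be mimicked in plain Python to show why a query that restricts only the partition key already returns rows sorted. This is a sketch under stated assumptions: the sample rows and the helper `clustering_sort` are made up for illustration, integers stand in for timestamps, and the code models only the ordering, not Cassandra's storage engine.

```python
from operator import itemgetter

# Made-up rows for a single partition (partitionkey = 101).
rows = [
    {"partitionkey": 101, "name": "bob", "insrt_ts": 10, "id": 2},
    {"partitionkey": 101, "name": "alice", "insrt_ts": 20, "id": 1},
    {"partitionkey": 101, "name": "alice", "insrt_ts": 30, "id": 5},
    {"partitionkey": 101, "name": "alice", "insrt_ts": 30, "id": 3},
]


def clustering_sort(partition_rows):
    """Order rows by (name asc, insrt_ts desc, id asc).

    Python's sort is stable, so sorting by the least significant key first
    and the most significant key last yields the combined ordering.
    """
    ordered = sorted(partition_rows, key=itemgetter("id"))
    ordered = sorted(ordered, key=itemgetter("insrt_ts"), reverse=True)
    ordered = sorted(ordered, key=itemgetter("name"))
    return ordered


stored = clustering_sort(rows)
summary = [(r["name"], r["insrt_ts"], r["id"]) for r in stored]
```

In this model, reading the partition back in stored order and reading it with an explicit ordering that restates the declared clustering order give the same sequence, which matches the observation that CQL1 and CQL2 return the same sorted result.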

How do atomic batches work in Cassandra?

↘锁芯ラ posted on 2019-11-28 08:21:14
How can atomic batches guarantee that either all statements in a single batch will be executed or none?

In order to understand how batches work under the hood, it's helpful to look at the individual stages of the batch execution.

The client

Batches are supported using CQL3 or modern Cassandra client APIs. In each case you'll be able to specify a list of statements you want to execute as part of the batch, a consistency level to be used for all statements, and an optional timestamp. You'll be able to batch execute INSERT, DELETE and UPDATE statements. If you choose not to provide a timestamp, the

Querying Cassandra by a partial partition key

谁说我不能喝 posted on 2019-11-28 07:45:56
In Cassandra, I can create a composite partition key, separate from my clustering key: CREATE TABLE footable ( column1 text, column2 text, column3 text, column4 text, PRIMARY KEY ((column1, column2)) ) As I understand it, querying by partition key is an extremely efficient (the most efficient?) method for retrieving data. What I don't know, however, is whether it's also efficient to query by only part of a composite partition key. In MSSQL, this would be efficient, as long as components are included starting with the first (column1 instead of column2, in this example). Is this also the case in