datastax | 易学教程

Performance degradation with Datastax Cassandra when using multiple map types in a table

Question: I have the following table with five map type collections. The maximum number of elements in a collection is 12 and the maximum size of an item is 50 bytes.

    CREATE TABLE persons (
        treeid int,
        personid bigint,
        birthdate text,
        birthplace text,
        clientnote text,
        clientnoteisprivate boolean,
        confidence int,
        connections map<int, bigint>,
        createddate timestamp,
        deathdate text,
        deathplace text,
        familyrelations map<text, text>,
        flags int,
        gender text,
        givenname text,
        identifiers map<int, text>,

Iterating a GraphTraversal with GraphFrame causes UnsupportedOperationException Row to Vertex conversion

Question: The following code

    GraphTraversal<Row, Edge> traversal = gf().E().hasLabel("foo").limit(5);
    while (traversal.hasNext()) {}

causes the following exception:

    java.lang.UnsupportedOperationException: Row to Vertex conversion is not supported: Use .df().collect() instead of the iterator
        at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator$lzycompute(DseGraphTraversal.scala:92)
        at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator(DseGraphTraversal.scala:78)
        at com

Pig script to read Cassandra table

Question: I am trying to write a Pig script that will extract data from a Cassandra table. The Pig script looks like this:

    REGISTER ./cassandra-all-2.0.8.39.jar
    REGISTER ./datastax-agent-4.1.4-standalone.jar
    REGISTER ./cassandra-driver-core-2.0.2.1.jar
    REGISTER ./apache-cassandra-thrift-2.0.12.jar
    A = LOAD 'cql://username:password/mykeyspace/mycolumnfamily' USING org.apache.cassandra.hadoop.pig.CqlStorage() AS (user_id:long, fname:chararray, last_update_date:chararray, lname:chararray);
    DUMP A;

I keep

How can I optimize a Cassandra queue-like column family?

Question: I have a queue-like column family that is updated frequently, roughly every hour. After a couple of hours or a day, Cassandra has a lot of read timeouts. I have tried the following but haven't seen results yet: gc_grace_seconds = 0 and LeveledCompaction. Would you recommend DateTieredCompactionStrategy instead, or is there a better strategy than these two? If I cannot solve this, I am thinking of switching to another database; do you think that is necessary? Thanks for your replies. Answer 1: What

Datastax Devcenter 1.1 fail to start

Question: I'm using 64-bit Windows 8. It was fine yesterday, and then it just failed to start. It shows the loading screen, but it just stops right there. Does anyone have the same problem? Any fix? It has happened to me twice on my old PC, which was running 32-bit Windows 7. Answer 1: While I haven't heard of this issue before, here's something that will hopefully fix it: find a directory called .devcenter in your user directory (that should be \Users\<youruser>), then move this directory to a different location
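
The suggestion in the answer (move the .devcenter directory out of the user directory) could be scripted roughly as follows. This is only a sketch: the helper name move_devcenter_dir, the backup folder name devcenter-backup, and the choice of Python are assumptions made for illustration and do not come from the original answer.

```python
import shutil
from pathlib import Path
from typing import Optional


def move_devcenter_dir(home: Path, backup_root: Path) -> Optional[Path]:
    """Move <home>/.devcenter into backup_root.

    Returns the new location, or None if no .devcenter directory exists.
    The name "devcenter-backup" is an arbitrary, hypothetical choice.
    """
    source = home / ".devcenter"
    if not source.is_dir():
        return None  # nothing to move

    backup_root.mkdir(parents=True, exist_ok=True)
    target = backup_root / "devcenter-backup"
    if target.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"{target} already exists; pick another location")

    shutil.move(str(source), str(target))
    return target
```

It might be called as move_devcenter_dir(Path.home(), Path.home() / "old-settings") while DevCenter is closed. Moving the directory rather than deleting it keeps the old settings available in case they need to be restored.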

Random failure of creating a New Cassandra Cluster using OpsCenter

Question: OpsCenter version: 5.1.0; DSE version: 4.6.0. Creating a brand new cluster directly from OpsCenter gives us the following error. It occasionally works with the same settings, but 95% of the time it fails with the same error. OpsCenter is running on its own box but shares the same security groups as the cluster instances. For good measure, I have opened up all TCP ports to all IPs. The following is the stack trace of the error from opscenterd.log: *2015-03-19 10:06:12+0000 [] INFO:

Cassandra - Batch too large

Question: I have a list of Products that have to be added to a Purchase Order. The Purchase Order has a sequence number, and once the Products are added, their status should be changed to indicate that they are out for purchase. The typical number of Products processed in one Purchase Order is 500. In the DB I have two tables: one for Products and another for Purchase Orders. This means I need 500 updates and 1 insert. When I try to do this in a BatchStatement I get the error -
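
The excerpt stops before any answer, so the following is not taken from the thread. It is a minimal sketch of one common way to avoid a single very large batch: split the 500 updates into smaller groups and submit each group separately. The chunk size of 50, the helper name chunked, and the placeholder strings standing in for real statements are all assumptions for illustration. Note that separate groups are no longer applied as one unit, which may or may not be acceptable for this use case.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder stand-ins for the 500 product updates (hypothetical values,
# not real CQL statements).
updates = [f"update-product-{i}" for i in range(500)]

# Assumed chunk size; a suitable value depends on statement size and on the
# batch size thresholds configured on the cluster.
groups = list(chunked(updates, 50))
```

Each element of groups would then be sent as its own, smaller batch (or as individual statements) by whatever driver is in use.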

Datastax - Cassandra php-driver: Uncaught Cassandra\Exception\LogicException:

Question: I'm on Ubuntu right now, trying to connect to Cassandra with PHP. I have installed the Datastax php-driver and all of its dependencies, but I get this error when trying to run a test file:

    PHP Fatal error: Uncaught Cassandra\Exception\LogicException: Not implemented in /home/philip/Documents/test.php:3
    Stack trace:
    #0 /home/user/Documents/test.php(3): Cassandra\Cluster\Builder->build()
    #1 {main}
      thrown in /home/user/Documents/test.php on line 3

The code looks like this:

    <?php // Connect to the

Datastax Cassandra PIG Running only one MAP

Question: I am using Datastax Cassandra 3.1.4 with two nodes. I am running Pig with CqlStorage() against a table with 12 million rows, but I find there is only one map task running for a simple Pig command. I tried changing split_size in my Pig relation, but it didn't work. Here is my sample query:

    x = load'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
    y = limit x 500;
    dump y

I didn't find the input.split.size property in my mapred-site.xml; I am assuming the default split size is 64*1024. I tried set pig

What is the maximum number of tables I can create in a given Cassandra cluster? Is there a limit?

Question: Let's say I have a 6-node cluster of m4.2xl instances (~8 CPUs, 32 GB RAM). How many tables can I create at most across keyspaces, and is there a limit on the number of tables for a given keyspace? I'd highly appreciate your response! Answer 1: There could be performance degradation when you have too many tables in the cluster. For every table you need to allocate additional memory, etc., regardless of whether anybody writes to it or not. From the DataStax documentation: The table thresholds have additional dependencies on JVM