datastax-enterprise

DataStax Enterprise: node vs instance, correct AMI image, why do I need storage

Submitted by 筅森魡賤 on 2019-12-12 01:09:58
Question: We are currently evaluating DataStax Enterprise as our Cassandra and Spark provider, and we are considering deploying a DataStax cluster on AWS. I have the following questions: 1) In step 1 of the DataStax-on-EC2 installation manual, I need to choose the correct AMI image; there are currently 7 of them. Which is the correct one (DataStax Auto-Clustering AMI 2.5.1-pv, DataStax Auto-Clustering AMI 2.6.3-1204-pv, DataStax Auto-Clustering AMI 2.6.3-1404-pv, ...)? 2) The moment we launch the cluster, do we pay only for …
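The AMI names mostly encode the guest OS and virtualization type: 1204/1404 correspond to Ubuntu 12.04/14.04, and the pv/hvm suffix is the EC2 virtualization type, so the choice is driven by your instance type rather than by DSE itself. For reference, here is a minimal, hypothetical boto3 sketch of a launch; the AMI id, key pair, and security group are placeholders, and the user-data flags (--clustername, --totalnodes, --version) follow the Auto-Clustering AMI documentation of that era, so verify them against the manual.

    # Hypothetical launch of the DataStax Auto-Clustering AMI via boto3.
    # Placeholders: AMI id, key pair, security group. The user-data string is
    # what the Auto-Clustering AMI parsed to bootstrap the cluster.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",            # the AMI you picked for your region/virtualization type
        InstanceType="m3.large",
        MinCount=3,
        MaxCount=3,
        KeyName="my-keypair",
        SecurityGroupIds=["sg-xxxxxxxx"],
        UserData="--clustername mydse --totalnodes 3 --version enterprise",
    )
    print([i["InstanceId"] for i in resp["Instances"]])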

Installing DSE 3.1 dependency

Submitted by 柔情痞子 on 2019-12-11 20:27:31
Question: When I run ' sudo apt-get install dse-full ', I get dependency/configuration issues; the full output is listed below. I had a previous version of DSE and OpsCenter installed, and I manually deleted the config files located in /etc/dse earlier, which is probably the root cause of my issue. I am relatively new to Linux. Does anybody know what I can do, or where I can look, to resolve these issues? Any help would be greatly appreciated. Thanks, NJF

Reading package lists... Done
Building dependency …
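Since the package's config files in /etc/dse were deleted by hand, the usual way out is to let apt repair itself and then purge the half-removed packages so their configuration is regenerated on reinstall. A possible recovery sequence is sketched below; it assumes Ubuntu/Debian packaging, and the exact DSE package names vary by version.

    # Hypothetical recovery sequence for a half-removed DSE install (Ubuntu).
    sudo apt-get -f install          # let apt finish/repair any broken dependencies first
    sudo apt-get purge '^dse.*'      # anchored regex: purge all dse* packages and their config
    sudo apt-get autoremove
    sudo apt-get update
    sudo apt-get install dse-full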

MaximumRetryException when reading data off Cassandra using multiget

Submitted by 大兔子大兔子 on 2019-12-11 19:36:38
Question: I am inserting time-series data, with the timestamp (T) as the column name, into wide rows that each store 24 hours' worth of data. Streaming data is written from a data generator (4 instances, each with 256 threads) inserting into multiple rows in parallel.

CF2 (wide column family):
RowKey1 (T1, V1) (T2, V3) (T4, V4) ......
RowKey2 (T1, V1) (T3, V3) .....
:
:

I am now attempting to read this data off Cassandra using multiget. The client is written in Python and uses pycassa. When …
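pycassa raises MaximumRetryException after a request has failed or timed out max_retries times, and with 24 hours of samples per row a large multiget is an easy way to hit that. A minimal sketch of the usual mitigation, assuming placeholder keyspace/column-family names: fetch the keys in small chunks (multiget's buffer_size already splits one call into sub-requests, but smaller key batches also shrink each server-side read).

    # Hypothetical sketch: read wide rows in small batches to avoid
    # MaximumRetryException from oversized multiget requests.
    import pycassa

    pool = pycassa.ConnectionPool('MyKeyspace', server_list=['127.0.0.1:9160'])
    cf = pycassa.ColumnFamily(pool, 'CF2')

    def chunked_multiget(cf, keys, chunk=64):
        """Fetch `keys` in groups of `chunk` rows per request."""
        out = {}
        for i in range(0, len(keys), chunk):
            out.update(cf.multiget(keys[i:i + chunk], buffer_size=chunk))
        return out

    rows = chunked_multiget(cf, ['RowKey1', 'RowKey2'])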

"Executing a LOGGED BATCH" warning in Cassandra logs

Submitted by 自闭症网瘾萝莉.ら on 2019-12-11 18:26:23
Question: Our Java application does batch inserts into one of our tables. That table's schema is something like:

CREATE TABLE "My_KeySpace"."my_table" (
    key text,
    column1 varint,
    column2 bigint,
    column3 text,
    column4 boolean,
    value blob,
    PRIMARY KEY (key, column1, column2, column3, column4)
) WITH CLUSTERING ORDER BY (column1 DESC, column2 DESC, column3 ASC, column4 ASC)
    AND COMPACT STORAGE
    AND bloom_filter_fp_chance = 0.1
    AND comment = ''
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0 …
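Cassandra warns about logged batches because they go through the batchlog, which buys atomicity at a real performance cost, especially when the batch spans many partitions. If the inserts don't need cross-partition atomicity, the usual fix is an unlogged batch grouped by partition key. The poster's app is Java, but the idea is identical in this minimal sketch with the DataStax Python driver; the contact point and bound values are placeholders.

    # Hypothetical sketch: UNLOGGED batch, grouped by partition key, to avoid
    # the batchlog penalty behind the "Executing a LOGGED BATCH" warning.
    from cassandra.cluster import Cluster
    from cassandra.query import BatchStatement, BatchType

    session = Cluster(['127.0.0.1']).connect()
    insert = session.prepare(
        'INSERT INTO "My_KeySpace"."my_table" '
        '(key, column1, column2, column3, column4, value) VALUES (?, ?, ?, ?, ?, ?)')

    batch = BatchStatement(batch_type=BatchType.UNLOGGED)
    # Keep every statement in the batch on the SAME partition key ("key").
    batch.add(insert, ('k1', 1, 1, 'c3', True, b'\x01'))
    batch.add(insert, ('k1', 2, 2, 'c3', False, b'\x02'))
    session.execute(batch)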

Connecting to Cassandra nodes in a DataStax cluster on EC2 from Ruby on Rails

Submitted by 天大地大妈咪最大 on 2019-12-11 17:46:25
Question: I created a DataStax Enterprise cluster with 2 Cassandra nodes, 2 Search nodes and 2 Analytics nodes. Everything seems to work correctly EXCEPT that I can't connect to it from outside. If I'm on the node0 server I can run cassandra-cli and connect to the Cassandra nodes on port 9160, but when I try to connect using the datastax-rails gem I get "No live servers". I also tried DataStax DevCenter, which connects to the native port 9042, but that didn't work either. I'm really puzzled; any help …
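"Works locally, fails remotely" usually points at two things: the EC2 security group must open the client ports (9160 for Thrift clients like datastax-rails, 9042 for native-protocol tools like DevCenter), and each node's cassandra.yaml must advertise an address the client can actually reach (rpc_address, and typically broadcast_rpc_address set to the public IP when rpc_address is 0.0.0.0). Here is a minimal sketch to verify reachability from outside, using the DataStax Python driver with a placeholder public IP:

    # Hypothetical connectivity check over the native port from outside EC2.
    from cassandra.cluster import Cluster

    cluster = Cluster(['203.0.113.10'], port=9042)  # node's public/broadcast address
    session = cluster.connect()
    print(session.execute('SELECT release_version FROM system.local').one())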

Timeouts and ReadRepairStage error messages

Submitted by 纵饮孤独 on 2019-12-11 16:12:12
Question: We are using Apache Cassandra 3.11.4. Recently we have been seeing the overloaded-ReadRepairStage ERROR messages below across the entire cluster, and because of that we are getting timeouts. I'm not able to find the root cause of this; I'd appreciate any inputs on this issue.

ERROR [ReadRepairStage:2537] 2019-07-18 17:08:15,119 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:2537,5,main]
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses.
    at org.apache …
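In 3.11 this stack trace means blocking (foreground) read repair timed out waiting on a replica, and the stage backs up when digest mismatches are frequent. A common mitigation, assuming you can identify the hot table, is to turn off the probabilistic read-repair chances so mismatches are reconciled by regular anti-entropy repair instead; a minimal sketch via the Python driver, with placeholder keyspace/table names:

    # Hypothetical mitigation: disable probabilistic read repair on a hot table.
    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect()
    session.execute(
        'ALTER TABLE my_keyspace.my_table '
        'WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0')

If you do this, make sure nodetool repair runs regularly, since you are removing one of the mechanisms that converges replicas.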

How will data be consistent on a Cassandra cluster?

Submitted by 允我心安 on 2019-12-11 14:45:42
Question: I have a doubt from reading the DataStax documentation about Cassandra write consistency. My question is how Cassandra maintains a consistent state in the following scenario:

write consistency level = QUORUM
replication factor = 3

Per the docs, when a write occurs the coordinator node sends the write request to all replicas in the cluster. If one replica succeeds and the others fail, the coordinator sends an error response back to the client, but node-1 has successfully written the data, and that …
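The key point behind this scenario is that Cassandra never rolls back the replica that did apply the write: a failed QUORUM write can still be partially durable, and the cluster converges later through hinted handoff, read repair, and anti-entropy repair. The standard client-side pattern is to retry idempotent writes, as in this minimal sketch (names are placeholders):

    # Hypothetical sketch: a timed-out QUORUM write may still be applied on one
    # replica; retrying an idempotent INSERT is the usual client-side answer.
    from cassandra import ConsistencyLevel, WriteTimeout
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['127.0.0.1']).connect('my_keyspace')
    stmt = SimpleStatement(
        "INSERT INTO t (id, val) VALUES (1, 'x')",
        consistency_level=ConsistencyLevel.QUORUM)
    try:
        session.execute(stmt)
    except WriteTimeout:
        session.execute(stmt)  # same cells, same values: safe to retry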

Has anyone tried to use Shark/Spark on DataStax Enterprise?

Submitted by 拥有回忆 on 2019-12-11 14:33:28
Question: I've been trying to achieve this without success. I tried to use the Hive distribution included with DSE together with Shark; however, Shark ships with a patched, older version of Hive (0.9, I believe), which makes running Shark impossible due to incompatibilities. I also tried using Shark's patched Hive version instead of DSE's, recycling the DSE Hive configuration (in order to make CFS available to Shark's Hive distribution), only to discover a long list of dependencies from the …

Performance degradation with Datastax Cassandra when using multiple map types in a table

Submitted by 拟墨画扇 on 2019-12-11 13:49:32
Question: I have the following table with five map-type collections. The maximum number of elements in a collection is 12 and the maximum item size is 50 bytes.

CREATE TABLE persons (
    treeid int,
    personid bigint,
    birthdate text,
    birthplace text,
    clientnote text,
    clientnoteisprivate boolean,
    confidence int,
    connections map<int, bigint>,
    createddate timestamp,
    deathdate text,
    deathplace text,
    familyrelations map<text, text>,
    flags int,
    gender text,
    givenname text,
    identifiers map<int, text>, …
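Each entry of a non-frozen collection is stored as its own cell, so a row with five maps of up to 12 entries each can carry roughly 60 extra cells, which multiplies the work on every read and write. If the maps are always read and written whole, one hedged option is to freeze them so each map serializes into a single cell (at the cost of rewriting the whole map on update). A sketch with illustrative columns; this is not the poster's full schema, and the primary key here is an assumption:

    # Hypothetical alternative schema: frozen maps are stored as one cell each.
    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect('my_keyspace')
    session.execute('''
        CREATE TABLE IF NOT EXISTS persons_frozen (
            treeid int,
            personid bigint,
            connections frozen<map<int, bigint>>,
            familyrelations frozen<map<text, text>>,
            identifiers frozen<map<int, text>>,
            PRIMARY KEY (treeid, personid)    -- assumed key, for illustration only
        )''')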

Iterating a GraphTraversal with GraphFrame causes UnsupportedOperationException: Row to Vertex conversion

Submitted by 北慕城南 on 2019-12-11 12:08:01
Question: The following

GraphTraversal<Row, Edge> traversal = gf().E().hasLabel("foo").limit(5);
while (traversal.hasNext()) {}

causes the following exception:

java.lang.UnsupportedOperationException: Row to Vertex conversion is not supported: Use .df().collect() instead of the iterator
    at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator$lzycompute(DseGraphTraversal.scala:92)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator(DseGraphTraversal.scala:78)
    at com…
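The exception message itself points at the workaround: a DseGraphFrame traversal running on Spark cannot convert Row results into TinkerPop Edge/Vertex objects through the iterator, so materialize the traversal as a DataFrame instead. A minimal sketch in the question's own Java, assuming gf() is the poster's DseGraphFrame accessor and that the traversal exposes df() as the message suggests (in Java this may require a cast to DseGraphTraversal):

    // Hypothetical fix: collect the traversal as DataFrame rows instead of
    // iterating Edge objects. gf() comes from the poster's code.
    import java.util.List;
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;

    Dataset<Row> edges = gf().E().hasLabel("foo").limit(5).df();
    for (Row row : edges.collectAsList()) {
        System.out.println(row);   // inspect edge rows on the driver
    }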