datastax-enterprise

New Solr node in “Active - Joining” state for several days

Submitted by 霸气de小男生 on 2019-12-25 04:19:43
Question: We are trying to add a new Solr node to our cluster:

DC Cassandra
    Cassandra node 1
DC Solr
    Solr node 1 <-- new node (actually a replacement for an old node; we followed the steps for "replacing a dead node")
    Solr node 2
    Solr node 3
    Solr node 4
    Solr node 5

Our Cassandra data is approximately 962 GB and the replication factor is 1 for both DCs. Is it normal for the new node to be in the "Active - Joining" state for several days? Is there a way to know the progress? Last week, there was a time when we had

unconfigured columnfamily error on consecutive execute calls (CQL)

Submitted by 蹲街弑〆低调 on 2019-12-24 17:00:10
Question: I'm using the Cassandra Python driver with DataStax's distribution. My code:

from cassandra.io.libevreactor import LibevConnection
from cassandra.cluster import Cluster

cluster = Cluster(['some ip addr'])
cluster.connection_class = LibevConnection

Fails:

session = cluster.connect('demodb')
session.execute("INSERT INTO colFamName(attr1, attr2) VALUES ('123jkd', 'sdflkj')")
session.execute("SELECT attr1 FROM colFamName")

Passes:

session = cluster.connect('demodb')
session.execute("INSERT INTO
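One common cause of an "unconfigured columnfamily" error on a follow-up statement is relying on the session's current keyspace. A minimal workaround sketch (not the accepted answer), assuming the keyspace demodb and table colFamName from the question already exist, is to qualify every table with its keyspace and bind values as parameters:

from cassandra.cluster import Cluster

# Sketch only: the contact point is a placeholder; keyspace and table names
# are taken from the question. Adjust them for your cluster.
cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

session.execute(
    "INSERT INTO demodb.colFamName (attr1, attr2) VALUES (%s, %s)",
    ('123jkd', 'sdflkj'),
)
for row in session.execute("SELECT attr1 FROM demodb.colFamName"):
    print(row.attr1)

cluster.shutdown()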

Improve speed of Spark app

Submitted by 被刻印的时光 ゝ on 2019-12-24 16:43:29
Question: This is part of my Python Spark code, parts of which run too slowly for my needs. This part in particular is what I would really like to speed up, but I don't know how. It currently takes around 1 minute for 60 million rows and I would like to get it under 10 seconds.

sqlContext.read.format("org.apache.spark.sql.cassandra").options(table="axes", keyspace=source).load()

More context from my Spark app:

article_ids = sqlContext.read.format("org.apache.spark.sql
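A general way to cut down a full-table read like this is to prune columns and filter on the partition key so the Spark Cassandra connector can push work down to Cassandra instead of shipping all 60 million rows to Spark. A minimal sketch, not the asker's actual job: the column names article_id and ts and the keyspace name are assumptions.

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="axes-read")
sqlContext = SQLContext(sc)

axes = (sqlContext.read
        .format("org.apache.spark.sql.cassandra")
        .options(table="axes", keyspace="source_keyspace")  # keyspace name assumed
        .load()
        .select("article_id", "ts")        # read only the columns you need
        .filter("article_id = 123"))       # equality on the partition key can be pushed down

axes.explain()    # verify the filter shows up as a pushed-down predicate
print(axes.count())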

sstableloader does not exit after successful data loading

Submitted by 夙愿已清 on 2019-12-24 13:14:06
Question: I'm trying to bulk-load my data into DSE, but sstableloader doesn't exit after a successful run. According to the output, the progress for each node is already 100%, and the total progress also shows 100%.

Environment: CentOS 6.x x86_64; DSE 4.0.1
Topology: 1 Cassandra node, 5 Solr nodes (DC auto-assigned by DSE); RF 2
System ulimit (hard, soft) on each DSE node: 65536
sstableloader heap size (-Xmx): 10240M (10 GB)
SSTables size: 158 GB (from an 80 GB CSV, 241 million rows)

I tried to take down all nodes -

What is the role of the Bloom filter in Cassandra?

Submitted by ℡╲_俬逩灬. on 2019-12-23 09:04:51
Question: From two different pages of Cassandra's documentation, I found:

link 1: "A structure stored in memory that checks if row data exists in the memtable before accessing SSTables on disk"

link 2: "Cassandra checks the Bloom filter to discover which SSTables are likely to have the requested partition data."

My question is: are both of the above statements correct? If yes, are Bloom filters maintained separately for the memtable and the SSTables? Thanks in advance.

Answer 1: A Bloom filter is a generic data
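To make the idea concrete, here is a toy Python sketch of the probabilistic membership test a Bloom filter provides - not Cassandra's actual implementation - showing why a negative answer lets a read skip an SSTable entirely, while a positive answer only means "possibly present":

import hashlib

class BloomFilter:
    """Toy Bloom filter: a bit array plus several salted hash functions."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions from independently salted hashes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(("%d:%s" % (i, key)).encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means definitely absent; True means possibly present
        # (false positives are possible, false negatives are not).
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("partition-key-42")
print(bf.might_contain("partition-key-42"))   # True
print(bf.might_contain("some-other-key"))     # almost certainly False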

Using ComplexPhraseQueryParser in DataStax Search

Submitted by 允我心安 on 2019-12-23 05:32:11
Question: I want to perform complex searches in DataStax Search. The Solr wiki suggests using a complex phrase query parser for this (https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser). However, the syntax did not work, so it seems I need to plug it in additionally. I am using DataStax Enterprise 4.5. Is there a particular procedure to plug in the parser - maybe put it in a particular location and make specific changes to get it

DataStax Enterprise 4.8.4 on Ubuntu 14.04 LTS install error when using apt repository installation

Submitted by 别等时光非礼了梦想. on 2019-12-23 05:16:12
Question: I used the apt repository installation approach exactly as outlined at docs.datastax.com/en/datastax_enterprise/install/installDEBdse.html. There were no issues with the key. I have Oracle Server JDK 8 (latest as of today) and Python 2.7 from Miniconda (also a fresh install today), using defaults and allowing PATH variables to be prepended. Following installation, there are errors regarding unmet dependencies:

dse-full : Depends: dse (=4.8.4-1) but it is not going to be installed
           Depends: dse

My DataStax Spark doesn't work with my current Python version and I have no idea why

Submitted by 会有一股神秘感。 on 2019-12-23 04:53:28
Question: Below is my error message. When I use Python 2.7 in DataStax Spark with the configuration below, it doesn't work and I don't know why. I would be very grateful for some suggestions. Thanks.

vi /etc/dse/spark/spark-env.sh

export PYTHONHOME=/usr/local
export PYTHONPATH=/usr/local/lib/python2.7
export PYSPARK_PYTHON=/usr/local/bin/python2.7

Error message:

Error from python worker:
/usr/local/bin/python2.7: /usr/local/lib/python2.7/lib-dynload/_io.so: undefined symbol: _PyCodec_LookupTextEncoding
PYTHONPATH was:
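An undefined-symbol error like this usually means the worker's PYTHONHOME/PYTHONPATH point at a standard library that does not match the interpreter named in PYSPARK_PYTHON. A small diagnostic sketch (not from the question, assumed to be submitted with dse spark-submit) that prints which interpreter the driver and the executors actually use:

from __future__ import print_function
import sys
from pyspark import SparkContext

sc = SparkContext(appName="python-version-check")

# Interpreter used by the driver process.
print("driver:", sys.executable, sys.version)

def worker_info(_):
    import sys
    return [(sys.executable, sys.version)]

# Interpreter(s) reported by the executor processes.
workers = (sc.parallelize(range(sc.defaultParallelism), sc.defaultParallelism)
             .mapPartitions(worker_info)
             .collect())
for exe, ver in set(workers):
    print("executor:", exe, ver)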

DataStax Enterprise 5.0 Lifecycle Manager - provisioning fails

Submitted by 南楼画角 on 2019-12-23 03:25:08
Question: I'm trying to install DataStax Enterprise 5.0 using Lifecycle Manager (I've installed OpsCenter), but the install job fails with a message that is not very descriptive and apparently pretty rare (0 hits on Google):

Event Type: error-MeldError
Result/Message: Meld failed on: name="<nodename>" ssh-management-address="<nodeip>" node-id="<someguid>" job-id="<someguid>" stdout="" stderr=""

Stack trace:
lcm.jobs.multinode.common$monitor_command.invoke(common.clj:547)
lcm.jobs.multinode.common$run

Two-node DSE Spark cluster: error setting up second node. Why?

Submitted by 巧了我就是萌 on 2019-12-23 01:52:27
Question: I have a DSE Spark cluster with 2 nodes. One DSE Analytics node with Spark enabled cannot start after I install it; without Spark it starts just fine. On my other node Spark is enabled and it starts and works just fine. Why is that, and how can I solve it? Thanks. Here is my error log:

ERROR [main] 2016-02-27 20:35:43,353 CassandraDaemon.java:294 - Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Cannot start node if snitch's data center (Analytics)