datastax-enterprise

DSE Solr nodes and vnodes

Submitted by 白昼怎懂夜的黑 on 2019-12-20 03:13:10
Question: The following documentation pages say that it is not recommended to use vnodes for Solr/Hadoop nodes:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchIntro.html
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/deploy/deployConfigRep.html#configReplication
What is the exact problem with using vnodes for these node types? I inherited a DSE setup in which the Search nodes all use vnodes, and I wonder if I should take down …
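The DSE documentation of that era treats Solr and analytics data centers as single-token (non-vnode) deployments. As a minimal sketch of what that means in cassandra.yaml for one such node (the token value below is illustrative; real tokens would be calculated for your ring):

# vnodes disabled: exactly one token per node, instead of num_tokens: 256
num_tokens: 1
initial_token: -9223372036854775808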

How do I see SOLR dynamic fields in CQL with Cassandra?

Submitted by 不羁的心 on 2019-12-20 03:08:25
Question: Solr dynamic fields appear as searchable in Solr and are available through the Thrift interface, but when using CQL the fields don't appear. Is there a specific search or query style that can be used to expose what the dynamic fields are and their values?
Answer 1: Dynamic fields should work through CQL3 as well, with a few caveats. You need to declare the type as a map (e.g. a dyn_ map) and create the CQL schema. Post your schema with the dynamic type declared. The dynamic part isn't inferred inside …
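A minimal sketch of the kind of CQL schema the answer describes, assuming a dynamic-field prefix of dyn_; the keyspace, table, and other column names are illustrative, not taken from the original post:

CREATE TABLE my_keyspace.documents (
    id text PRIMARY KEY,
    title text,
    -- map column backing the Solr dynamic fields that share the dyn_ prefix
    dyn_ map<text, text>
);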

Why not enable virtual nodes on a Hadoop node?

Submitted by 邮差的信 on 2019-12-20 02:54:08
Question: The page http://www.datastax.com/docs/datastax_enterprise3.2/solutions/about_hadoop says: "Before starting an analytics/Hadoop node on a production cluster or data center, it is important to disable the virtual node configuration." What will happen if I enable virtual nodes on an analytics/Hadoop node?
Answer 1: If you enable virtual nodes on a Hadoop node, it will lower the performance of small Hadoop jobs by raising the number of mappers to at least the number of virtual nodes. For example, if you use the default 256 …
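To make the truncated example concrete (the numbers here are an illustration, not the original answer's): with the default num_tokens: 256 per node, a 6-node analytics data center has 6 × 256 = 1536 virtual nodes, so even a tiny Hadoop job is split into at least 1536 map tasks, most of which do almost no work yet still pay the per-task startup cost.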

Pig & Cassandra & DataStax Splits Control

Submitted by 两盒软妹~` on 2019-12-19 09:08:14
Question: I have been using Pig with my Cassandra data to do all kinds of amazing feats of grouping that would be almost impossible to write imperatively. I am using DataStax's integration of Hadoop and Cassandra, and I have to say it is quite impressive. Hats off to those guys! I have a pretty small sandbox cluster (2 nodes) where I am putting this system through some tests. I have a CQL table with ~53M rows (about 350 bytes each), and I notice that the Mapper phase takes a very long time to grind through …
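One knob worth knowing about here, as a hedged sketch: CqlStorage accepts a split_size parameter in its location URI that sets how many rows go into each input split, which in turn drives how many map tasks the job gets. The keyspace and table names below are illustrative:

-- larger splits mean fewer, longer-running mappers over the same ~53M rows
rows = LOAD 'cql://my_keyspace/my_table?split_size=50000' USING CqlStorage();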

Starting cassandra as a service does not work for 2.0.5, sudo cassandra -f works

Submitted by ぃ、小莉子 on 2019-12-19 04:19:29
Question: When I try to start Cassandra on Ubuntu 12.04 (installed via DataStax's dsc20 package) as a service, as follows: $ sudo service cassandra start it says *could not access pidfile for Cassandra* and there are no other messages or anything in the logs. But when I run it as root (sudo cassandra -f) it works properly and Cassandra starts. While trying to debug, I found that when running as a non-root user I was getting these messages: ERROR 17:48:08,432 Exception encountered during startup …
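One common cause (an assumption here, since the post is truncated) is that a previous sudo cassandra -f run left the pid, data, and log directories owned by root, so the cassandra user that the init script switches to cannot write its pidfile. A hedged sketch of the check and fix:

# see who owns the directories the service user needs
ls -ld /var/run/cassandra /var/lib/cassandra /var/log/cassandra
# hand them back to the cassandra user if root owns them
sudo chown -R cassandra:cassandra /var/run/cassandra /var/lib/cassandra /var/log/cassandra
sudo service cassandra start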

how to return subgraph from gremlin that is in an easily consumable format for Java

Submitted by 此生再无相见时 on 2019-12-18 09:48:23
Question: I get very frustrated by very simple things when I try to do a single traversal that brings back a lot of data from DSE Graph 5.0 at once using Gremlin. In my simplified case I have: one entity with a specific uuid; the entity can have zero (hence the optional) or more types; and I need to return both the entity and its types. What I have so far works, but it is very ugly: List list = g.V().hasLabel("Entity").has("uuid","6708ec6d-4518-4159-9005-9e9d642f157e").as("entity").optional(outE("IsOfType").as("types")) …
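A hedged sketch of a shape that is usually easier to consume from Java: project() returns one Map per matched entity, holding the entity and its (possibly empty) list of types. The label and property names come from the question, but the step chain itself is an assumption, not the accepted answer (it also assumes the usual static import of __ from org.apache.tinkerpop.gremlin.process.traversal.dsl.graph):

Map<String, Object> result = g.V().hasLabel("Entity")
        .has("uuid", "6708ec6d-4518-4159-9005-9e9d642f157e")
        .project("entity", "types")                      // one map per matched vertex
        .by(__.valueMap(true))                           // the entity's own properties
        .by(__.out("IsOfType").valueMap(true).fold())    // empty list when there are no types
        .next();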

How to load Spark Cassandra Connector in the shell?

Submitted by 落花浮王杯 on 2019-12-17 17:30:20
Question: I am trying to use the Spark Cassandra Connector with Spark 1.1.0. I have successfully built the jar file from the master branch on GitHub and have gotten the included demos to work. However, when I try to load the jar into spark-shell I cannot import any of the classes from the com.datastax.spark.connector package. I have tried using the --jars option of spark-shell and adding the directory containing the jar to Java's CLASSPATH. Neither option works. In fact, when I use the - …
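For what it's worth, a hedged sketch of the usual way to load the connector into the shell: pass the assembly (fat) jar produced by sbt assembly rather than the plain package jar, since only the assembly bundles the connector's own dependencies. The path below is illustrative:

bin/spark-shell --jars /path/to/spark-cassandra-connector-assembly-1.1.0-SNAPSHOT.jar

scala> import com.datastax.spark.connector._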

how to download dse.jar

Submitted by 走远了吗. on 2019-12-14 03:42:57
Question: I am trying to use DataStax Enterprise 4.6 to write a Spark application in Java and run it in DSE's Spark analytics mode. The code that creates a Spark context using DseSparkConfHelper: SparkConf conf = DseSparkConfHelper.enrichSparkConf(new SparkConf()).setAppName("My application"); To use DseSparkConfHelper we need to import com.datastax.bdp.spark.DseSparkConfHelper, which is located in dse.jar. In my pom.xml I have included the dependency: <dependency> <groupId>com.datastax</groupId> <artifactId …
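dse.jar is not published to the public Maven repositories; it ships inside the DSE installation itself (the exact path depends on the install type). A hedged sketch of one common workaround, installing the local copy into your Maven repository so the pom can reference it; the file path, groupId, artifactId, and version below are illustrative choices, not official coordinates:

mvn install:install-file \
  -Dfile=/usr/share/dse/dse.jar \
  -DgroupId=com.datastax \
  -DartifactId=dse \
  -Dversion=4.6 \
  -Dpackaging=jar

After that, the pom can declare a normal dependency on com.datastax:dse:4.6.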

Cassandra SOLR Rolling Upgrade

Submitted by 家住魔仙堡 on 2019-12-14 03:13:54
Question: We have a cluster of 12 nodes, 6 DSE-Solr and 6 DSE-Cassandra. When upgrading from 3.0 to 3.1 we noticed that requests through the Solr interface were broken until all nodes had been upgraded. Is this limitation still present when upgrading from 3.1 to 3.2? Are there any gotchas to note when making the upgrade? The upgrade path docs say to enable the old gossip protocol until all nodes have been upgraded; is that per DC or for the entire cluster?
Answer 1: Russ, what errors are you getting …

Remote Spark Job fails: No assemblies found

Submitted by 落爺英雄遲暮 on 2019-12-13 17:22:51
Question: I am running a Spark job against a vanilla Spark (DataStax) cluster with this configuration:

val conf: SparkConf = new SparkConf()
  .setAppName("Fuzzy")
  .setMaster("spark://127.0.0.1:7077")
  .set("spark.cores.max", "2")
  .setJars(Seq("my-jar"))
val sc: SparkContext = SparkContext.getOrCreate(conf)

val NUM_SAMPLES: Int = 500
val join = sc.parallelize(1 to NUM_SAMPLES).filter { _ =>
  val x = math.random
  val y = math.random
  x*x + y*y < 1
}.count()
println(s"Pi is roughly ${4.0 * join / NUM_SAMPLES}")

This is a Spark …
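As a hedged aside on the snippet above: setJars expects paths or URLs to real jar files that the master and workers can fetch, so the "my-jar" placeholder would normally be the application's own assembly jar (plus any extra dependencies). The path in this sketch is an assumption, not taken from the post:

val conf: SparkConf = new SparkConf()
  .setAppName("Fuzzy")
  .setMaster("spark://127.0.0.1:7077")
  .set("spark.cores.max", "2")
  // real path to the fat jar produced by `sbt assembly`, shipped to every executor
  .setJars(Seq("/path/to/target/scala-2.10/fuzzy-assembly-0.1.0.jar"))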