giraph | 易学教程

Convert csv data to graph data

Question: I am experimenting with Apache Giraph. I need to create a simple graph from a CSV file residing in HDFS, which shows a relationship between two columns (victim related to store name). My data is over 1 GB in CSV format. I initially tried Neo4j from Java with a local file, but it could only load small data and cannot import data directly from HDFS. My data may grow, so I thought of using Apache Giraph. But how do I achieve the same? As far as I understand, Apache Giraph only takes input in vertex format
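The conversion the question asks about can be sketched as follows. This is a minimal illustration, not a definitive answer: it assumes the CSV has a header row with columns named `victim` and `store_name` (hypothetical names), and it assumes the Giraph job reads a plain text edge list with integer ids, one `source<TAB>target` pair per line (for example via a text edge input format such as `IntNullTextEdgeInputFormat`; whether that fits the job is an assumption). It runs in memory on a sample; for a file over 1 GB in HDFS the same id mapping would have to be done in a distributed job.

```python
import csv
import io


def csv_to_edge_lines(csv_text, src_col, dst_col):
    """Turn two CSV columns into 'src_id<TAB>dst_id' edge lines.

    Each distinct (column, value) pair gets its own integer id, so a victim
    and a store that happen to share a name stay separate vertices.
    Returns the edge lines and the id mapping.
    """
    ids = {}

    def vertex_id(kind, name):
        # Assign the next free integer the first time a key is seen.
        return ids.setdefault((kind, name), len(ids))

    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        src = vertex_id(src_col, row[src_col].strip())
        dst = vertex_id(dst_col, row[dst_col].strip())
        lines.append(f"{src}\t{dst}")
    return lines, ids


# Hypothetical sample data: two victims related to the same store.
sample = "victim,store_name\nalice,StoreA\nbob,StoreA\n"
edge_lines, id_map = csv_to_edge_lines(sample, "victim", "store_name")
print("\n".join(edge_lines))
```

Keeping the returned id mapping makes it possible to translate the integer ids in the job output back to the original names.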

Giraph's estimated cluster heap 4096MB ask is greater than the current available cluster heap of 0MB. Aborting Job

Question: I'm running Giraph using Hadoop 2.5.2 on a 5-node cluster. But when I try to run the SimpleShortestPathsComputation example, I get this error: Exception in thread "main" java.lang.IllegalStateException: Giraph's estimated cluster heap 2000MB ask is greater than the current available cluster heap of 0MB. Aborting Job. So far I've been unable to determine why Giraph thinks the cluster has a 0MB heap. I've set YARN_HEAPSIZE and HADOOP_HEAPSIZE in yarn-env.sh and hadoop-env.sh respectively, and

Apache Giraph - Cannot run in split master / worker mode since there is only 1 task at a time

Question: I ran Giraph 1.0.0 with Hadoop 2.2.0 using the PageRank Benchmark example here. Suddenly I got this error: Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have only one worker since only 1 task at a time! at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:151) at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:225) at org.apache.giraph.benchmark.GiraphBenchmark.run

Gremlin - Giraph - GraphX ? On TitanDb

Question: I need some help confirming my choice, and I would appreciate any information you can give me. My storage database is TitanDb with Cassandra. I have a very large graph. My goal is to use MLlib on the graph later. My first idea was to use Titan with GraphX, but I did not find anything finished or in development... TinkerPop is not ready yet. So I had a look at Giraph. Titan can communicate with Rexster from TinkerPop. My question is: what are the benefits of using Giraph? Gremlin seems

Giraph's best Vertex Input format for an input file with ids of type String

Question: I have a multinode Giraph cluster working properly on my PC. I executed the SimpleShortestPathExample from Giraph and it ran fine. The algorithm was run with this file (tiny_graph.txt):

[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]
[3,0,[[0,3],[1,1],[4,4]]]
[4,0,[[3,4],[2,4]]]

This file has the following input format: [source_id,source_value,[[dest_id, edge_value],...]]

Now, I'm trying to execute this same algorithm, in this same cluster, but with an input file
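Since each line of tiny_graph.txt is a JSON array, the format described above can be illustrated with a short sketch. The function names `parse_vertex_line` and `format_vertex_line` are hypothetical helpers, not part of Giraph. The sketch only shows that the same line layout can carry string ids as valid JSON; whether a given Giraph vertex input format accepts such lines is a separate question that this code does not answer.

```python
import json


def parse_vertex_line(line):
    """Parse '[source_id,source_value,[[dest_id,edge_value],...]]'."""
    source_id, source_value, edges = json.loads(line)
    return source_id, source_value, [(dest, weight) for dest, weight in edges]


def format_vertex_line(source_id, source_value, edges):
    """Write one vertex in the same layout, without extra whitespace."""
    payload = [source_id, source_value, [list(edge) for edge in edges]]
    return json.dumps(payload, separators=(",", ":"))


# First line of tiny_graph.txt from the question.
print(parse_vertex_line("[0,0,[[1,1],[3,3]]]"))

# The same layout with string ids (hypothetical vertex names).
print(format_vertex_line("a", 0, [("b", 1), ("d", 3)]))
```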

Apache Giraph build error

Question: I got the following error when compiling Giraph. I'm using Ubuntu 16.04 with Java 1.8 and Maven 3.3.9. Details from the mvn -version command follow: Apache Maven 3.3.9 Maven home: /usr/share/maven Java version: 1.8.0_171, vendor: Oracle Corporation Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre I cloned with the following command: git clone http://git-wip-us.apache.org/repos/asf/giraph.git I then tried the following Maven commands, but I always got the same error. Could you please tell me what my error is? 1°

java.io.IOException: ensureRemaining: Only 0 bytes remaining, trying to read 1

Question: I'm having some problems with custom classes in Giraph. I made a VertexInput and Output format, but I always get the following error: java.io.IOException: ensureRemaining: Only * bytes remaining, trying to read * with different values where the "*" are placed. This was tested on a single-node cluster. The problem happens when a vertexIterator calls next() and there are no more vertices left. This iterator is invoked from a flush method, but I don't understand, basically, why the "next

Giraph ZooKeeper port problems

Question: I am trying to run the SimpleShortestPathsVertex (aka SimpleShortestPathComputation) example described in the Giraph Quick Start. I am running this on a Hortonworks Sandbox instance (HDP 2.1) using VirtualBox, and I packaged giraph.jar using profile hadoop_2.0.0. When I try to run the example using hadoop jar giraph.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/hue

Vertices with complex values in Apache Giraph

Question: I am trying to read a text file containing relevant vertex information into Giraph: each line is vertex_id attribute_1 attribute_2 .....attribute_n where each attribute is a string. The goal is to create a vertex where all these attributes are part of the vertex's value. Looking at the various input formats, I could not find anything out of the box, so I assume I have to derive my vertex input class from VertexValueInputFormat (I have a separate reader for edges). Problem is: how? I
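The line layout described above can be illustrated with a small language-neutral sketch of the parsing step only. It assumes fields are separated by whitespace and that attributes contain no embedded spaces (both assumptions, since the excerpt does not say); `parse_vertex_attributes` is a hypothetical helper name. In an actual Giraph job the equivalent logic would live in a custom reader, and the attribute list would be carried in a vertex value type that implements Writable.

```python
def parse_vertex_attributes(line):
    """Split 'vertex_id attribute_1 ... attribute_n' into (id, [attributes])."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line: expected at least a vertex id")
    # The first field is the vertex id; everything after it is the value.
    return fields[0], fields[1:]


# Hypothetical input line with three string attributes.
vertex_id, attributes = parse_vertex_attributes("v1 red large metal")
print(vertex_id, attributes)
```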

ClassNotFoundException running GiraphRunner on a modified SimpleShortestPathsVertex

Question: I'm relatively new to Giraph and I'm trying to get my Giraph edit-compile-deploy loop working for our code. I am able to run various examples inspired by http://blog.cloudera.com/blog/2014/02/how-to-write-and-run-giraph-jobs-on-hadoop/ , but I'm stuck with a ClassNotFoundException when running my modified version of the SimpleShortestPathsVertex Giraph example. I've tried various combinations of -libjars and HADOOP_CLASSPATH, but I'm out of ideas and I'd really appreciate your help. Details