How do I build/run this simple Mahout program without getting exceptions?

旧时模样 submitted on 2019-11-27 04:01:25

Use the "job" JAR file that Mahout provides. It bundles all of Mahout's dependencies, and you need to add your own classes to it as well. This is how all the Mahout examples work. Don't put the Mahout jars in Hadoop's lib directory, since that "installs" a program too deeply into Hadoop.

If you take the example code from the https://github.com/tdunning/MiA repository, it contains a ready-to-use pom.xml file for Maven. When you compile the code with mvn package, it creates mia-0.1-job.jar in the target directory. This archive contains all dependencies except Hadoop's, so you can run it on a Hadoop cluster without problems.
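As a rough sketch of that workflow, the helper below locates the job JAR that mvn package leaves in the target directory so it can be handed to hadoop jar. The function name find_job_jar is made up for this illustration, and the class name and arguments in the usage comment are placeholders, not something taken from the MiA repository.

```shell
# Hypothetical helper: print the first "*-job.jar" found in a build directory.
# Defaults to ./target, which is where Maven writes its artifacts.
find_job_jar() {
    target_dir="${1:-target}"
    for jar in "$target_dir"/*-job.jar; do
        # The glob stays unexpanded if nothing matches, so test for a real file.
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no *-job.jar found in $target_dir; run 'mvn package' first" >&2
    return 1
}

# Typical use from the project root (class name and arguments are placeholders):
#   mvn package
#   hadoop jar "$(find_job_jar)" com.mycompany.mahout.CSVtoVector input.csv output.seq
```

Passing the job JAR to hadoop jar ships the bundled dependencies with the job, so nothing has to be copied into Hadoop's own lib directory.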

caoimhin
<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-math</artifactId>
    <version>0.7</version>
</dependency>

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-collections</artifactId>
    <version>1.0</version>
</dependency>

What I did was set HADOOP_CLASSPATH to include my jar and all the Mahout jar files, as shown below.

export HADOOP_CLASSPATH=/home/xxx/my.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-core-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-examples-0.7-cdh4.3.0-job.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-integration-0.7-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout/mahout-math-0.7-cdh4.3.0.jar

Then I was able to run hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq

So you have to include all of your own jars and the Mahout jars in HADOOP_CLASSPATH, and then you can run your class with
hadoop <classname>
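Listing every jar by hand is error-prone, so here is a minimal sketch that assembles the same kind of colon-separated value from one or more directories. The function name build_classpath is an assumption made for this example, and the paths in the usage comment are simply the ones quoted above. Note that it picks up every .jar in the directory, which may be more than the jars listed by hand.

```shell
# Hypothetical helper: join every .jar in the given directories into a
# colon-separated classpath string (jars appear in shell glob order).
build_classpath() {
    cp=""
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            [ -f "$jar" ] || continue      # skip globs that matched nothing
            cp="${cp:+$cp:}$jar"           # add ":" only between entries
        done
    done
    printf '%s\n' "$cp"
}

# Example (adjust the paths to your own installation):
#   export HADOOP_CLASSPATH="/home/xxx/my.jar:$(build_classpath /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/mahout)"
#   hadoop com.mycompany.mahout.CSVtoVector iris/nb/iris1.csv iris/nb/data/iris.seq
```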
