Apache Nutch Command Unable to Execute

Submitted by 回眸只為那壹抹淺笑 on 2019-12-23 02:57:51

Question


I followed every step in the Apache Nutch wiki. I am using Mac OS X 10.8.3, my JAVA_HOME is set correctly, and running bin/nutch prints the list of command options, as the wiki describes.

But when I run bin/nutch crawl urls -dir crawl -depth 3 -topN 5, I get the following error:

bin/nutch: line 104: [: too many arguments
Error: Could not find or load main class Engines

FYI: I have already created the urls directory at apache-nutch-1.6/urls

Can anyone tell me what the problem might be?


Answer 1:


You can try the following:

First of all, build Nutch with ant.

cd nutch-1.x.x/runtime/local/

mkdir urls (the seed list directory)

mkdir crawl (the directory for the -dir option)

vim urls/seed, then add one or more URLs (e.g. http://www.examplesite.com)

bin/nutch crawl urls --or-- bin/nutch crawl urls -dir crawl -depth 3 -topN 5
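
The steps above can be collected into a small script. This is only a sketch: NUTCH_RUNTIME and prepare_crawl are hypothetical names introduced here, the seed file is written with printf instead of being edited in vim, and the crawl is skipped when no bin/nutch launcher is found in that directory.

```shell
#!/bin/sh
# Sketch of the steps above. NUTCH_RUNTIME is a hypothetical variable;
# point it at your own nutch-1.x.x/runtime/local directory.
NUTCH_RUNTIME="${NUTCH_RUNTIME:-.}"

# prepare_crawl is a hypothetical helper:
#   $1 = runtime directory, $2 = seed URL
prepare_crawl() {
    # Create the seed list directory and the directory used by -dir.
    mkdir -p "$1/urls" "$1/crawl" || return 1
    # Write one seed URL per line into urls/seed.
    printf '%s\n' "$2" > "$1/urls/seed"
}

prepare_crawl "$NUTCH_RUNTIME" "http://www.examplesite.com"

# Run the crawl only if the nutch launcher is actually present.
if [ -x "$NUTCH_RUNTIME/bin/nutch" ]; then
    (cd "$NUTCH_RUNTIME" && bin/nutch crawl urls -dir crawl -depth 3 -topN 5)
fi
```

Running the script from inside runtime/local (or with NUTCH_RUNTIME set to that path) should leave a urls/seed file and a crawl directory in place before the crawl command starts.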




Answer 2:


After some research, I found that I had forgotten to set NUTCH_JAVA_HOME. Here is the fix:

export NUTCH_JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home

And yes, I reset JAVA_HOME as well:

export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
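
As a sketch of how both variables might be set together in a Bourne-style shell (sh, bash), assuming the JDK path quoted in this answer applies to your machine. JDK_DIR is a hypothetical helper variable introduced here, and the check at the end is an optional addition, not part of the original fix.

```shell
#!/bin/sh
# Path taken from the answer above; adjust it to your own JDK location.
# JDK_DIR is a hypothetical helper variable used only in this sketch.
JDK_DIR="/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"

# In sh/bash, `export NAME=value` assigns and exports in one step.
# (`set NAME=value` would only set the positional parameter $1.)
export JAVA_HOME="$JDK_DIR"
export NUTCH_JAVA_HOME="$JDK_DIR"

# Optional sanity check: warn if no java binary is found at that location.
if [ -x "$NUTCH_JAVA_HOME/bin/java" ]; then
    echo "java found under $NUTCH_JAVA_HOME"
else
    echo "warning: no java binary under $NUTCH_JAVA_HOME" >&2
fi
```

To make the settings persist across terminal sessions, the two export lines can be placed in your shell startup file.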


Source: https://stackoverflow.com/questions/16521582/apache-nutch-command-unable-to-execute
