Question
I followed every step in the Apache Nutch Wiki. I am using Mac OS X 10.8.3, my JAVA_HOME is set correctly, and running bin/nutch on its own prints the list of command options, as the wiki describes.
But when I run bin/nutch crawl urls -dir crawl -depth 3 -topN 5, I get the following error:
bin/nutch: line 104: [: too many arguments
Error: Could not find or load main class Engines
FYI: I have already created a urls directory at apache-nutch-1.6/urls.
Can anyone tell me what the problem might be?
Answer 1:
You can try the following.
First of all, build Nutch with ant. Then:
cd nutch-1.x.x/runtime/local/
mkdir urls (the seed list directory)
mkdir crawl (the directory for the -dir option)
vim urls/seed, then add one or more URLs, one per line (e.g. http://www.examplesite.com)
bin/nutch crawl urls
or: bin/nutch crawl urls -dir crawl -depth 3 -topN 5
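The directory and seed-file steps above can also be done without opening an editor. The following is a minimal sketch, assuming a POSIX shell; prepare_crawl_dirs is a hypothetical helper name and is not part of Nutch, and the seed URL is the placeholder from the answer.

```shell
#!/bin/sh
# Sketch: create the seed and output directories and write the seed list.
# prepare_crawl_dirs is a hypothetical helper, not something Nutch ships.
prepare_crawl_dirs() {
  base_dir="$1"   # e.g. nutch-1.x.x/runtime/local
  shift
  mkdir -p "$base_dir/urls" "$base_dir/crawl" || return 1
  # Start from an empty seed file, then write one URL per line.
  : > "$base_dir/urls/seed"
  for url in "$@"; do
    printf '%s\n' "$url" >> "$base_dir/urls/seed"
  done
}

# Example, run from the directory that contains nutch-1.x.x:
#   prepare_crawl_dirs nutch-1.x.x/runtime/local http://www.examplesite.com
#   cd nutch-1.x.x/runtime/local && bin/nutch crawl urls -dir crawl -depth 3 -topN 5
```

The paths are quoted throughout so that the helper also works when the Nutch directory sits under a path containing spaces.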
Answer 2:
After some research I found that I had forgotten to set NUTCH_JAVA_HOME. Here is the fix:
export NUTCH_JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
I also reset JAVA_HOME:
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home
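Before re-running bin/nutch, it may help to confirm that the two variables actually point at a usable Java installation. This is a rough sketch, assuming a POSIX shell; check_java_home is a hypothetical helper, not part of Nutch, and it only checks that an executable bin/java exists under the given directory.

```shell
#!/bin/sh
# Sketch: report whether a directory looks like a usable Java home.
# check_java_home is a hypothetical helper, not something Nutch ships.
check_java_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "unset"
    return 1
  fi
  # Quote the path so that directories containing spaces are handled.
  if [ -x "$dir/bin/java" ]; then
    echo "ok"
    return 0
  fi
  echo "no java executable under $dir/bin"
  return 1
}

# Example:
#   check_java_home "$NUTCH_JAVA_HOME"
#   check_java_home "$JAVA_HOME"
```

If either call prints something other than ok, the corresponding variable is probably empty or points at the wrong directory in the current shell session.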
Source: https://stackoverflow.com/questions/16521582/apache-nutch-command-unable-to-execute