Apache Nutch error: Injector: java.io.IOException: (null) entry in command string: null chmod 0644

Posted by  ̄綄美尐妖づ on 2019-12-24 09:58:28

Question


I am using Apache Nutch 1.14 on Windows 10 with Java 1.8. I followed the steps described at https://wiki.apache.org/nutch/NutchTutorial.

When I try to inject the URLs into the crawldb by running this command in Cygwin: bin/nutch inject crawl/crawldb urls

I get the following error:

Injector: java.io.IOException: (null) entry in command string: null chmod 0644 E:\apache-nutch-1.4\runtime\local\crawl\crawldb.locked
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)

I checked the logs and found this:

2018-01-18 10:55:26,785 ERROR util.Shell - Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

I have searched for this error on several pages, but none of them helped.


Answer 1:


  1. Create a new directory in Windows, e.g. c:\winutil.
  2. Inside c:\winutil, create a bin directory.
  3. Open https://minhaskamal.github.io/DownGit/#/home
  4. Paste https://github.com/steveloughran/winutils/tree/master/hadoop-2.8.1 into that site and download the winutils build for hadoop-2.8.1 as a zip.
  5. Extract the zip contents into c:\winutil\bin.
  6. Add a HADOOP_HOME system environment variable and point it to c:\winutil.
  7. Re-run your crawl command in Cygwin.
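
Before re-running the crawl, it can help to confirm that the layout from the steps above is what Hadoop expects. The log in the question shows it looking for null\bin\winutils.exe, which suggests the lookup is <HADOOP_HOME>\bin\winutils.exe with HADOOP_HOME unset. Below is a minimal sketch of such a check for the Cygwin shell. The function name check_winutils is a hypothetical helper, not part of Nutch or Hadoop, and it assumes the directory is passed in POSIX form (for example /cygdrive/c/winutil).

```shell
#!/usr/bin/env bash
# Sketch: verify that a candidate HADOOP_HOME directory contains
# bin/winutils.exe, the file the error log says could not be located.

# check_winutils is a hypothetical helper name (an assumption for this example).
# Argument: the directory in POSIX form, e.g. /cygdrive/c/winutil.
check_winutils() {
  local home_dir="$1"

  # An empty argument corresponds to HADOOP_HOME not being set at all.
  if [ -z "$home_dir" ]; then
    echo "HADOOP_HOME is empty or unset" >&2
    return 1
  fi

  # The binary must sit directly under <home_dir>/bin.
  if [ -f "$home_dir/bin/winutils.exe" ]; then
    echo "found: $home_dir/bin/winutils.exe"
    return 0
  fi

  echo "missing: $home_dir/bin/winutils.exe" >&2
  return 1
}
```

In a newly opened Cygwin terminal (so that the new system variable is picked up), it could be called as check_winutils "$(cygpath -u "$HADOOP_HOME")". The cygpath -u call converts the Windows-style value to a POSIX path that the shell test understands. If it reports the file as missing, the zip was probably extracted one level too deep or too shallow.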


Source: https://stackoverflow.com/questions/48314451/apache-nutch-error-injector-java-io-ioexception-null-entry-in-command-strin
