Getting error "Failed to create DataStorage" when trying to load MovieLens data from HDFS into Pig

Submitted by 爱⌒轻易说出口 on 2019-12-13 02:26:09

Question


I am trying to load data from HDFS into Pig, but I get the error "Failed to create DataStorage". The command I ran was:

movies = LOAD 'hdfs://localhost:9000/Movie_Lens/ratings' USING PigStorage(':') AS (user_id, dummy1, movie_id, dummy2, movie_rating, dummy3, timestamp);

I searched Stack Overflow for this problem, but the questions I found are not about HDFS and Pig; they are about HDFS and HBase, or Pig and HBase.

The details from the log file are given below.

Somewhere in the log file I found this: Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4

Pig Stack Trace

ERROR 1200: Failed to create DataStorage

Failed to parse: Failed to create DataStorage
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1082)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
    at org.apache.pig.Main.run(Main.java:565)
    at org.apache.pig.Main.main(Main.java:177)
Caused by: java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:53)
    at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:109)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:189)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:538)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    ... 10 more
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy4.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:70)
    ... 23 more
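
The innermost "Caused by" line reports two different IPC versions: the server (the NameNode) answers with version 9, while the Hadoop client libraries loaded by Pig speak version 4. This pattern usually points to a Pig build that bundles older Hadoop client jars than the cluster is running. A minimal sketch for pulling the two numbers out of such a log line is shown here; ipc_versions is a hypothetical helper name, not part of Pig or Hadoop, and the commented commands at the end assume hadoop and pig are on the PATH.

```shell
# Hypothetical helper: print the server and client IPC versions found in
# a log line of the form shown in the stack trace above.
ipc_versions() {
  printf '%s\n' "$1" |
    sed -n 's/.*Server IPC version \([0-9][0-9]*\) cannot communicate with client version \([0-9][0-9]*\).*/server=\1 client=\2/p'
}

line='Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4'
ipc_versions "$line"

# On a real installation, compare what each side reports
# (assumes both tools are installed and on the PATH):
#   hadoop version
#   pig -version
```

If the two versions differ, the Hadoop version that Pig was built against is worth checking before anything else.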

To solve this problem I tried rebuilding with Ant. When I run the command

bash ant -version

in the Ant bin folder, it works, but when I run the command

bash ant clean jar-all -Dhadoopversion=23

in the bin folder, it does not work. Some of the links I found say that newer versions of Pig do not have the jar-all target, so I tried the following command:

bash ant clean jar -Dhadoopversion=23

and this command does not work either.

Source: https://stackoverflow.com/questions/34254404/getting-error-as-failed-to-create-data-storage-when-trying-to-load-the-data-from
