Question
I am using HDP 2.1 for the cluster. I've encountered the exception below, and the MapReduce jobs fail because of it. We regularly create tables from data ingested by Flume (version 1.4), and I checked the data files the mapper tried to read, but I couldn't find anything wrong with them.
2014-11-28 00:08:28,696 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate
configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2014-11-28 00:08:28,947 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-11-28 00:08:28,947 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2014-11-28 00:08:28,995 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2014-11-28 00:08:29,009 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1417095534232_0051, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@ea23517)
2014-11-28 00:08:29,184 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2014-11-28 00:08:29,735 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /hadoop1/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop2/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop3/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop4/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop5/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051,/hadoop6/hadoop/yarn/local/usercache/xxx/appcache/application_1417095534232_0051
2014-11-28 00:08:31,067 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2014-11-28 00:08:32,806 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2014-11-28 00:08:33,837 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: com.hadoop.mapred.DeprecatedLzoTextInputFormat:hdfs://cluster/apps/hive/external/mapp_log/dt=2014-11-27/mapp_parse_log.1417014001075:402653184+67311787
2014-11-28 00:08:34,196 INFO [main] org.apache.hadoop.hive.ql.log.PerfLogger: <PERFLOG method=deserializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2014-11-28 00:08:34,196 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: Deserializing MapWork via kryo
2014-11-28 00:08:35,222 INFO [main] org.apache.hadoop.hive.ql.log.PerfLogger: </PERFLOG method=deserializePlan start=1417100914196 end=1417100915222 duration=1026 from=org.apache.hadoop.hive.ql.exec.Utilities>
2014-11-28 00:08:35,254 INFO [main] com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2014-11-28 00:08:35,260 INFO [main] com.hadoop.compression.lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev dbd51f0fb61f5347228a7a23fe0765ac1242fcdf]
2014-11-28 00:08:35,498 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1879195946-xx.xx.xx.32-1409281631059:blk_1075462091_1722425; getBlockSize()=202923; corrupt=false; offset=469762048; locs=[xx.xx.xx.36:50010, xx.xx.xx.37:50010]}
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:573)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1879195946-xx.xx.xx.32-1409281631059:blk_1075462091_1722425; getBlockSize()=202923; corrupt=false; offset=469762048; locs=[xx.xx.xx.36:50010, xx.xx.xx.37:50010]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:350)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:294)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1295)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:161)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:239)
... 9 more
2014-11-28 00:08:35,503 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
If you have any ideas or solutions for this problem, please share them.
Thank you, kwangwoo
Answer 1:
Here is a good description of the problem and its cause:
https://community.hortonworks.com/answers/37414/view.html
For us, running the command hdfs debug recoverLease -path <path-of-the-file> -retries 3
solved the problem.
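If you prefer to do the same thing from code instead of the CLI, here is a minimal sketch using the public HDFS Java client API (DistributedFileSystem.recoverLease and isFileClosed); the configuration and the file path are placeholders you would replace with your own, and the polling loop only mirrors the -retries 3 behaviour of the CLI command.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverLeaseSketch {
    public static void main(String[] args) throws Exception {
        // Assumes fs.defaultFS in the loaded configuration points at the HDFS cluster.
        Configuration conf = new Configuration();
        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

        Path stuckFile = new Path(args[0]); // path of the unclosed file (placeholder)

        // Ask the NameNode to recover the lease; returns true once the file is closed.
        boolean closed = dfs.recoverLease(stuckFile);
        for (int i = 0; i < 3 && !closed; i++) {
            Thread.sleep(5000); // give lease recovery some time, then re-check
            closed = dfs.isFileClosed(stuckFile);
        }
        System.out.println(stuckFile + " closed: " + closed);
        dfs.close();
    }
}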
Answer 2:
It is very hard to determine whether a file in an HDFS folder is unclosed or not. You probably have to run an hdfs cat test on each file, or regularly check for lost file blocks (every hour, or after every restart of the cluster).
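As a rough way to automate that periodic check, the sketch below walks a directory with the HDFS Java client and reports files the NameNode still considers open for write; the directory argument is a placeholder (for example the Flume landing directory from the question), and it assumes the default file system is HDFS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class FindUnclosedFilesSketch {
    public static void main(String[] args) throws Exception {
        // Assumes the default file system is HDFS.
        DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(new Configuration());
        Path root = new Path(args[0]); // directory to scan (placeholder)

        // Walk the tree recursively and ask the NameNode whether each file is closed.
        RemoteIterator<LocatedFileStatus> it = dfs.listFiles(root, true);
        while (it.hasNext()) {
            Path file = it.next().getPath();
            if (!dfs.isFileClosed(file)) {
                System.out.println("still open for write: " + file);
            }
        }
        dfs.close();
    }
}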
Answer 3:
I ran into the same issue. Some files were opened by Flume but never closed (I am not sure of the reason). You need to find their names with the command:
hdfs fsck /directory/of/locked/files/ -files -openforwrite
Then just remove them.
Or you can try to recover the files with the command hdfs debug recoverLease -path <path-of-the-file> -retries 3
that Joe23 suggested.
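If you go the removal route and have already collected the paths printed by the fsck -openforwrite command into a text file, a minimal sketch like the one below can delete them in bulk; the input file name is a placeholder, and only do this if the data can be re-ingested.

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteOpenForWriteFilesSketch {
    public static void main(String[] args) throws Exception {
        // One HDFS path per line, collected beforehand from the fsck output (placeholder file name).
        List<String> paths =
                Files.readAllLines(Paths.get("openforwrite-paths.txt"), StandardCharsets.UTF_8);

        FileSystem fs = FileSystem.get(new Configuration());
        for (String p : paths) {
            Path hdfsPath = new Path(p.trim());
            if (fs.exists(hdfsPath)) {
                boolean deleted = fs.delete(hdfsPath, false); // non-recursive delete
                System.out.println((deleted ? "deleted " : "failed to delete ") + hdfsPath);
            }
        }
        fs.close();
    }
}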
Source: https://stackoverflow.com/questions/27181371/java-io-ioexception-cannot-obtain-block-length-for-locatedblock