How to do unit testing of custom RecordReader and InputFormat classes?

Posted by 我是研究僧i on 2019-12-19 09:37:45

Question


I have developed a MapReduce program with custom RecordReader and InputFormat classes.

I am using MRUnit and Mockito to unit test the mapper and reducer.

How can I unit test the custom RecordReader and InputFormat classes? What is the preferred way to test them?


Answer 1:


Thanks to user7610.

This is a compiled and somewhat tested version of the example code from the other answer:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
import org.apache.hadoop.util.ReflectionUtils;
import java.io.File;

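// Use the local file system instead of HDFS.
// ("fs.default.name" is the older key; Hadoop 2+ names it "fs.defaultFS".)
Configuration conf = new Configuration(false);
conf.set("fs.default.name", "file:///");

// A single split that covers the whole test file.
File testFile = new File("path/to/file");
Path path = new Path(testFile.getAbsoluteFile().toURI());
FileSplit split = new FileSplit(path, 0, testFile.length(), null);

// Instantiate the custom input format and ask it for a reader.
InputFormat inputFormat = ReflectionUtils.newInstance(MyInputFormat.class, conf);
TaskAttemptContext context = new TaskAttemptContextImpl(conf, new TaskAttemptID());
RecordReader reader = inputFormat.createRecordReader(split, context);

// The framework normally calls initialize(); in a test it must be called by hand.
reader.initialize(split, context);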
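The snippet above reads from a hard-coded "path/to/file". One way to avoid depending on a checked-in file is to generate a small fixture with known contents at test time. The following is a minimal sketch in plain Java; `TestFixtures` and `writeRecords` are hypothetical names chosen for illustration, and the one-record-per-line layout is an assumption that a real custom format would replace with its own encoding.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class TestFixtures {
    /**
     * Writes the given records to a new temporary file, one per line,
     * each terminated by '\n' so the byte length is the same on every platform.
     * (Hypothetical helper; the line-based layout is an assumption.)
     */
    static File writeRecords(List<String> records) {
        try {
            Path file = Files.createTempFile("recordreader-test", ".txt");
            file.toFile().deleteOnExit(); // clean up when the JVM exits
            String content = String.join("\n", records) + "\n";
            Files.write(file, content.getBytes(StandardCharsets.UTF_8));
            return file.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File testFile = writeRecords(Arrays.asList("alpha", "beta", "gamma"));
        System.out.println(testFile.getAbsolutePath() + " (" + testFile.length() + " bytes)");
    }
}
```

The returned `File` can stand in for `new File("path/to/file")`, and the list passed to `writeRecords` doubles as the expected output when checking what the reader returns.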



Answer 2:


You'll need a test file to be available (I'm assuming your input format extends FileInputFormat). Once you have one, configure a Configuration object to use the LocalFileSystem (set fs.default.name or fs.defaultFS to file:///). Finally, define a FileSplit with the path, offset and length of the file (or of the part of the file you want to read).

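// DISCLAIMER: neither compiled nor tested
Configuration conf = new Configuration(false);
conf.set("fs.default.name", "file:///");

// Split covering the whole file; FileSplit expects a Path
File testFile = new File("path/to/file");
FileSplit split = new FileSplit(
       new Path(testFile.getAbsoluteFile().toURI()), 0,
       testFile.length(), null);

MyInputFormat inputFormat = ReflectionUtils.newInstance(MyInputFormat.class, conf);
// On Hadoop 2+, TaskAttemptContext is an interface: use TaskAttemptContextImpl there
TaskAttemptContext context = new TaskAttemptContext(conf, new TaskAttemptID());
RecordReader reader = inputFormat.createRecordReader(split, context);
// The reader must be initialized before any records are read
reader.initialize(split, context);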

Now you can assert that the records returned by the reader match what you expect. You should also test (if your file format supports it) changing the offset and length of the split, and reading a compressed version of the file.
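The two extra checks suggested above both need some test scaffolding: a set of (offset, length) ranges to build splits from, and a compressed copy of the fixture. The following is a rough sketch in plain Java, with no Hadoop dependency; `SplitTestHelpers`, `splitRanges` and `gzipCopy` are hypothetical names, and gzip is just one codec picked for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

public class SplitTestHelpers {
    /**
     * Returns {offset, length} pairs that cover [0, fileLength)
     * in consecutive chunks of at most splitSize bytes.
     */
    static List<long[]> splitRanges(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            ranges.add(new long[] {offset, Math.min(splitSize, fileLength - offset)});
        }
        return ranges;
    }

    /** Writes a gzip-compressed copy of source next to it, with ".gz" appended to the name. */
    static Path gzipCopy(Path source) {
        Path target = Paths.get(source.toString() + ".gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            Files.copy(source, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        // A 17-byte file cut into splits of at most 5 bytes.
        for (long[] range : splitRanges(17, 5)) {
            System.out.println("offset=" + range[0] + " length=" + range[1]);
        }
    }
}
```

Each range can be passed to `new FileSplit(path, range[0], range[1], null)`. If the reader follows the usual convention that every record belongs to exactly one split, the records collected over all the splits should equal those read from a single whole-file split. For the compressed copy, note that gzip is not splittable, so a `.gz` file is normally read as one split, and whether the file is decompressed at all depends on how the custom reader opens its input (Hadoop's stock readers pick a codec from the file extension).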



Source: https://stackoverflow.com/questions/20371953/how-to-do-unit-testing-of-custom-recordreader-and-inputformat-classes
