Question
Can I write output to multiple tables in HBase from my reducer? I went through different blog posts but was not able to find a way, even using MultiTableOutputFormat.
I referred to this: Write to multiple tables in HBASE
But I am not able to figure out the API signature for the context.write call.
Reducer code:
public class MyReducer extends TableReducer<Text, Result, Put> {

    private static final Logger logger = Logger.getLogger( MyReducer.class );

    @SuppressWarnings( "deprecation" )
    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context ) throws IOException, InterruptedException {
        logger.info( "Working on ---> " + key.toString() );
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            KeyValue[] raw = res.raw();
            for ( KeyValue kv : raw ) {
                put.add( kv );
            }
            context.write( obj, put ); // I don't know how to give the table name here.
        }
    }
}
Answer 1:
To tell the output format which table each record belongs to, pass the table name as the key in the context.write(key, put) call:
ImmutableBytesWritable key = new ImmutableBytesWritable(Bytes.toBytes("tableName"));
context.write(key, put);
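Putting that together, here is a minimal sketch of a reducer plus driver setup built around MultiTableOutputFormat. The table names tableA and tableB, the class names, and the job name are placeholders for illustration, and the reducer keeps the same deprecated Result.raw()/Put.add(KeyValue) calls used in the question:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MyMultiTableReducer extends TableReducer<Text, Result, ImmutableBytesWritable> {

    // Hypothetical target tables, used only for illustration.
    private static final ImmutableBytesWritable TABLE_A =
            new ImmutableBytesWritable( Bytes.toBytes( "tableA" ) );
    private static final ImmutableBytesWritable TABLE_B =
            new ImmutableBytesWritable( Bytes.toBytes( "tableB" ) );

    @SuppressWarnings( "deprecation" )
    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context )
            throws IOException, InterruptedException {
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            for ( KeyValue kv : res.raw() ) {
                put.add( kv );
            }
            // The key tells MultiTableOutputFormat which table this Put goes to.
            context.write( TABLE_A, put );
            context.write( TABLE_B, put );
        }
    }

    // Driver-side sketch: the key point is using MultiTableOutputFormat instead of
    // binding the job to a single output table.
    public static Job createJob( Configuration conf ) throws IOException {
        Job job = Job.getInstance( HBaseConfiguration.create( conf ), "write-to-multiple-tables" );
        job.setReducerClass( MyMultiTableReducer.class );
        job.setOutputFormatClass( MultiTableOutputFormat.class );
        // Mapper, input format and scan setup omitted here.
        return job;
    }
}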
If you need to load a huge amount of data in a single MapReduce job, it may also be worth looking at MultiTableHFileOutputFormat. This output format creates HFiles for every HBase table you write to, and you can then load those files with the LoadIncrementalHFiles tool:
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/multiTableJobResult hbaseTable
You can read more about MultiTableHFileOutputFormat in this article: http://tech.adroll.com/blog/data/2014/07/15/multi-table-bulk-import.html
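If you would rather trigger the bulk load from Java instead of the shell, a minimal sketch along the lines of the command above could look like the following. It assumes the older HTable client API and simply reuses the /tmp/multiTableJobResult path and hbaseTable name from that command:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadRunner {

    public static void main( String[] args ) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles( conf );
        // Same output directory and table name as in the shell command above.
        HTable table = new HTable( conf, "hbaseTable" );
        try {
            loader.doBulkLoad( new Path( "/tmp/multiTableJobResult" ), table );
        } finally {
            table.close();
        }
    }
}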
Source: https://stackoverflow.com/questions/37436095/write-output-to-multiple-tables-from-reducer