Write output to multiple tables from REDUCER

Submitted by ぐ巨炮叔叔 on 2019-12-11 13:22:18

Question


Can I write output to multiple tables in HBase from my reducer? I went through different blog posts, but I am not able to find a way, even using MultiTableOutputFormat.

I referred to this: Write to multiple tables in HBASE

But I am not able to figure out the API signature for the context.write call.

Reducer code:

public class MyReducer extends TableReducer<Text, Result, Put> {

    private static final Logger logger = Logger.getLogger( MyReducer.class );

    @SuppressWarnings( "deprecation" )
    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context ) throws IOException, InterruptedException {
        logger.info( "Working on ---> " + key.toString() );
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            KeyValue[] raw = res.raw();
            for ( KeyValue kv : raw ) {
                put.add( kv );
            }

            context.write( obj, put ); // I don't know how to give the table name here.

        }
    }
}

Answer 1:


To direct each write to a particular table, pass the table name as the key in the context.write(key, put) call:

ImmutableBytesWritable key = new ImmutableBytesWritable(Bytes.toBytes("tableName"));
context.write(key, put);
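
For reference, here is a minimal sketch of how the reducer and the driver could be wired together with MultiTableOutputFormat. The table names "tableA" and "tableB", and the choice to send every row to both tables, are placeholders for illustration only, not part of the original question:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class MyMultiTableReducer extends TableReducer<Text, Result, ImmutableBytesWritable> {

    // Placeholder table names -- replace with the tables your job actually targets.
    private static final ImmutableBytesWritable TABLE_A =
            new ImmutableBytesWritable( Bytes.toBytes( "tableA" ) );
    private static final ImmutableBytesWritable TABLE_B =
            new ImmutableBytesWritable( Bytes.toBytes( "tableB" ) );

    @Override
    protected void reduce( Text key, Iterable<Result> data, Context context )
            throws IOException, InterruptedException {
        for ( Result res : data ) {
            Put put = new Put( res.getRow() );
            for ( Cell cell : res.rawCells() ) {
                put.add( cell );
            }
            // The output key carries the destination table name;
            // MultiTableOutputFormat routes each Put to the table named by its key.
            context.write( TABLE_A, put );
            context.write( TABLE_B, put );
        }
    }

    // Driver-side wiring (mapper/input configuration elided).
    public static Job createJob( Configuration conf ) throws IOException {
        Job job = Job.getInstance( conf, "write-to-multiple-tables" );
        job.setJarByClass( MyMultiTableReducer.class );
        job.setReducerClass( MyMultiTableReducer.class );
        job.setOutputFormatClass( MultiTableOutputFormat.class );
        job.setOutputKeyClass( ImmutableBytesWritable.class );
        job.setOutputValueClass( Put.class );
        return job;
    }
}

Note that the reducer's output key type must be ImmutableBytesWritable for this to work, unlike the Put key type declared in the question's reducer.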

If, on the other hand, you want to load a large amount of data at once via a MapReduce job, it may be worth using MultiTableHFileOutputFormat. This output format creates HFiles for every HBase table you write to, and you can then load those files with the LoadIncrementalHFiles tool:

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/multiTableJobResult hbaseTable

You can read more about MultiTableHFileOutputFormat in this article: http://tech.adroll.com/blog/data/2014/07/15/multi-table-bulk-import.html
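
If you take the bulk-load route, the load step can also be driven programmatically, once per table. This is a rough sketch, assuming (as the article above describes) that the job writes one sub-directory of HFiles per table under /tmp/multiTableJobResult; the directory layout and the table names tableA and tableB are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.util.ToolRunner;

public class BulkLoadRunner {
    public static void main( String[] args ) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // One (HFile directory, table name) pair per target table -- placeholder values.
        String[][] loads = {
            { "/tmp/multiTableJobResult/tableA", "tableA" },
            { "/tmp/multiTableJobResult/tableB", "tableB" }
        };
        for ( String[] load : loads ) {
            // Equivalent to: hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <dir> <table>
            int rc = ToolRunner.run( conf, new LoadIncrementalHFiles( conf ), load );
            if ( rc != 0 ) {
                throw new IllegalStateException( "Bulk load failed for table " + load[1] );
            }
        }
    }
}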



Source: https://stackoverflow.com/questions/37436095/write-output-to-multiple-tables-from-reducer
