Copy Data from one hbase table to another

核能气质少年 提交于 2019-12-10 18:17:47

问题


I have created one table hivetest which also create the table in hbase with name of 'hbasetest'. Now I want to copy 'hbasetest' data into another hbase table(say logdata) with the same schema. So, can anyone help me how do copy the data from 'hbasetest' to 'logdata' without using the hive.

CREATE TABLE hivetest(cookie string, timespent string, pageviews string, visit string, logdate string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "m:timespent, m:pageviews, m:visit, m:logdate")
TBLPROPERTIES ("hbase.table.name" = "hbasetest");

Updated question :

I have created the table logdata like this. But, I am getting the following error.

create 'logdata', {NAME => ' m', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS =>'0', TTL => '2147483647', BLOCKSIZE=> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

13/09/23 12:57:19 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family  m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020, 
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:29 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family  m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020, 
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:38 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_2, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family  m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020, 
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:53 INFO mapred.JobClient: Job complete: job_201309231115_0025
13/09/23 12:57:53 INFO mapred.JobClient: Counters: 7
13/09/23 12:57:53 INFO mapred.JobClient:   Job Counters 
13/09/23 12:57:53 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=34605
13/09/23 12:57:53 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/23 12:57:53 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/23 12:57:53 INFO mapred.JobClient:     Rack-local map tasks=4
13/09/23 12:57:53 INFO mapred.JobClient:     Launched map tasks=4
13/09/23 12:57:53 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/09/23 12:57:53 INFO mapred.JobClient:     Failed map tasks=1

回答1:


Use the copyTable command. Example :

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=logdata hbasetest



回答2:


Actually i am using hive-0.9.0. Which has a bug

https://issues.apache.org/jira/browse/HIVE-3243.

So, while creating the table SerDe of HBaseStorageHandler doesn't ignore white space between comma and column family. Hence you need to remove the white spaces. Then it will work fine.



来源:https://stackoverflow.com/questions/18942895/copy-data-from-one-hbase-table-to-another

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!