Creating a Hive table with ~40K columns


Question


I'm trying to create a fairly large table in Hive: roughly 3 million rows and roughly 40K columns. To start, I create an empty table and then insert the data into it.

However, I hit an error when trying this.

Unable to acquire IMPLICIT, SHARED lock default after 100 attempts. FAILED: Error in acquiring locks: Locks on the underlying objects cannot be acquire. retry after some time
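For what it's worth, the lock error comes from Hive's lock manager rather than from the table width itself. One way to inspect the blocking lock, and a session-level workaround if concurrent access is not a concern, might look like the sketch below (assuming a ZooKeeper-based lock manager and Hive 0.13+):

-- List current locks to see what is blocking the DDL
SHOW LOCKS;
-- Newer Hive versions can also scope the listing to one database
SHOW LOCKS DATABASE default;
-- If concurrency is not needed for this workload, disabling the lock
-- manager for the session sidesteps the acquisition retries entirely
SET hive.support.concurrency=false;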

The query is pretty straightforward:

create external table database.dataset (
var1 decimal(10,2),
var2 decimal(10,2),
...
var40000 decimal(10,2)
) location 'hdfs://nameservice1/root/user1/project1';
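For reference, a minimal sketch of the load step that follows the create, assuming the data already sits in a staging table whose columns line up with the target (dataset_staging is a hypothetical name):

-- Hypothetical load: copy rows from a staging table with matching columns
insert into table database.dataset
select * from database.dataset_staging;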

Has anybody seen this error before? Cloudera says there is no limit on the number of columns, but I'm clearly hitting some system limit here.

Additionally, I can create a smaller Hive table in the specified location.


Answer 1:


I ran across this blog post, which appears to identify and fix the problem: http://gbif.blogspot.com/2014/03/lots-of-columns-with-hive-and-hbase.html

Short answer: the Hive metastore stores a table's serde parameters in a column of limited width, and a very wide table can exceed it. You can lift the limit by widening that column. Note that this statement is run directly against the metastore database (PostgreSQL syntax shown), not through Hive:

alter table "SERDE_PARAMS" alter column "PARAM_VALUE" type text;

This is untested, as I ended up going with a different tool to handle the data (for the problem above) since Hive was failing for reasons I couldn't pin down. If you come across something similar, please try this out and post an update.



Source: https://stackoverflow.com/questions/30809749/creating-a-hive-table-with-40k-columns
