Hive - Thread-safe auto-increment sequence number generation

给你一囗甜甜゛ submitted on 2019-12-23 22:04:48

Question


I have a situation where I need to insert records into a particular Hive table.

One of the columns must be an auto-incremented sequence number (it has to strictly follow the [max.value + 1] rule at all times).

Records are inserted into this table by many parallel Hive jobs that run in batches: daily, weekly, and monthly.

Now, I have these questions:

  1. Will org.apache.hadoop.hive.contrib.udf.UDFRowSequence ( http://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java ) be the right choice?

  2. How can I make it thread-safe, since parallel jobs are also involved in inserting the records?

Note: I came across this useful post ( hive auto increment after certain number ), which I continue to watch. I am raising a fresh question because (1) that question already has an accepted answer and so may no longer attract the community's attention, and (2) my situation additionally requires thread-safe sequence number generation.
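
To make the requirement concrete, here is a minimal sketch of the [max.value + 1] rule with several parallel writers. It is plain Python, not Hive: `SequenceAllocator`, `reserve`, and `simulate` are hypothetical names, the threads stand in for parallel jobs, and the in-process lock stands in for whatever cross-job coordination a real setup would need (separate Hive jobs run in separate processes, so a lock like this one would not be enough on its own).

```python
import threading


class SequenceAllocator:
    """In-memory stand-in for the table's sequence column (hypothetical, not a Hive API)."""

    def __init__(self, current_max=0):
        self._max = current_max
        self._lock = threading.Lock()

    def reserve(self, count):
        """Atomically reserve `count` consecutive numbers starting at max + 1."""
        with self._lock:
            start = self._max + 1
            self._max += count
        return range(start, start + count)


def simulate(num_jobs, rows_per_job, current_max):
    """Run `num_jobs` writers in parallel and return every assigned number, sorted."""
    allocator = SequenceAllocator(current_max)
    assigned = []
    assigned_lock = threading.Lock()

    def job():
        # Each job reserves one contiguous block for its whole batch.
        ids = allocator.reserve(rows_per_job)
        with assigned_lock:
            assigned.extend(ids)

    threads = [threading.Thread(target=job) for _ in range(num_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(assigned)


if __name__ == "__main__":
    ids = simulate(num_jobs=4, rows_per_job=50, current_max=100)
    print(ids[0], ids[-1], len(ids))
```

With four jobs of 50 rows each and a current maximum of 100, the assigned numbers are 101 through 300 with no gaps or duplicates, because reading the maximum and advancing it happen as one atomic step. Without that atomicity, two jobs could read the same maximum and assign overlapping numbers.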

Source: https://stackoverflow.com/questions/39121524/hive-thread-safe-auto-increment-sequence-number-generation
