Implementing Limit query in Hive

不想你离开。 Submitted on 2019-12-13 02:27:30

Question


I need to implement a lower and an upper limit (an offset plus a row count) in Hive. I am trying to write queries like these:

SELECT * FROM `your_table` LIMIT 0, 5 
SELECT * FROM `your_table` LIMIT 5, 5 

But Hive supports only a single LIMIT argument; it does not accept a lower and an upper limit. I tried alternatives such as RANK() and ROWNUM(), but did not succeed.

Can anyone help me solve this? Thanks in advance.


Answer 1:


Hi, you can use the Facebook UDFs and their row-numbering function.

Download the Facebook UDFs from GitHub: https://github.com/brndnmtthws/facebook-hive-udfs

Build a jar file from the UDF project.

Add the jar file in the Hive console (from a local path or, as in this example, from S3) and register the function:

ADD JAR s3n://obfuscated-path/assets/jars/facebook-udfs-1.0.jar;
CREATE TEMPORARY FUNCTION NumberRows AS 'com.facebook.hive.udf.UDFNumberRows';

SELECT 
  A.product_id, 
  A.category, 
  A.product_name, 
  A.brand, 
  A.rank_score,
  CAST(NumberRows(A.category) AS FLOAT) AS row_num
FROM (
  SELECT 
    product_id, 
    category, 
    product_name, 
    brand,
    rank_score
  FROM
    source_table
  DISTRIBUTE BY 
    category 
  SORT BY
    category, rank_score desc
  ) A ;
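
To turn those row numbers into the lower and upper limit from the question, one option is to wrap the query once more and filter on the number. This is only a sketch. The alias row_num and the range 6 to 10 are my own choices for illustration. It also assumes that NumberRows restarts its count whenever category changes, so the range applies within each category.

```sql
-- Assumes the jar has been added and NumberRows registered
-- with CREATE TEMPORARY FUNCTION, as shown earlier in this answer.
SELECT
  B.product_id,
  B.category,
  B.product_name,
  B.brand,
  B.rank_score
FROM (
  SELECT
    A.product_id,
    A.category,
    A.product_name,
    A.brand,
    A.rank_score,
    NumberRows(A.category) AS row_num   -- row_num is a hypothetical alias
  FROM (
    SELECT
      product_id,
      category,
      product_name,
      brand,
      rank_score
    FROM
      source_table
    DISTRIBUTE BY
      category
    SORT BY
      category, rank_score DESC
  ) A
) B
WHERE B.row_num BETWEEN 6 AND 10;       -- roughly "LIMIT 5, 5" per category
```

Because the UDF keeps state between rows, it is worth checking on your Hive version that the numbering is computed before the filter is applied.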

For more background, see https://issues.apache.org/jira/browse/HIVE-1545

Related question: How can I add row numbers for rows in PIG or HIVE?
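
On Hive 0.11 or later, the built-in windowing function ROW_NUMBER() can do the same numbering without an extra jar. The sketch below is not part of the answer above. It emulates LIMIT 5, 5 on your_table and assumes a column named id (hypothetical) that gives a stable sort order.

```sql
-- Emulates "LIMIT 5, 5": skip the first 5 rows, return the next 5.
-- The column id is an assumed ordering key; substitute a real column.
SELECT t.*
FROM (
  SELECT
    s.*,
    ROW_NUMBER() OVER (ORDER BY s.id) AS rn
  FROM your_table s
) t
WHERE t.rn BETWEEN 6 AND 10;
```

Note that t.* also returns the helper column rn. A window with ORDER BY and no PARTITION BY generally funnels all rows through a single reducer, so this can be slow on large tables. As far as I know, Hive 2.0.0 and later also accept the two-argument LIMIT offset, count form directly.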



Source: https://stackoverflow.com/questions/24385105/implementing-limit-query-in-hive
