Calculate the percentage of categories in a column in Hive

Posted by 删除回忆录丶 on 2020-01-13 05:42:10

Question


I have a Hive table, colors, that looks like this:

 id cname
 1 Blue
 2 Green
 3 Green
 4 Blue
 5 Blue

I need help writing a Hive query that gives the percentage of rows for each color in the cname column. Something that looks like this:

Blue  60%
Green 40%

Thanks in advance!


Answer 1:


Using analytic (window) functions:

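-- Outer query: collapse the per-row results to one row per color
-- and append the percent sign.
select cname, concat(pct, ' %') pct
from
(
select (
        count(*) over (partition by cname)/  -- rows with this color
        count(*) over ()                     -- all rows in the table
       )*100 as pct,
       cname
  from (--Replace this subquery with your table
        select stack (5,
                      1, 'Blue',
                      2, 'Green',
                      3, 'Green',
                      4, 'Blue',
                      5, 'Blue' )  as (id, cname)
        ) colors
)s
group by cname, pct;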

Result:

OK
Blue    60.0 %
Green   40.0 %

Just replace the colors subquery with your table.
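
If you would rather aggregate first and then compute the share, something like the following may also work. This is only a sketch: it assumes the table is actually named colors, the aliases cnt and t are made up for illustration, and it rounds to a whole number to match the 60% / 40% format asked for in the question. Because of the rounding, the percentages will not always add up to exactly 100.

```sql
-- Sketch, assuming a real table named colors with a cname column.
-- Inner query: one row per color with its row count.
-- Outer query: divide each count by the total of all counts.
select cname,
       concat(cast(cast(round(100 * cnt / sum(cnt) over ()) as int) as string), '%') as pct
from (
    select cname, count(*) as cnt
    from colors
    group by cname
) t;
```

The window function here runs over one row per color rather than over every row of the table, so no final group by is needed to remove duplicates.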



Source: https://stackoverflow.com/questions/52467496/calculate-the-percentage-of-categories-in-a-column-in-hive
