Distinct on specific column in Hive

醉酒当歌 submitted on 2019-12-10 01:23:23

Question


I am running Hive 071. I have a table with multiple rows that share the same column value, e.g.:

 x | y |
---------
 1 | 2 |
 1 | 3 |
 1 | 4 |
 2 | 2 |
 3 | 2 |
 3 | 1 |

I want the x column to be unique, removing the extra rows that have the same x value, e.g.:

 x | y |
---------
 1 | 2 |
 2 | 2 |
 3 | 2 |

or

 x | y |
---------
 1 | 4 |
 2 | 2 |
 3 | 1 |

Either result is fine. Because DISTINCT in Hive works only on the whole result set, I couldn't find a way to do it.

Any help would be appreciated. Thanks.


Answer 1:


You can use the DISTINCT keyword, although this returns only the x column:

SELECT DISTINCT x FROM table
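
If y also needs to appear in the result, one common approach is to group by x and pick a single y per group with an aggregate. The following is a minimal sketch, assuming the table is called my_table (a placeholder name, not taken from the question) and that any one y per x is acceptable, as the question states:

```sql
-- Keep one row per x by taking the smallest y in each group.
-- my_table is a hypothetical table name; substitute your own.
SELECT x, MIN(y) AS y
FROM my_table
GROUP BY x;
```

With the sample data from the question this returns (1, 2), (2, 2), (3, 1); using MAX(y) instead would return (1, 4), (2, 2), (3, 2). If the table had more than two columns, a separate aggregate per column could combine values from different rows, so this sketch fits the two-column case best.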



Answer 2:


Try the following query:

SELECT A.x, A.y
FROM (
    SELECT x, y, rank() OVER (PARTITION BY x ORDER BY y) AS ranked
    FROM testingg
) A
WHERE ranked = 1;
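
Two caveats on the query above, offered as a hedged note. First, rank() gives tied rows the same rank, so if the table contains exact duplicate (x, y) rows, more than one row per x can come back with ranked = 1. Second, windowing functions were introduced in Hive 0.11, so this approach is not available on older releases. A variant using row_number(), which numbers rows uniquely within each partition, might look like the sketch below. It reuses the testingg table name from the answer, and the alias names t and rn are arbitrary:

```sql
-- row_number() assigns 1, 2, 3, ... within each x partition with no ties,
-- so filtering on rn = 1 keeps exactly one row per x.
SELECT t.x, t.y
FROM (
    SELECT x, y,
           row_number() OVER (PARTITION BY x ORDER BY y) AS rn
    FROM testingg
) t
WHERE t.rn = 1;
```

Ordering by y ascending keeps the row with the smallest y for each x; ORDER BY y DESC would keep the largest instead.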



Source: https://stackoverflow.com/questions/7401543/distinct-on-specific-column-in-hive
