the table is:
create table test (
id string,
name string,
age string,
modified string)
data like this:
id name age modife
There's a nearly undocumented feature of Hive SQL (I found it in one of their Jira bug reports) that lets you do something like argmax() using struct()s. For example if you have a table like:
test_argmax
id,val,key
1,1,A
1,2,B
1,3,C
1,2,D
2,1,E
2,1,U
2,2,V
2,3,W
2,2,X
2,1,Y
You can do this:
select
max(struct(val, key, id)).col1 as max_val,
max(struct(val, key, id)).col2 as max_key,
max(struct(val, key, id)).col3 as max_id
from test_argmax
group by id
and get the result:
max_val,max_key,max_id
3,C,1
3,W,2
I think in case of ties on val (the first struct element) it will fall back to comparison on the second column. I also haven't figured out whether there's a neater syntax for getting the individual columns back out of the resulting struct, maybe using named_struct somehow?