Is there a way to keep the duplicates in a collected set in Hive, or simulate the sort of aggregate collection that Hive provides using some other method? I want to aggregat
For what it's worth (though I know this is an older post), Hive 0.13.0 added a collect_list() aggregate function that, unlike collect_set(), does not deduplicate: it returns every value in the group as an array, duplicates included.
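To see the difference side by side, here is a minimal sketch. The table `events` and its columns `user_id` and `item` are made up for illustration, so substitute your own names. It assumes Hive 0.13.0 or later.

```sql
-- Hypothetical table for illustration: events(user_id STRING, item STRING)
SELECT
  user_id,
  collect_set(item)  AS distinct_items,  -- array with duplicates removed
  collect_list(item) AS all_items        -- array with duplicates kept
FROM events
GROUP BY user_id;
```

As far as I know, neither function guarantees the order of the elements in the resulting array, so don't rely on the array matching the row order of the table.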