I am generating a 'bag' of information whose size (the number of tuples inside the bag) might vary. From this, I want to extract the first element on the fly. How do I do
If the ordering of the tuples in the bag matters for deciding which one is "first" (of course it does!), then you need to sort the bag before taking a tuple from it. You could do something like the following, which is explained in more detail at https://community.hortonworks.com/questions/22863/cant-we-filter-the-data-which-we-have-done-in-37-s.html#answer-22995.
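-- For each group, sort its bag and keep only the top tuple.
max_runs = FOREACH grp_data {
    -- Sort the tuples in the group's bag, highest runs first.
    inner_sorted = ORDER runs BY runs DESC;
    -- Keep only the first tuple of the sorted bag.
    first_row = LIMIT inner_sorted 1;
    -- first_row is still a bag, now holding a single tuple.
    GENERATE first_row AS most_hits;
}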
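
For context, here is a rough end-to-end sketch of the same idea, not the exact script from the linked answer. The file name 'scores.tsv', the relation name data, and the fields player and runs are all assumptions made up for illustration. It also wraps the result in FLATTEN, because LIMIT still returns a bag containing one tuple, and flattening is what turns that into a plain value.

```pig
-- Hypothetical input: tab-separated lines of (player, runs).
data = LOAD 'scores.tsv' USING PigStorage('\t') AS (player:chararray, runs:int);

-- One group per player; each group carries a bag named "data".
grp_data = GROUP data BY player;

max_runs = FOREACH grp_data {
    -- Sort this player's tuples, highest runs first.
    inner_sorted = ORDER data BY runs DESC;
    -- Keep only the top tuple (still wrapped in a bag).
    first_row = LIMIT inner_sorted 1;
    -- FLATTEN unwraps the one-element bag into a plain field.
    GENERATE group AS player, FLATTEN(first_row.runs) AS most_runs;
};

DUMP max_runs;
```

If a group's bag could ever be empty, FLATTEN would drop that group from the output, so this sketch assumes every group has at least one tuple (which is always the case for a plain GROUP).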