Spark-SQL Window functions on Dataframe - Finding first timestamp in a group
Question

I have the dataframe below (say UserData).

uid  region  timestamp
a    1       1
a    1       2
a    1       3
a    1       4
a    2       5
a    2       6
a    2       7
a    3       8
a    4       9
a    4       10
a    4       11
a    4       12
a    1       13
a    1       14
a    3       15
a    3       16
a    5       17
a    5       18
a    5       19
a    5       20

This data represents a user (uid) travelling across different regions (region) at different times (timestamp). For simplicity, timestamp is shown as an int. Note that the dataframe above will not necessarily be in increasing order of timestamp. Also, there may be some rows in between from different