I am using the randomSplit function to get a small sample of a DataFrame for development purposes, and I end up just taking the first DataFrame that is returned by this fu
The method you are looking for is .limit.
Returns a new Dataset by taking the first n rows. The difference between this function and head is that head returns an array while limit returns a new Dataset.
Example usage:
df.limit(1000)
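To make the difference concrete, here is a minimal sketch that puts limit, head, and the randomSplit approach from the question side by side. It assumes a local SparkSession; the object name LimitSketch, the helper devSubset, the generated id column, and the split weights and seed are all made up for illustration and are not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object LimitSketch {
  // Hypothetical helper: returns a DataFrame with at most n rows for dev use.
  def devSubset(df: DataFrame, n: Int): DataFrame = df.limit(n)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("limit-sketch")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for the real DataFrame: 10,000 rows with a single `id` column.
    val df: DataFrame = spark.range(10000).toDF("id")

    // limit is a transformation: it returns a new DataFrame and runs nothing yet.
    val devDf: DataFrame = devSubset(df, 1000)

    // head is an action: it brings the rows to the driver as a local array.
    val firstRows: Array[Row] = df.head(5)

    // The randomSplit approach from the question, for comparison.
    // Weights and seed are arbitrary; the resulting size is only approximate.
    val Array(sample, _) = df.randomSplit(Array(0.1, 0.9), 42L)

    println(devDf.count())     // at most 1000
    println(firstRows.length)  // at most 5
    println(sample.count())    // roughly 10% of the rows, not an exact number

    spark.stop()
  }
}
```

One design point worth keeping in mind: limit gives an exact upper bound on the row count but the rows are not a random sample, while randomSplit gives a random sample whose size is only approximate.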
limit is very simple to use. For example, to take the first 50 rows:
val df_subset = data.limit(50)