Is there a way to take the first 1000 rows of a Spark Dataframe?

走了就别回头了 2020-12-23 19:06

I am using the randomSplit function to get a small amount of a dataframe to use for dev purposes, and I end up just taking the first df that is returned by this function.

2 Answers
  • 2020-12-23 19:09

    The method you are looking for is .limit.

    Returns a new Dataset by taking the first n rows. The difference between this function and head is that head returns an array while limit returns a new Dataset.

    Example usage:

    df.limit(1000)
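    To make the difference concrete, here is a minimal sketch (assuming a local SparkSession; the session and column names are illustrative):

    ```scala
    import org.apache.spark.sql.{DataFrame, Row, SparkSession}

    // Illustrative local session for the sketch.
    val spark = SparkSession.builder()
      .appName("limit-vs-head")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = (1 to 10000).toDF("id")

    // limit returns a new DataFrame: lazy, still distributed,
    // and usable in further transformations.
    val firstThousand: DataFrame = df.limit(1000)

    // head(n), like take(n), returns an Array[Row] collected
    // to the driver: eager, and held entirely in driver memory.
    val firstRows: Array[Row] = df.head(1000)
    ```

    So for a dev-sized subset you can keep transforming, limit is the right choice; head is for pulling a small sample into driver memory.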
    
  • 2020-12-23 19:27

    limit is very simple. For example, to take the first 50 rows:

    val df_subset = data.limit(50)
    