Count number of rows in an RDD

走了就别回头了 2020-12-08 11:05

I'm using Spark with Java, and I have an RDD of 5 million rows. Is there a solution that allows me to calculate the number of rows of my RDD? I've tried RDD.count(

2 answers
  •  眼角桃花
    2020-12-08 11:31

    Daniel's explanation of count is right on the money. If you are willing to accept an approximation, though, you could try the RDD method countApprox(timeout: Long, confidence: Double = 0.95): PartialResult[BoundedDouble], where timeout is given in milliseconds. (Note, though, that this method is tagged as "Experimental".)
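
Since the question uses Java, here is a minimal sketch of both calls through the Java API. It assumes Spark is on the classpath and runs in local mode. The class name RddCountExample, the helpers exactCount and approxCount, the generated sample data, and the 1000 ms timeout are all made up for illustration and are not from the question.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.partial.BoundedDouble;
import org.apache.spark.partial.PartialResult;

import java.util.ArrayList;
import java.util.List;

public class RddCountExample {

    // Exact count: runs a full job over every partition of the RDD.
    static long exactCount(JavaRDD<?> rdd) {
        return rdd.count();
    }

    // Approximate count: returns whatever estimate is available once the job
    // finishes or timeoutMs elapses, whichever comes first.
    static BoundedDouble approxCount(JavaRDD<?> rdd, long timeoutMs, double confidence) {
        PartialResult<BoundedDouble> partial = rdd.countApprox(timeoutMs, confidence);
        return partial.initialValue();
    }

    public static void main(String[] args) {
        // Hypothetical local setup; a real job would get its master from spark-submit.
        SparkConf conf = new SparkConf()
                .setAppName("RddCountExample")
                .setMaster("local[*]");

        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Small stand-in data set (assumption; the real RDD has about 5 million rows).
            List<Integer> data = new ArrayList<>();
            for (int i = 0; i < 100_000; i++) {
                data.add(i);
            }
            JavaRDD<Integer> rdd = sc.parallelize(data, 8);

            System.out.println("exact count = " + exactCount(rdd));

            BoundedDouble estimate = approxCount(rdd, 1000L, 0.95);
            System.out.println("approx count = " + estimate.mean()
                    + " in [" + estimate.low() + ", " + estimate.high() + "]"
                    + " at confidence " + estimate.confidence());
        }
    }
}
```

With a timeout shorter than the job, the interval between low() and high() shows how uncertain the estimate is. If the job finishes within the timeout, the estimate should match the exact count.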
