How to get a sample with an exact sample size in Spark RDD?


南旧 2020-12-08 15:42

Why does the rdd.sample() function on Spark RDD return a different number of elements even though the fraction parameter is the same? For example, if my code is

2 answers

小蘑菇 (original poster)

2020-12-08 16:19
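Another way is to call takeSample first and then build an RDD from the result. takeSample returns exactly the requested number of elements, but it collects them into an array on the driver, so this can be slow (or run out of memory) with large sample sizes.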
```
sc.makeRDD(a.takeSample(false, 1000, 1234))
```
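To make the difference concrete, here is a minimal sketch comparing the two calls. It assumes a local SparkContext and a hypothetical RDD `a` of 100,000 integers; the object name, app name, and data are illustrative and not from the original post. `sample` keeps each element independently with probability `fraction`, so its count varies around the expected value, while `takeSample` returns exactly `num` elements as long as the RDD holds at least that many.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ExactSampleSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setMaster("local[*]").setAppName("exact-sample-sketch")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical input standing in for the RDD `a` from the answer.
      val a = sc.parallelize(1 to 100000)

      // sample(): each element is kept independently with probability 0.01,
      // so the count is only approximately 0.01 * 100000 = 1000.
      val approx = a.sample(withReplacement = false, fraction = 0.01, seed = 1234L)
      println(s"sample() count: ${approx.count()}")

      // takeSample(): returns exactly `num` elements as a local Array on the driver.
      val exact: Array[Int] = a.takeSample(withReplacement = false, num = 1000, seed = 1234L)
      assert(exact.length == 1000)

      // Turn the local sample back into an RDD for further distributed processing.
      val exactRdd = sc.makeRDD(exact.toSeq)
      println(s"takeSample() count: ${exactRdd.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Because the whole sample passes through the driver, this approach suits samples that comfortably fit in driver memory; for very large exact-size samples a fully distributed strategy would likely be needed instead.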