Would Spark unpersist the RDD itself when it realizes it won't be used anymore?

粉色の甜心 2020-12-05 07:02

We can persist an RDD in memory and/or on disk when we want to use it more than once. However, do we have to unpersist it ourselves later on, or does Spark do some kind of garbage collection and unpersist the RDD itself once it is no longer needed?
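
To make the scenario concrete, here is a minimal sketch (Scala; the data and names are illustrative, not part of the original question) of persisting an RDD that is reused by more than one action:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object PersistExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("persist-example")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // An RDD that two different actions will reuse, so we persist it.
    val squares = sc.parallelize(1 to 1000000).map(n => n.toLong * n)
    squares.persist(StorageLevel.MEMORY_AND_DISK) // keep in memory, spill to disk if needed

    println(squares.count()) // first action computes and caches the partitions
    println(squares.sum())   // second action reads the cached partitions

    // Question: must we call squares.unpersist() ourselves, or will Spark do it?
    spark.stop()
  }
}
```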

2 Answers
  •  醉酒成梦
    2020-12-05 07:25

    As pointed out by @Daniel, Spark will automatically remove cached partitions. This happens once there is no more memory available for new blocks, and eviction follows a least-recently-used (LRU) policy. As @eliasah notes, it is not a smart system: eviction is driven by memory pressure and recency of access, not by whether the RDD will actually be needed again.

    If you are not caching many objects, you don't have to worry about it. If you cache a lot of data, however, JVM garbage-collection times can become excessive, so in that case it is a good idea to unpersist RDDs explicitly once you are done with them, as in the sketch below.
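
    A minimal sketch (Scala; the names and data are illustrative) of releasing a cached RDD explicitly instead of waiting for LRU eviction under memory pressure:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object UnpersistExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unpersist-example")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val cached = sc.parallelize(1 to 100000).map(_ * 2).persist(StorageLevel.MEMORY_ONLY)
    println(cached.sum()) // action materializes and caches the partitions

    println(sc.getPersistentRDDs.size) // 1: the RDD is still registered as persistent

    // Release the cached blocks explicitly once they are no longer needed,
    // instead of relying on LRU eviction when memory runs low.
    cached.unpersist(blocking = true) // blocking = true waits until the blocks are removed

    println(sc.getPersistentRDDs.size) // 0: nothing left in the cache registry

    spark.stop()
  }
}
```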
