I am using Spark 1.3.0 with the Python API. While transforming huge dataframes, I cache many DFs for faster execution:
df1.cache()
df2.cache()
Once a dataframe is no longer needed, how can I drop it from memory (un-cache it)?

Just do the following:
df1.unpersist()
df2.unpersist()
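
For context, here is a minimal end-to-end sketch of the cache / use / unpersist lifecycle with the Spark 1.3 Python API. The input paths and the jsonFile source are placeholders; any DataFrame is handled the same way:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="cache-example")
sqlContext = SQLContext(sc)

# Hypothetical inputs; any DataFrame source behaves the same.
df1 = sqlContext.jsonFile("hdfs:///data/events.json")
df2 = sqlContext.jsonFile("hdfs:///data/users.json")

# Mark both DataFrames for in-memory caching; the data is materialized
# lazily, on the first action that touches each one.
df1.cache()
df2.cache()

# ... reuse df1 and df2 across several transformations/actions ...
df1.count()
df2.count()

# Release the cached blocks once the DataFrames are no longer needed.
df1.unpersist()
df2.unpersist()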
Spark automatically monitors cache usage on each node and drops out old data partitions in a least-recently-used (LRU) fashion. If you would like to manually remove an RDD instead of waiting for it to fall out of the cache, use the RDD.unpersist() method.
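
As a sketch of the manual route for DataFrames (continuing with df1 and df2 from above): PySpark's DataFrame exposes an is_cached flag, and unpersist() accepts a blocking argument; treat the exact defaults as version-dependent and check them against your 1.3 installation.

# True after cache(), before the DataFrame is un-persisted.
print(df1.is_cached)

# blocking=True waits until the cached blocks are actually removed
# instead of returning immediately.
df1.unpersist(blocking=True)
df2.unpersist(blocking=True)

# The flag flips back to False once the DataFrame is un-persisted.
print(df1.is_cached)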