Un-persisting all dataframes in (py)spark

Asked by 猫巷女王i on 2020-12-29 20:33 · 3 answers · 1923 views

I have a Spark application with several points where I would like to persist the current state. This is usually after a large step, or when caching a state that I would like to reuse. How can I un-persist all of my cached dataframes at once?

3 Answers
  •  Answered by 自闭症患者 on 2020-12-29 21:06

    Spark 2.x

    You can use Catalog.clearCache:

    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.getOrCreate()
    ...
    spark.catalog.clearCache()
    

    Spark 1.x

    You can use the SQLContext.clearCache method, which "removes all cached tables from the in-memory cache":

    from pyspark.sql import SQLContext
    from pyspark import SparkContext
    
    sqlContext = SQLContext.getOrCreate(SparkContext.getOrCreate())
    ...
    sqlContext.clearCache()
    
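    If you only want to release the frames you cached yourself, rather than wiping the whole in-memory cache, you can track each DataFrame as you persist it and call unpersist() on it later. A minimal sketch of that pattern (the CacheRegistry class and its method names are my own, not part of the pyspark API):

    ```python
    class CacheRegistry:
        """Track DataFrames as they are persisted so they can all be
        un-persisted later, without clearing unrelated cached tables."""

        def __init__(self):
            self._cached = []

        def persist(self, df):
            # Delegate to the DataFrame's own persist() and remember it.
            df.persist()
            self._cached.append(df)
            return df

        def unpersist_all(self):
            # Release every DataFrame this registry persisted.
            for df in self._cached:
                df.unpersist()
            self._cached.clear()
    ```

    With a real SparkSession you would route each frame through registry.persist(df) instead of calling df.persist() directly, then call registry.unpersist_all() after the large step completes.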
