Apache Spark does not delete temporary directories

庸人自扰 2020-11-27 15:48

After a Spark program completes, three temporary directories remain in the temp directory. The directory names look like this: spark-2e389487-40cc-4a82-a5c7-353c0feefbb

6 answers
  •  抹茶落季
    2020-11-27 16:51

    Three SPARK_WORKER_OPTS properties exist to support cleanup of the worker application folders; they are copied here for further reference from the Spark documentation (see the configuration sketch after this list):

    • spark.worker.cleanup.enabled (default: false): enables periodic cleanup of worker/application directories. Note that this only affects standalone mode, as YARN works differently. Only the directories of stopped applications are cleaned up.

    • spark.worker.cleanup.interval (default: 1800, i.e. 30 minutes): controls the interval, in seconds, at which the worker cleans up old application work dirs on the local machine.

    • spark.worker.cleanup.appDataTtl (default: 7*24*3600, i.e. 7 days): the number of seconds to retain application work directories on each worker. This is a time to live and should depend on the amount of available disk space you have. Application logs and jars are downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space, especially if you run jobs very frequently.
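
    One way to apply these settings is to pass them to the standalone workers through SPARK_WORKER_OPTS in conf/spark-env.sh. The snippet below is a minimal sketch, assuming a standalone cluster where you can edit spark-env.sh on every worker node; the cleanup values themselves are illustrative, not recommendations.

        # conf/spark-env.sh on each standalone worker (illustrative values)
        # Enable periodic cleanup, scan every 30 minutes (1800 s),
        # and keep application work dirs for 1 day (86400 s).
        export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
          -Dspark.worker.cleanup.interval=1800 \
          -Dspark.worker.cleanup.appDataTtl=86400"

    The workers have to be restarted for a changed SPARK_WORKER_OPTS to take effect, and as noted above the cleanup only removes the work directories of applications that have already stopped.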
