When performing a shuffle my Spark job fails and says \"no space left on device\", but when I run df -h it says I have free space left! Why does this happen, a
By default Spark uses the /tmp directory to store intermediate data. If you actually do have space left on some device -- you can alter this by creating the file SPARK_HOME/conf/spark-defaults.conf and adding the line. Here SPARK_HOME is wherever you root directory for the spark install is.
spark.local.dir SOME/DIR/WHERE/YOU/HAVE/SPACE