Question
I am trying to run a unit test of a Spark job on Windows 7 64-bit. I have
HADOOP_HOME=D:/winutils
winutils path= D:/winutils/bin/winutils.exe
I ran the commands below:
winutils ls \tmp\hive
winutils chmod -R 777 \tmp\hive
But when I run my test, I get the error below.
Running com.dnb.trade.ui.ingest.spark.utils.ExperiencesUtilTest
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.132 sec
17/01/24 15:37:53 INFO Remoting: Remoting shut down
17/01/24 15:37:53 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
java.io.IOException: Failed to delete: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:929)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at .....
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=786m; support was removed in 8.0
Caused by: java.lang.RuntimeException: java.io.IOException: Access is denied
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:525)
... 28 more
Caused by: java.io.IOException: Access is denied
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
I have tried to change the permissions manually. Every time I get the same error.
Please help!
Answer 1:
The issue is in the ShutdownHook that tries to delete the temp files but fails. Though you cannot solve the issue itself, you can simply hide the exceptions by adding the following two lines to your log4j.properties file in %SPARK_HOME%\conf. If the file does not exist, copy log4j.properties.template and rename it.
log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR
Out of sight is out of mind.
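If you cannot rely on editing %SPARK_HOME%\conf (for example, in a unit test launched from an IDE), an equivalent suppression can be done from code. This is a minimal sketch assuming the log4j 1.x API that ships with Spark 2.x is on the classpath; run it before the test creates its SparkSession/SparkContext:

import org.apache.log4j.{Level, Logger}

// Same effect as the two log4j.properties lines above: silence the shutdown-hook
// delete failure and keep only real errors from SparkEnv.
Logger.getLogger("org.apache.spark.util.ShutdownHookManager").setLevel(Level.OFF)
Logger.getLogger("org.apache.spark.SparkEnv").setLevel(Level.ERROR)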
Answer 2:
I'm facing the same problem after trying to run the WordCount example with the spark-submit command. Right now, I'm ignoring it because it returns the results before the error happens.
I found some old issues in the Spark Jira but didn't find any fixes. (BTW, one of them has the status Closed.)
https://issues.apache.org/jira/browse/SPARK-8333
https://issues.apache.org/jira/browse/SPARK-12216
Unfortunately, it seems that they don't care about Spark on Windows at all.
One bad solution is to give everyone permission to the Temp folder (in your case C:\Users\415387\AppData\Local\Temp).
So it would look like this:
winutils chmod -R 777 C:\Users\415387\AppData\Local\Temp\
But I strongly recommend that you not do that.
Answer 3:
I've set the HADOOP_HOME variable in the same way as you have. (On Windows 10)
Try using the complete path when setting permissions, i.e.
D:> winutils/bin/winutils.exe chmod 777 \tmp\hive
This worked for me.
Also, just a note on the exception: I'm getting the same exception when exiting Spark from cmd by running "sys.exit".
But... I can exit cleanly when I use ":q" or ":quit". So I'm not sure what's happening here; still trying to figure it out...
Answer 4:
Running Spark on Windows has this issue with deleting the Spark temp directory. You can set the log level as follows to hide it.
import org.apache.log4j.{Level, Logger}
Logger.getLogger("org").setLevel(Level.FATAL)
Answer 5:
I have a workaround for this: instead of letting Spark's ShutdownHookManager delete the temporary directories, you can issue Windows commands to do that.
Steps:
1. Change the temp directory using spark.local.dir in the spark-defaults.conf file.
2. Set log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF in the log4j.properties file.
3. spark-shell internally calls the spark-shell.cmd file, so add rmdir /q /s "your_dir\tmp" to it.
This should work!
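As an alternative to editing spark-defaults.conf (handy for the unit-test case in the question), the same spark.local.dir setting can be applied when the session is built. A minimal sketch, assuming Spark 2.x in local mode; the path D:\spark\temp is only an example and must already exist and be writable:

import org.apache.spark.sql.SparkSession

object LocalDirExample {
  def main(args: Array[String]): Unit = {
    // Point Spark's scratch space at a directory you control instead of %TEMP%,
    // so any leftover spark-* directories end up somewhere easy to clean up.
    val spark = SparkSession.builder()
      .appName("LocalDirExample")
      .master("local[*]")
      .config("spark.local.dir", "D:\\spark\\temp")
      .getOrCreate()

    spark.range(10).count() // trivial job just to exercise the temp directory
    spark.stop()
  }
}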
Answer 6:
I was facing a similar problem. I changed the permission on the \tmp folder instead of \tmp\hive:
D:>winutils/bin/winutils.exe chmod 777 \tmp
I am not seeing any error after this, and there is a clean exit.
Answer 7:
I create a directory d:\spark\temp
I give total control to Everybody on this dir
I run
set TEMP=d:\spark\temp
Then I submit my jar to Spark and watch the directory in Explorer.
Many files and directories are created and deleted, but for one of them there is an exception.
IMHO this is not really a permissions problem.
java.io.IOException: Failed to delete: D:\data\temp\spark\spark-9cc5a3ad-7d79-4317-8990-f278e63cb40b\userFiles-4c442ed7-83ba-4724-a533-5f171d830913\simple-app_2.11-1.0.jar
This happens when trying to delete the submitted package. It may not have been released by all of the processes involved.
Answer 8:
My Hadoop environment on Windows 10:
HADOOP_HOME=C:\hadoop
Spark and Scala versions:
Spark-2.3.1 and Scala-2.11.8
Below is my spark-submit command:
spark-submit --class SparkScalaTest --master local[*] D:\spark-projects\SparkScalaTest\target\scala-2.11\sparkscalatest_2.11-0.1.jar D:\HDFS\output
Based on my Hadoop environment on Windows 10, I defined the following system properties in my Scala main class:
System.setProperty("hadoop.home.dir", "C:\\hadoop\\")
System.setProperty("hadoop.tmp.dir", "C:\\hadoop\\tmp")
Result: I am getting the same error, but my outputs are generated in the output path D:\HDFS\output passed to spark-submit.
Hope this helps to bypass this error and get the expected result for Spark running locally on Windows.
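For reference, this is a minimal sketch of what such a main class could look like; only the two System.setProperty calls and the output-path argument come from the answer above, and the body of the job itself is hypothetical:

import org.apache.spark.sql.SparkSession

object SparkScalaTest {
  def main(args: Array[String]): Unit = {
    // Windows-specific Hadoop settings: hadoop.home.dir must contain bin\winutils.exe,
    // and hadoop.tmp.dir must be a writable directory.
    System.setProperty("hadoop.home.dir", "C:\\hadoop\\")
    System.setProperty("hadoop.tmp.dir", "C:\\hadoop\\tmp")

    // --master local[*] is supplied on the spark-submit command line above.
    val spark = SparkSession.builder().appName("SparkScalaTest").getOrCreate()

    // Placeholder job: write something trivial to the output path passed as args(0)
    // (D:\HDFS\output in the spark-submit example).
    spark.range(100).write.mode("overwrite").csv(args(0))

    spark.stop()
  }
}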
Answer 9:
After following the above suggestions, I made the changes below:
Update spark-defaults.conf, or create a copy of spark-defaults.conf.template and rename it to spark-defaults.conf.
Add the following line: spark.local.dir=E:\spark2.4.6\tempDir. Via this line we are setting the temp folder for Spark to use.
Similarly, update log4j.properties in your Spark setup as described above, with the lines below:
log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR
Now the errors from ShutdownHookManager during exit will no longer appear on the console.
Now, how do we clean the temp folder?
So for that add below lines in bin/spark-shell.cmd file -
rmdir /q /s "E:/spark2.4.6/tempDir"
del C:\Users\nitin\AppData\Local\Temp\jansi*.*
With the above updates, I can see a clean exit with the temp folders cleaned up as well.
Answer 10:
For Python:
Create an empty directory tmp\hive, then run:
import os
winutils = r"D:\winutils\bin\winutils.exe"  # path to \bin\winutils.exe
hive_dir = r"D:\tmp\hive"                   # path to the empty \tmp\hive created above
os.system(command=f"{winutils} chmod -R 777 {hive_dir}")
Source: https://stackoverflow.com/questions/41825871/exception-while-deleting-spark-temp-dir-in-windows-7-64-bit