Exception while deleting Spark temp dir in Windows 7 64 bit

Submitted by 梦想的初衷 on 2021-02-06 15:14:31

Question


I am trying to run a unit test of a Spark job on Windows 7 64-bit. I have:

HADOOP_HOME=D:/winutils

winutils path= D:/winutils/bin/winutils.exe

I ran the following commands:

winutils ls \tmp\hive
winutils chmod -R 777  \tmp\hive

But when I run my test I get the following error.

Running com.dnb.trade.ui.ingest.spark.utils.ExperiencesUtilTest
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.132 sec
17/01/24 15:37:53 INFO Remoting: Remoting shut down
17/01/24 15:37:53 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
java.io.IOException: Failed to delete: C:\Users\415387\AppData\Local\Temp\spark-b1672cf6-989f-4890-93a0-c945ff147554
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:929)
        at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
        at .....

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=786m; support was removed in 8.0

Caused by: java.lang.RuntimeException: java.io.IOException: Access is denied
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:525)
        ... 28 more
Caused by: java.io.IOException: Access is denied
        at java.io.WinNTFileSystem.createFileExclusively(Native Method)

I have tried to change the permissions manually. Every time I get the same error.

Please help!


Answer 1:


The issue is in the ShutdownHook, which tries to delete the temp files but fails. While you cannot fix the underlying problem, you can simply hide the exceptions by adding the following two lines to your log4j.properties file in %SPARK_HOME%\conf. If the file does not exist, copy log4j.properties.template and rename it.

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR
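If you need to create the file from the template first, the copy from a Windows command prompt would look roughly like this (assuming SPARK_HOME is set):

copy "%SPARK_HOME%\conf\log4j.properties.template" "%SPARK_HOME%\conf\log4j.properties"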

Out of sight is out of mind.




Answer 2:


I'm facing the same problem after trying to run the WordCount example with the spark-submit command. Right now, I'm ignoring it because it returns the results before the error happens.

I found some old issues in the Spark Jira but didn't find any fixes. (BTW, one of them has the status Closed.)

https://issues.apache.org/jira/browse/SPARK-8333

https://issues.apache.org/jira/browse/SPARK-12216

Unfortunately, it seems that they don't care about Spark on Windows at all.

One bad solution is to give everyone full permissions on the Temp folder (in your case C:\Users\415387\AppData\Local\Temp).

So it would look like this:

winutils chmod -R 777 C:\Users\415387\AppData\Local\Temp\

But I strongly recommend that you not do that.




Answer 3:


I've set the HADOOP_HOME variable in the same way as you have. (On Windows 10)

Try using the complete path when setting permissions, i.e.:

D:> winutils/bin/winutils.exe chmod 777 \tmp\hive

This worked for me.

Also, just a note on the exception: I'm getting the same exception when exiting Spark from cmd by running "sys.exit".

But... I can exit cleanly when I use ":q" or ":quit". So I'm not sure what's happening here; still trying to figure it out...




Answer 4:


Running Spark on Windows has this issue with deleting the Spark temp directory. You can set the log level as follows to hide it.

Logger.getLogger("org").setLevel(Level.FATAL)
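A minimal sketch of where this call could go, assuming the log4j 1.x API that Spark 2.x bundles (the object name, app name, and builder details are illustrative):

import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

object Example {
  def main(args: Array[String]): Unit = {
    // Suppress everything below FATAL for org.* loggers before Spark starts logging
    Logger.getLogger("org").setLevel(Level.FATAL)

    val spark = SparkSession.builder()
      .appName("example")
      .master("local[*]")
      .getOrCreate()

    // ... job logic ...

    spark.stop()
  }
}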




Answer 5:


I have a workaround for this: instead of letting Spark's ShutdownHookManager delete the temporary directories, you can issue Windows commands to do that.

Steps:

  1. Change the temp directory using spark.local.dir in the spark-defaults.conf file.

  2. Set log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF in the log4j.properties file.

  3. spark-shell internally calls the spark-shell.cmd file, so add rmdir /q /s "your_dir\tmp" to it.

This should work! A combined sketch of these changes follows.
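For concreteness, a hedged sketch of the three edits (the directory D:\spark\tmp is only an example, not a required path):

In %SPARK_HOME%\conf\spark-defaults.conf:

spark.local.dir  D:\spark\tmp

In %SPARK_HOME%\conf\log4j.properties:

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF

At the end of %SPARK_HOME%\bin\spark-shell.cmd:

rmdir /q /s "D:\spark\tmp"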




Answer 6:


I was facing a similar problem. I changed the permissions on the \tmp folder instead of \tmp\hive:

D:>winutils/bin/winutils.exe chmod 777 \tmp

I'm not seeing any error after this, and there is a clean exit.




Answer 7:


I created a directory d:\spark\temp.

I gave Everybody full control of this directory.

I ran:

set TEMP=d:\spark\temp

Then I submitted my jar to Spark and watched the directory in Explorer.

Many files and directories are created and deleted, but for one of them there is an exception:

IMHO this is not a rights (permissions) problem.

java.io.IOException: Failed to delete: D:\data\temp\spark\spark-9cc5a3ad-7d79-4317-8990-f278e63cb40b\userFiles-4c442ed7-83ba-4724-a533-5f171d830913\simple-app_2.11-1.0.jar

This happens when Spark tries to delete the submitted package; it may not have been released by all the processes involved.




Answer 8:


My Hadoop environment on Windows 10:

HADOOP_HOME=C:\hadoop

Spark and Scala versions:

Spark 2.3.1 and Scala 2.11.8

Below is my spark-submit command:

spark-submit --class SparkScalaTest --master local[*] D:\spark-projects\SparkScalaTest\target\scala-2.11\sparkscalatest_2.11-0.1.jar D:\HDFS\output

Based on my Hadoop environment on Windows 10, I defined the following system properties in my Scala main class:

System.setProperty("hadoop.home.dir", "C:\\hadoop\\")
System.setProperty("hadoop.tmp.dir", "C:\\hadoop\\tmp")

Result: I am still getting the same error, but my outputs are generated in the output path D:\HDFS\output passed to spark-submit.

Hope this helps to bypass this error and get the expected result for Spark running locally on Windows.
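For reference, a minimal sketch of how such a main class could be laid out (only the two System.setProperty calls come from this answer; the rest is illustrative):

import org.apache.spark.sql.SparkSession

object SparkScalaTest {
  def main(args: Array[String]): Unit = {
    // Point Hadoop at the local installation before Spark initializes
    System.setProperty("hadoop.home.dir", "C:\\hadoop\\")
    System.setProperty("hadoop.tmp.dir", "C:\\hadoop\\tmp")

    val spark = SparkSession.builder()
      .appName("SparkScalaTest")
      .getOrCreate()

    // ... read input, transform, and write results to the output path in args(0) ...

    spark.stop()
  }
}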




Answer 9:


After following the above suggestions, I made the changes below.

Update spark-defaults.conf, or create a copy of spark-defaults.conf.template and rename it to spark-defaults.conf.

Add a line like the following; it sets the temp folder for Spark to use:

spark.local.dir=E:\spark2.4.6\tempDir

Similarly, update log4j.properties in your Spark setup as described above, with the lines below:

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR

Now the ShutdownHookManager messages are suppressed during exit, so those error lines no longer appear on the console.

How do you clean up the temp folder, then? For that, add the lines below to the bin/spark-shell.cmd file:

rmdir /q /s "E:/spark2.4.6/tempDir"
del C:\Users\nitin\AppData\Local\Temp\jansi*.*

With the above updates, I see a clean exit, with the temp folders cleaned up as well.




Answer 10:


For Python:

Create an empty directory tmp\hive, then grant it permissions with winutils:

import os

# Replace the placeholder paths with your winutils.exe location and your tmp\hive directory
os.system(r"<path to>\bin\winutils.exe chmod -R 777 <path to>\tmp\hive")


Source: https://stackoverflow.com/questions/41825871/exception-while-deleting-spark-temp-dir-in-windows-7-64-bit
