Correct way of writing two floats into a regular txt

Submitted by 北慕城南 on 2020-01-15 05:58:05

Question


I am running a big job in cluster mode. However, I am only interested in two float numbers, which I want to be able to read once the job succeeds.

Here is what I am trying:

from pyspark.context import SparkContext

if __name__ == "__main__":
    sc = SparkContext(appName='foo')

    f = open('foo.txt', 'w')
    pi = 3.14
    not_pi = 2.79 
    f.write(str(pi) + "\n")
    f.write(str(not_pi) + "\n")
    f.close()

    sc.stop()

However, 'foo.txt' doesn't appear to be written anywhere (it probably gets written on an executor, or something similar). I also tried '/homes/gsamaras/foo.txt', which would be the pwd of the gateway, but that fails with: No such file or directory: '/homes/gsamaras/myfile.txt'.

How can I do that?


import os, sys
import socket
print("Current working dir : %s" % os.getcwd())
print(socket.gethostname())

The output of this snippet suggests that the driver is actually a node of the cluster, which is why I don't see the file on my gateway.

Maybe write the file to HDFS somehow?

This won't work either:

Traceback (most recent call last):
  File "computeCostAndUnbalancedFactorkMeans.py", line 15, in <module>
    f = open('hdfs://myfile.txt','w')
IOError: [Errno 2] No such file or directory: 'hdfs://myfile.txt'

Answer 1:


At first glance there is nothing particularly wrong with your code (you should use a context manager in a case like this instead of closing the file manually, but that is not the point). If this script is passed to spark-submit, the file will be written to a directory local to the driver.
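As a minimal sketch of the context-manager style mentioned above: the `with` block closes the file even if a write fails. The helper names `write_floats` and `read_floats` are hypothetical, and a temporary directory stands in for the driver's working directory.

```python
import os
import tempfile


def write_floats(path, values):
    """Write each value on its own line; the file is closed when the with-block exits."""
    with open(path, "w") as f:
        for value in values:
            f.write(str(value) + "\n")


def read_floats(path):
    """Read the values back, one float per non-empty line."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


# Hypothetical location: a temp directory stands in for the driver's working directory.
out_path = os.path.join(tempfile.mkdtemp(), "foo.txt")
write_floats(out_path, [3.14, 2.79])
print(read_floats(out_path))
```

This only changes how the file is closed; where the file ends up still depends on which machine runs the driver.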

If you submit your code in cluster mode, the driver will run on an arbitrary worker node in your cluster, and the file will be written there. If you're in doubt, you can always log os.getcwd() and socket.gethostname() to figure out which machine is used and what the working directory is.

Finally, you cannot use standard Python IO tools to write to HDFS. There are a few tools which can achieve that, including the native dask/hdfs3.
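A rough sketch of what this could look like with hdfs3, assuming the package is installed on the driver and that its `HDFileSystem(host=..., port=...).open(path, "wb")` interface is available. The host, port and HDFS path are placeholders, and `write_floats` and `hdfs_opener` are hypothetical helper names. The runnable part uses the built-in `open()` as a stand-in so that it works without a cluster.

```python
import os
import tempfile


def write_floats(open_fn, path, values):
    """Write one value per line through any opener that returns a binary file object."""
    payload = "".join(str(v) + "\n" for v in values).encode("utf-8")
    with open_fn(path, "wb") as f:
        f.write(payload)


def hdfs_opener(host, port):
    """Return an opener backed by hdfs3 (assumed installed; host and port are placeholders)."""
    from hdfs3 import HDFileSystem  # imported lazily so the local demo runs without hdfs3
    return HDFileSystem(host=host, port=port).open


# Local demo: the built-in open() stands in for the HDFS opener.
local_path = os.path.join(tempfile.mkdtemp(), "foo.txt")
write_floats(open, local_path, [3.14, 2.79])

# On a cluster, something like this (hypothetical host, port and path):
# write_floats(hdfs_opener("namenode", 8020), "/user/gsamaras/foo.txt", [3.14, 2.79])
```

Passing the opener in as an argument keeps the formatting logic identical for local and HDFS targets; only the last line would change when running against a real namenode.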



Source: https://stackoverflow.com/questions/39303218/correct-way-of-writing-two-floats-into-a-regular-txt
