“EOF error” at program exit using multiprocessing Queue and Thread

北慕城南 submitted on 2019-12-06 07:24:22

The issue comes from a conflict between multiple atexit.register() calls.

The documentation states that:

atexit runs these functions in the reverse order in which they were registered; if you register A, B, and C, at interpreter termination time they will be run in the order C, B, A.

[...]

The assumption is that lower level modules will normally be imported before higher level modules and thus must be cleaned up later.
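The reverse-order rule quoted above can be checked with a small sketch. It runs a throwaway script in a child interpreter so that the exit handlers' output can be captured; the SCRIPT text and the "cleanup" labels are made up for illustration only.

```python
import subprocess
import sys

# Hypothetical script: registers three handlers in the order A, B, C.
SCRIPT = """
import atexit
for name in ("A", "B", "C"):
    atexit.register(print, "cleanup", name)
"""

# Run it in a separate interpreter and capture what the handlers print at exit.
result = subprocess.run(
    [sys.executable, "-c", SCRIPT],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
# Handlers run last-registered-first:
# cleanup C
# cleanup B
# cleanup A
```

The same rule applies to handlers that libraries register for themselves, which is why the moment at which a library calls atexit.register() matters.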

By first importing multiprocessing and then calling atexit.register(my_stop), you would expect your stop function to be executed before any internal termination procedure... But this is not the case, because atexit.register() may be called dynamically.

In the present case, the multiprocessing library relies on a function named _exit_function, which is meant to cleanly close internal threads and queues. This function is registered with atexit at module level (in multiprocessing.util); however, that module is only loaded once the Queue() object is initialized.

Consequently, the stop function of MyClass is registered before multiprocessing's own handler, and thus instance.stop is called after _exit_function.

During its termination, _exit_function closes the internal pipe connections, so if the thread later tries to call .get() on a closed read connection, an EOFError is raised. This happens only if Python did not have time to automatically kill the daemon thread at the end, that is, if a "slow" exit function (like time.sleep(0.1) or, in this case, thread.join()) is registered and runs after the usual closure procedure. For some reason, the shutdown of the write connection is delayed, hence .put() does not raise an error immediately.

As to why small modifications to the snippet make it work: SimpleQueue does not have a Finalizer, so its internal pipe is closed later. The internal thread of Queue is not started until the first .put() is called, so removing that call means there is no pipe to close. It is also possible to force the registration early by importing multiprocessing.queues.
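A minimal sketch of that last workaround follows. It assumes the original snippet registered instance.stop with atexit; the class here is a reconstruction modelled on the MyClass shown further down, not the asker's exact code. Importing multiprocessing.queues at the top loads multiprocessing.util first, so the library's handler is registered before instance.stop and therefore runs after it. Which modules a plain import multiprocessing loads may differ between Python versions, so treat this as an illustration rather than a guarantee.

```python
import atexit
import threading
import multiprocessing
import multiprocessing.queues  # loads multiprocessing.util, which registers _exit_function


class MyClass:
    """Reconstruction of the question's class (an assumption, not the original code)."""

    def start(self):
        self.queue = multiprocessing.Queue()
        self.thread = threading.Thread(target=self.queued_writer, daemon=True)
        self.thread.start()

    def queued_writer(self):
        while True:
            msg = self.queue.get()
            if msg is None:  # sentinel sent by stop()
                break
            print("Message:", msg)

    def put(self, msg):
        self.queue.put(msg)

    def stop(self):
        self.queue.put(None)
        self.thread.join()


instance = MyClass()
instance.start()
# Registered after multiprocessing's own handler, so it runs before it at exit,
# while the queue's pipe is still open.
atexit.register(instance.stop)
instance.put("abc")
```

This keeps the atexit-based design but depends on import order, which is easy to break by accident.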

To make sure stop() runs before the interpreter begins shutting down, instead of relying on atexit, you can define __enter__ and __exit__ in your class and create your instance with a with statement:

import threading
import multiprocessing


class MyClass:

    def __init__(self):
        self.queue = None
        self.thread = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self.stop()

    def start(self):
        self.queue = multiprocessing.Queue()
        self.thread = threading.Thread(target=self.queued_writer, daemon=True)
        self.thread.start()

    def queued_writer(self):
        while 1:
            msg = self.queue.get()
            print("Message:", str(msg))
            if msg is None:
                break

    def put(self, msg):
        self.queue.put(msg)

    def stop(self):
        self.queue.put(None)
        self.thread.join()


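with MyClass() as instance:
    instance.start()
    # is_alive() is the public API; _is_stopped is a private attribute
    print('Thread stopped: ' + str(not instance.thread.is_alive()))
    instance.put('abc')

# __exit__ has called stop(), so the thread has been joined by now
print('Thread stopped: ' + str(not instance.thread.is_alive()))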

The code above prints:

Thread stopped: False
Message: abc
Message: None
Thread stopped: True

The surface answer to your question is fairly simple: the queued_writer thread is still waiting for entries to be written to the queue when the main process ends, which sends an EOF to the open blocking connection that self.queue.get opened.

That raises the question of why atexit.register didn't seem to do its job, but I do not know the reason for that.
