Question
I have trouble understanding why this simple program raises an EOFError at the end.
I am using a Queue() to communicate with a Thread() that I want to terminate automatically and cleanly when my program exits, via atexit.
import threading
import multiprocessing
import atexit

class MyClass:

    def __init__(self):
        self.queue = None
        self.thread = None

    def start(self):
        self.queue = multiprocessing.Queue()
        self.thread = threading.Thread(target=self.queued_writer, daemon=True)
        self.thread.start()

        # Remove this: no error
        self.queue.put("message")

    def queued_writer(self):
        while 1:
            msg = self.queue.get()
            print("Message:", msg)
            if msg is None:
                break

    def stop(self):
        self.queue.put(None)
        self.thread.join()

instance = MyClass()

atexit.register(instance.stop)

# Put this before register: no error
instance.start()
This raises:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "test.py", line 21, in queued_writer
    msg = self.queue.get()
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 94, in get
    res = self._recv_bytes()
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
Moreover, this snippet behaves strangely: if I remove the self.queue.put("message") line, no error is raised and the thread exits successfully. Similarly, it seems to work if instance.start() is called before atexit.register().
Does anyone know where the error comes from?
Edit: I noticed that using a SimpleQueue() seems to make the error disappear.
Answer 1:
The issue comes from a conflict between multiple atexit.register() calls.
The documentation states that:
atexit runs these functions in the reverse order in which they were registered; if you register A, B, and C, at interpreter termination time they will be run in the order C, B, A. [...]
The assumption is that lower level modules will normally be imported before higher level modules and thus must be cleaned up later.
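The ordering rule quoted above can be checked with a small experiment. This is only a minimal sketch and not part of the original answer: the helper name exit_order and the child script are made up for illustration, and the script runs in a separate interpreter so that its exit handlers actually fire.

```python
import subprocess
import sys

# Hypothetical child script: registers three exit handlers in the order A, B, C.
CHILD_SCRIPT = """
import atexit
for name in ("A", "B", "C"):
    atexit.register(print, name)
"""


def exit_order():
    """Run the child script in a fresh interpreter and return what its handlers printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    # Handlers run last-registered first.
    print(exit_order())
```

The handler registered last (C) is printed first, which is why the registration order of instance.stop relative to the library's own handler matters.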
By first importing multiprocessing and then calling atexit.register(my_stop), you would expect your stop function to be executed before any internal termination procedure. But this is not the case, because atexit.register() can be called at any point during execution, not only when a module is first imported.
In the present case, the multiprocessing library uses a function named _exit_function, which is meant to cleanly close internal threads and queues. This function is registered with atexit at module level, but the module that registers it is only loaded once the Queue() object is initialized.
Consequently, the stop function of MyClass is registered before the one from multiprocessing, and thus instance.stop is called after _exit_function.
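The lazy loading described above can be observed directly. The sketch below assumes, based on the CPython source rather than on anything stated in this answer, that the module registering _exit_function is multiprocessing.util; the probe script and the helper name util_loaded_flags are made up for illustration.

```python
import subprocess
import sys

# Hypothetical probe: report whether multiprocessing.util (assumed to be the
# module that registers _exit_function) is loaded after each import.
PROBE_SCRIPT = """
import sys
import multiprocessing
print("multiprocessing.util" in sys.modules)
import multiprocessing.queues
print("multiprocessing.util" in sys.modules)
"""


def util_loaded_flags():
    """Return the two flags printed by the probe, as strings."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE_SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    print(util_loaded_flags())
```

If the first flag is False and the second is True, a bare import multiprocessing has not yet registered the library's exit handler, so any atexit.register() call made before the queue machinery is loaded ends up earlier in the list and therefore runs later.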
During its termination, _exit_function closes the internal pipe connections, so if the thread later tries to call .get() with a closed read-connection, an EOFError is raised. This happens only if Python did not have time to automatically kill the daemon thread at the end, that is, if a "slow" exit function (like time.sleep(0.1) or, in this case, thread.join()) is registered and runs after the usual closure procedure. For some reason, the write-connection shutdown is delayed, hence .put() does not raise an error immediately.
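The kind of EOFError shown in the traceback can be reproduced in isolation. This is a hedged sketch, not the library's actual shutdown path: it assumes the blocked read fails because the other end of the pipe has gone away, and it imitates that by closing the write end of a plain multiprocessing.Pipe by hand. The function name recv_after_writer_closed is made up for illustration.

```python
import multiprocessing


def recv_after_writer_closed():
    """Return the name of the exception raised when reading a pipe whose write end is closed."""
    # One-way pipe: the first connection can only receive, the second can only send.
    reader, writer = multiprocessing.Pipe(duplex=False)
    writer.close()  # stands in for the cleanup performed at interpreter exit
    try:
        reader.recv_bytes()
    except EOFError as exc:
        return type(exc).__name__
    finally:
        reader.close()
    return None


if __name__ == "__main__":
    print(recv_after_writer_closed())
```

Queue.get() reads from such a connection through the same recv_bytes call that appears in the traceback, so a reader still running after the pipe has been shut down fails in the same way.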
As for why small modifications to the snippet make it work: SimpleQueue has no Finalizer, so its internal pipe is closed later. The internal thread of Queue is not started until the first .put() call, so removing that call means there is no pipe to close. Calling instance.start() before atexit.register() works because creating the Queue registers _exit_function first, so instance.stop runs before it. It is also possible to force the registration by importing multiprocessing.queues.
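The last workaround can be sketched as follows. This is an illustration under stated assumptions rather than a verified fix for every Python version: the embedded script is the question's program with one added import, and the wrapper run_fixed_script is a hypothetical helper that runs it in a separate interpreter so its exit handlers fire.

```python
import subprocess
import sys

# The question's program with one change: multiprocessing.queues is imported
# up front so that multiprocessing registers its exit handler first.
FIXED_SCRIPT = """
import atexit
import threading
import multiprocessing
import multiprocessing.queues  # forces the library's exit handler to register now


class MyClass:
    def start(self):
        self.queue = multiprocessing.Queue()
        self.thread = threading.Thread(target=self.queued_writer, daemon=True)
        self.thread.start()
        self.queue.put("message")

    def queued_writer(self):
        while True:
            msg = self.queue.get()
            print("Message:", msg)
            if msg is None:
                break

    def stop(self):
        self.queue.put(None)
        self.thread.join()


instance = MyClass()
atexit.register(instance.stop)  # registered later, so it runs earlier at exit
instance.start()
"""


def run_fixed_script():
    """Run the script in a fresh interpreter and return (stdout, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", FIXED_SCRIPT],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    out, err = run_fixed_script()
    print(out, end="")
    print("EOFError seen:", "EOFError" in err)
```

With the import in place, instance.stop should run while the queue's pipe is still open, so the writer thread receives the None sentinel and exits before the library cleans up.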
Answer 2:
To get a clean shutdown without relying on atexit, you can define __enter__ and __exit__ in your class and create the instance with a with statement:
import threading
import multiprocessing


class MyClass:

    def __init__(self):
        self.queue = None
        self.thread = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block is left, even if an exception was raised
        self.stop()

    def start(self):
        self.queue = multiprocessing.Queue()
        self.thread = threading.Thread(target=self.queued_writer, daemon=True)
        self.thread.start()

    def queued_writer(self):
        while 1:
            msg = self.queue.get()
            print("Message:", str(msg))
            if msg is None:
                break

    def put(self, msg):
        self.queue.put(msg)

    def stop(self):
        # Send the sentinel, then wait for the writer thread to finish
        self.queue.put(None)
        self.thread.join()


with MyClass() as instance:
    instance.start()
    # is_alive() is the public API; the private _is_stopped attribute is not guaranteed to exist
    print('Thread stopped: ' + str(not instance.thread.is_alive()))
    instance.put('abc')

print('Thread stopped: ' + str(not instance.thread.is_alive()))
The code above prints:
Thread stopped: False
Message: abc
Message: None
Thread stopped: True
Answer 3:
The surface answer to your question is fairly simple: the queued_writer thread is still waiting for entries to be written to the queue when the main process ends, which sends an EOF to the blocking connection that self.queue.get() opened.
That raises the question of why atexit.register() did not seem to do its job, but I do not know the reason for that.
Source: https://stackoverflow.com/questions/49209385/eof-error-at-program-exit-using-multiprocessing-queue-and-thread