python subprocess deadlock in demonized server

巧了我就是萌 提交于 2020-01-02 21:55:26

问题


I'm trying to set up a remote backup server for dar, along these lines. I'd really like to do all the piping with python if possible, but I've asked a separate question about that.

Using netcat in subprocess.Popen(cmd, shell=True), I succeeded in making a differential backup, as in the examples on the dar site. The only two problems are:

  1. I don't know how to assign port numbers dynamically this way
  2. If I execute the server in the background, it fails. Why?

Update: This doesn't seem to be related to netcat; it hangs even without netcat in the mix.

Here's my code:

from socket import socket, AF_INET, SOCK_STREAM
import os, sys
import SocketServer
import subprocess

class DarHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        print('entering handler')
        data = self.request.recv(1024).strip()
        print('got: ' + data)
        if data == 'xform':
            cmd1 = 'nc -dl 41201 | dar_slave archives/remotehost | nc -l 41202'
            print(cmd1)
            cmd2 = 'nc -dl 41200 | dar_xform -s 10k - archives/diffbackup'
            print(cmd2)
            proc1 = subprocess.Popen(cmd1, shell=True)
            proc2 = subprocess.Popen(cmd2, shell=True)
            print('sending port number')
            self.request.send('41200')
            print('waiting')
            result = str(proc1.wait())
            print('nc-dar_slave-nc returned ' + result)
            result = str(proc2.wait())
            print('nc-dar_xform returned ' + result)
        else:
            result = 'bad request'
        self.request.send(result)
        print('send result, exiting handler')

myaddress = ('localhost', 18010)
def server():
    server = SocketServer.TCPServer(myaddress, DarHandler)
    print('listening')
    server.serve_forever()

def client():
    sock = socket(AF_INET, SOCK_STREAM)
    print('connecting')
    sock.connect(('localhost', 18010))
    print('connected, sending request')
    sock.send('xform')
    print('waiting for response')
    port = sock.recv(1024)
    print('got: ' + port)
    try:
        os.unlink('toslave')
    except:
        pass
    os.mkfifo('toslave')
    cmd1 = 'nc -w3 localhost 41201 < toslave'
    cmd2 = 'nc -w3 localhost 41202 | dar -B config/test.dcf -A - -o toslave -c - | nc -w3 localhost ' + port
    print(cmd2)
    proc1 = subprocess.Popen(cmd1, shell=True)
    proc2 = subprocess.Popen(cmd2, shell=True)
    print('waiting')
    result2 = proc2.wait()
    result1 = proc1.wait()
    print('nc<fifo returned: ' + str(result1))
    print('nc-dar-nc returned: ' + str(result2))
    result = sock.recv(1024)
    print('received: ' + result)
    sock.close()
    print('socket closed, exiting')

if __name__ == "__main__":
    if sys.argv[1].startswith('serv'):
        server()
    else:
        client()

Here's what happens on the server:

$ python clientserver.py serve &
[1] 4651
$ listening
entering handler
got: xform
nc -dl 41201 | dar_slave archives/remotehost | nc -l 41202
nc -dl 41200 | dar_xform -s 10k - archives/diffbackup
sending port number
waiting

[1]+  Stopped                 python clientserver.py serve

Here's what happens on the client:

$ python clientserver.py client
connecting
connected, sending request
waiting for response
got: 41200
nc -w3 localhost 41202 | dar -B config/test.dcf -A - -o toslave -c - | nc -w3 localhost 41200
waiting
FATAL error, aborting operation
Corrupted data read on pipe
nc<fifo returned: 1
nc-dar-nc returned: 1

The client also hangs, and I have to kill it with a keyboard interrupt.


回答1:


I'd cut my losses and start over. This solution attempt is very complicated and kludgy. There are many ready-made solutions in the area:

  • 10 outstanding Linux backup utilities

Fwbackups sounds good if you want to take the easy route, rsync+ssh for the hard core.




回答2:


  1. Use Popen.communicate() instead of Popen.wait().

    The python documentation for wait() states:

    Warning: This will deadlock if the child process generates enough output to a stdout or stderr pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

  2. Dar and its related executables should get a -Q if they aren't running interactively.

  3. When syncronizing multiple processes, make sure to call communicate() on the 'weakest link' first: dar_slave before dar_xform and dar before cat. This was done correctly in the question, but it's worth noting.

  4. Clean up shared resources. The client process is holding open a socket from which dar_xform is still reading. Attempting to send/recv data on the initial socket after dar and friends are finished without closing that socket will therefore cause a deadlock.

Here is a working example which doesn't use shell=True or netcat. An advantage of this is I can have the secondary ports assigned dynamically and therefore could conceivably serve multiple backup clients simultaneously.



来源:https://stackoverflow.com/questions/7922477/python-subprocess-deadlock-in-demonized-server

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!