Question

I am relatively new to IPC. I have a C program that collects data and a Python program that analyses it. I want to be able to:

Call the Python program as a subprocess of my main C program
Pass a C struct containing the data to be processed to the Python process
Return an int value from the Python process back to the C program

I have briefly looked at pipes and FIFOs, but so far I cannot find any information that addresses this kind of problem. As I understand it, fork(), for example, simply duplicates the calling process, which is not what I want, since I am trying to start a different process.

Answer 1:

About fork() and the need to execute a different program: it is true that fork() creates a copy of the current process, but it is usually coupled with exec() (in one of its various forms) so that the copy executes a different program.
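A rough sketch of the fork-then-exec pattern. It is written in Python only to keep it short; os.fork, os.execvp and os.waitpid wrap the POSIX calls of the same names, so the C version has the same shape. The helper name run_program is made up for illustration, and the sketch assumes a Unix-like system.

```python
import os


def run_program(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the copied process image with a different program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # Only reached if exec itself failed (e.g. program not found).
            os._exit(127)
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # child was killed by a signal
```

After the fork the child is still a copy of the caller; it only becomes a different program once exec succeeds.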

As for IPC, you have several choices. Someone mentioned a queue, but something like ZeroMQ is overkill. You can do IPC with one of several mechanisms:

Pipes (named pipes or anonymous)
Unix domain sockets
TCP or UDP via the sockets API
Shared memory
Message queues

The pipe approach is the easiest. Note that when you pass data back and forth between the C program and Python, you need to think about the transfer syntax of the data. If you choose to send raw C structs (which can be non-portable), you will need to unpack the data on the Python side. Otherwise you can use a textual format: a combination of sprintf/sscanf, JSON, etc.
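To make the "unpack on the Python side" step concrete, here is a minimal sketch using the standard struct module. The layout is an assumption: it pretends the C side writes a struct of two int32_t fields followed by a double, little-endian and without padding. The field names and the Sample type are hypothetical; adjust the format string to whatever your real struct looks like.

```python
import struct
from collections import namedtuple

# Assumed C layout (hypothetical):
#   struct sample { int32_t sensor_id; int32_t count; double mean; };
# "<" means little-endian with no implicit padding.
SAMPLE_FORMAT = "<iid"
SAMPLE_SIZE = struct.calcsize(SAMPLE_FORMAT)  # number of bytes to read

Sample = namedtuple("Sample", "sensor_id count mean")


def unpack_sample(raw):
    """Turn the raw bytes of one struct into a named tuple."""
    return Sample._make(struct.unpack(SAMPLE_FORMAT, raw))
```

If the compiler inserts padding or the two ends differ in byte order, the format string has to say so explicitly, which is exactly why raw structs are the less portable option.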

Answer 2:

I suggest looking at the application and structuring the issues you are confronted with.

Multi-threading

Starting two processes is by far not the biggest issue; as Ziffusion said, you can have another process do something else. There are also Python bindings for C, so you could, for example, create another thread (it need not be a process) and call your Python routines from the C program.

Communication

Sharing information is more interesting, as you have to solve two issues: one is technically getting the data from one place to another and vice versa; the other is how two different things can work on the same data. This goes into messaging patterns and process flow:

who generates the data?
who receives the data?
is there a piece of code waiting for something before proceeding?
is there the need to control what happens to the data while the data is processed?
do I want to code it myself?
can I use libraries in the project?
are there security limitations?
...

Once you answer the above questions, you can define how your pieces of the application are going to interact. One main distinction is synchronous vs asynchronous.

Sync vs Async

Synchronous means that every message gets a reply, and that reply should arrive within a finite (usually as small as possible) time window, in order to avoid latency. This pattern is best used when you need fine control over what is happening, or need an answer in a timely manner. It is, in fact, how HTTP works when downloading web pages: whenever you load a web site, you want to see the content right now. This pattern is called REQuest/REPly.

Asynchronous is often used for heavy processing: the data producer (for example a database interface or a sensor) sends a batch of data to a worker thread without waiting for an answer. The worker thread then starts doing its job on the data, and when it is done it sends the results to a data sink/user. This pattern is called PUBlish/SUBscribe.

There are many others, but these form the basics of communication.

Marshalling

Another issue you face is marshalling: how to structure the data you pass, so that its meaning and content get from one context to a totally different one, for example from your C part to your Python part. Maintaining serialization libraries is tedious and perilous, not to mention prone to backward-compatibility issues.

Implementation

When it comes to implementation you usually want the cleanest and most powerful code, and those two goals pull against each other. So I usually look for a library that does exactly what I need. In this case my advice is to try ZeroMQ: it is thin, flexible and low-level, and it gives you a powerful framework for connecting threads, processes and even machines. ZeroMQ provides the link, but you still need a protocol to run over it. To avoid headaches and streamline the marshalling work, I suggest you investigate libraries that make this task easy: Cap'n Proto, FlatBuffers and Protocol Buffers (Google). They let you define your data in an intermediate language and parse it from any other language without writing all the classes yourself.

As for pipes and shared memory my humble opinion is: forget they exist.

Answer 3:

The way you are organizing the architecture is a bit messy. What you really want is Message Queues. So in your example:

Your Python worker listens for new information to process on queue A;
Your C program puts data into queue A;
Your Python worker processes the data and puts the result into queue B;
Your C program listens for new items on queue B.

This may vary, but the concept is simple.
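The flow can be sketched with nothing but the standard library. This is an in-process stand-in, not ZeroMQ: a thread plays the Python worker, queue.Queue objects play queues A and B, and the caller plays the C program. The function names and the "analysis" (summing a batch) are placeholders.

```python
import queue
import threading


def worker(queue_a, queue_b):
    """Take batches from queue A, put one result per batch on queue B."""
    while True:
        batch = queue_a.get()
        if batch is None:           # sentinel: the producer is finished
            break
        queue_b.put(sum(batch))     # placeholder for the real analysis


def analyse(batches):
    """Play the producer: feed queue A, then collect from queue B."""
    queue_a, queue_b = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(queue_a, queue_b))
    thread.start()
    for batch in batches:
        queue_a.put(batch)
    queue_a.put(None)
    thread.join()
    return [queue_b.get() for _ in batches]
```

With a real message-queue library the two ends would live in separate processes, but the shape (put on A, get from A, put on B, get from B) stays the same.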

Message queues are easy to implement, and there are plenty of libraries and tools to help you with this task. ZeroMQ would certainly do the job. It works with both C and Python.

Answer 4:

If your struct is simple enough, you may not need IPC at all. Provided you can serialize it as strings that can be passed as program arguments, and provided the int value to return fits in the range 0-127, you could simply:

in C code:
- prepare the command arguments to pass to the Python script
- fork-exec (assuming a Unix-like system) a Python interpreter with the script path and the script arguments
- wait for child termination
- read the exit code the script terminated with
in Python:
- get the arguments from command line and rebuild the elements of the struct
- process it
- end the script with exit(n) where n is an integer in the range 0-127 that will be returned to caller.
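A minimal sketch of the Python side of this first approach. It assumes the C program passes each field of the struct as a decimal string argument; the analysis itself (counting values above a threshold) and all names are placeholders.

```python
THRESHOLD = 100.0  # hypothetical analysis parameter


def analyse(values):
    """Placeholder analysis: count the values above the threshold."""
    return sum(1 for v in values if v > THRESHOLD)


def main(argv):
    """Rebuild the data from the command line and return an exit code."""
    values = [float(arg) for arg in argv[1:]]   # argv[0] is the script path
    result = analyse(values)
    return max(0, min(result, 127))             # keep it in the 0-127 range

# In the real script, finish with:
#     import sys
#     sys.exit(main(sys.argv))
```

On the C side the parent would pick this value up with waitpid() and WEXITSTATUS() once the child has terminated.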

If the above does not meet your requirements, the next level would be to use pipes:

in C code:
- prepare 2 pipe pairs one for C->Python (let's call it input), one for Python->C (let's call it output)
- serialize the struct into a char buffer
- fork
- in child
  - close write side of input pipe
  - close read side of output pipe
  - dup read side of input pipe to file descriptor 0 (stdin) (see `dup2`)
  - dup write side of output pipe to file descriptor 1 (stdout)
  - exec a Python interpreter with the name of the script
- in parent
  - close read side of input pipe
  - close write side of output pipe
  - write the buffer (preceded by its size if that cannot be known a priori) to the write side of the input pipe, then close that side so the child sees end-of-file
  - wait for the child to terminate
  - read the return value from the read side of output pipe
in Python:
- read the serialized data from standard input
- process it
- write the output integer to standard output
- exit
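The Python half of the pipe approach might look roughly like this. The wire format is an assumption made up for the sketch: a 4-byte little-endian length, followed by that many bytes of packed doubles, with the reply sent back as a 4-byte little-endian int. The analysis is again a placeholder.

```python
import struct

HEADER = struct.Struct("<I")   # assumed 4-byte length prefix
REPLY = struct.Struct("<i")    # assumed 4-byte integer reply


def read_exact(stream, size):
    """Read exactly size bytes or fail if the pipe closes early."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("pipe closed before the full message arrived")
    return data


def serve(stream_in, stream_out):
    """Read one serialized message, process it, write the int result."""
    (size,) = HEADER.unpack(read_exact(stream_in, HEADER.size))
    payload = read_exact(stream_in, size)
    values = struct.unpack("<%dd" % (size // 8), payload)
    result = sum(1 for v in values if v > 100.0)   # placeholder analysis
    stream_out.write(REPLY.pack(result))
    stream_out.flush()
    return result

# In the real script (binary mode matters, hence .buffer):
#     import sys
#     serve(sys.stdin.buffer, sys.stdout.buffer)
```

Because the C parent dup'ed the pipe ends onto descriptors 0 and 1, the script needs no knowledge of the pipes; it just uses standard input and output.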

Source: https://stackoverflow.com/questions/34315470/ipc-between-c-application-and-python

Tags

python

ipc

IPC between C application and Python
