Efficient way of sending the same data to multiple dynamic processes


Question


I have a stream of line-buffered data and many readers in other processes.
The readers need to attach to the system dynamically; they are not known to the process writing the stream.

First I tried to read every line and simply send it to a lot of pipes:

#writer
command | while read -r line; do
    printf '%s\n' "$line" | tee listeners/*
done

#reader
mkfifo listeners/1
cat listeners/1

But that consumes a lot of CPU.

So I thought about writing to a file and truncating it repeatedly:

#writer
command >> file &
while true; do
    : > file
    sleep 1
done

#reader
tail -f -n0 file

But sometimes a line is not read by one or more readers before the truncation, creating a race condition.
Is there a better way to implement this?


Answer 1:


Sounds like pub/sub to me - see Wikipedia.

Basically, new interested parties come along whenever they like and "subscribe" to your channel. The process receiving the data then "publishes" it, line by line, to that channel.

You can do it with MQTT using mosquitto, or with Redis. Both have command-line interfaces/bindings, as well as Python, C/C++, Ruby, PHP etc. The client and server need not be on the same machine; some clients could be elsewhere on the network.

Mosquitto example here.
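
As a rough sketch of the MQTT route, the subscriber side could look something like this in Python (this assumes the paho-mqtt package and a mosquitto broker running on localhost; the topic name myStream is just an illustration):

#!/usr/bin/env python3
# Illustrative MQTT subscriber sketch (assumes "pip install paho-mqtt"
# and a mosquitto broker on localhost)

import paho.mqtt.subscribe as subscribe

def on_message(client, userdata, message):
    # Each published line arrives as a bytes payload
    print(message.payload.decode(), end='')

# Block forever, calling on_message for every line published to "myStream"
subscribe.callback(on_message, 'myStream', hostname='localhost')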


I did a few tests on my Mac with Redis pub/sub. The client code in Terminal to subscribe to a channel called myStream looks like this:

redis-cli SUBSCRIBE myStream

I then ran a process to synthesise 10,000 lines like this:

time seq 10000  | while read a ; do redis-cli PUBLISH myStream "$a" >/dev/null 2>&1 ; done

And that takes 40s, so it does around 250 lines per second, but it has to start a whole new process for each line and create and tear down the connection to Redis... and we don't want to send your CPU mad.

More appropriately for your situation, here is how you can create a file with 100,000 lines, read them one at a time, and send them to all your subscribers in Python:

# Make a "BigFile" with 100,000 lines
seq 100000 > BigFile

and read the lines and publish them with:

#!/usr/bin/env python3

import redis

if __name__ == '__main__':
    # Redis connection
    r = redis.Redis(host='localhost', port=6379, db=0)

    # Read file line by line...
    with open('BigFile', 'r') as infile:
        for line in infile:
            # Publish the current line to subscribers
            r.publish('myStream', line)

The entire 100,000 lines were sent and received in 4s, so 25,000 lines per second. Here is a little recording of it in action. At the top you can see the CPU is not unduly troubled by it. The second window from the top is a client, receiving 100,000 lines and the next window down is a second client. The bottom window shows the server running the Python code above and sending all 100,000 lines in 4s.
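
The subscribing side does not have to be redis-cli either; a minimal Python sketch of a reader, assuming the same redis package and channel name as above, could be:

#!/usr/bin/env python3
# Illustrative Redis subscriber sketch, mirroring the publisher above

import redis

if __name__ == '__main__':
    # Redis connection
    r = redis.Redis(host='localhost', port=6379, db=0)

    # Subscribe to the channel and print each published line as it arrives
    p = r.pubsub()
    p.subscribe('myStream')
    for message in p.listen():
        if message['type'] == 'message':
            print(message['data'].decode(), end='')

Any number of these readers can attach or detach at any time, which matches your requirement that the writer does not need to know about them.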

Keywords: Redis, mosquitto, pub/sub, publish, subscribe.



Source: https://stackoverflow.com/questions/59914544/efficient-way-of-sending-the-same-data-to-multiple-dynamic-processes
