Question
I am using the majordomo code found here (https://github.com/zeromq/majordomo) in the following manner:
Instead of using a single broker to process the requests and replies, I start two brokers such that one of them handles all the requests, and the other handles all the replies.
I did some testing to see how many connections the majordomo broker can handle:
Requests per client    Requests handled without packet loss
1                      614      (614 clients)
10                     6,000    (600 clients)
100                    35,500   (355 clients)
1,000                  300,000  (300 clients)
5,000                  750,000
10,000                 600,000
15,000                 450,000
20,000                 420,000
25,000                 375,000
30,000                 360,000
I am not able to understand the results properly.
Why is the broker able to handle only 614 clients when each one is sending only a single request?
I ran this test on a single machine, but 614 still seems very low.
Can someone please tell me what could be going wrong?
So I set the HWM as follows:
Broker: send/receive HWM set to 40,000; TCP send/receive buffers set to 10 MB.
Worker: send/receive HWM set to 100,000.
Client: send HWM set to 100, receive HWM set to 100,000.
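For reference, here is a minimal sketch of how figures like these map onto plain libzmq socket options (the socket types are assumptions, and the actual majordomo broker/worker code sets up its own sockets; the options have to be applied before bind/connect to affect new connections):

/* Sketch only: applying the HWM / TCP-buffer figures listed above.
 * Socket types are assumptions, not the actual majordomo code, and the
 * options must be set before zmq_bind()/zmq_connect() to take effect.  */
#include <zmq.h>

void apply_broker_options (void *broker_socket)   /* e.g. a ZMQ_ROUTER */
{
    int hwm = 40000;                    /* broker send/receive HWM      */
    int buf = 10 * 1024 * 1024;         /* 10 MB kernel TCP buffers     */
    zmq_setsockopt (broker_socket, ZMQ_SNDHWM, &hwm, sizeof hwm);
    zmq_setsockopt (broker_socket, ZMQ_RCVHWM, &hwm, sizeof hwm);
    zmq_setsockopt (broker_socket, ZMQ_SNDBUF, &buf, sizeof buf);
    zmq_setsockopt (broker_socket, ZMQ_RCVBUF, &buf, sizeof buf);
}

void apply_worker_options (void *worker_socket)   /* e.g. a ZMQ_DEALER */
{
    int hwm = 100000;                   /* worker send/receive HWM      */
    zmq_setsockopt (worker_socket, ZMQ_SNDHWM, &hwm, sizeof hwm);
    zmq_setsockopt (worker_socket, ZMQ_RCVHWM, &hwm, sizeof hwm);
}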
All the clients run on the same machine.
All the workers (10 workers running the echo service) and the two broker instances run on a single EC2 instance.
The client program simply sends all of its requests in a blast (all at once).
My understanding of HWM on send is that when the HWM is reached, the socket will block. That is why I have set the client's send HWM to 100 messages, hoping that this would give me some sort of flow control.
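As a minimal sketch of that idea, assuming a DEALER-style client (the endpoint is a placeholder, and the MDP framing used by the real majordomo client API is omitted):

/* Sketch only: a DEALER client with a small send HWM.  Real majordomo
 * clients wrap this in the MDP client API and add protocol frames;
 * the endpoint and request count here are placeholders.              */
#include <zmq.h>

int main (void)
{
    void *ctx    = zmq_ctx_new ();
    void *client = zmq_socket (ctx, ZMQ_DEALER);

    int snd_hwm = 100;                  /* small pipe towards the broker */
    zmq_setsockopt (client, ZMQ_SNDHWM, &snd_hwm, sizeof snd_hwm);

    zmq_connect (client, "tcp://broker-host:5555");

    /* Blast the requests; a blocking zmq_send() stalls here whenever the
     * pipe to the broker is full, which is the intended flow control.   */
    for (int i = 0; i < 10000; i++)
        zmq_send (client, "echo-request", 12, 0);

    zmq_close (client);
    zmq_ctx_term (ctx);
    return 0;
}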
Now, I see packet loss when I have 10 clients each sending 10,000 requests (all in one go). And when clients send 10,000 requests each but only the first 1,000 are sent in one go, packet loss occurs once 128 clients run in parallel.
With the broker's HWM set to 40,000, why does it drop packets when the blast size is well below 40,000, as in the cases above? I know the ZeroMQ guide says the allocated capacity of a pipe will be around 60% of the configured HWM, but 10,000 is only 25% of the 40,000 I have set, and 1,000 is only 2.5%. So I don't understand what causes the broker to lose packets. HWM is supposed to be per peer connection, isn't it? Please help me understand this behavior.
Answer 1:
WHY DOES THAT HAPPEN?
TLDR
Let me quote from a marvelous and precious source -- Pieter HINTJENS' book
"Code Connected, Volume 1"
( definitely worth anyone's time to step through the PDF copy ... the key messages are in the text and stories that Pieter has crafted into his 300+ thrilling pages )
High-Water Marks

When you can send messages rapidly from process to process, you soon discover that memory is a precious resource, and one that can be trivially filled up. A few seconds of delay somewhere in a process can turn into a backlog that blows up a server unless you understand the problem and take precautions.

...

ØMQ uses the concept of HWM (high-water mark) to define the capacity of its internal pipes. Each connection out of a socket or into a socket has its own pipe, and HWM for sending, and/or receiving, depending on the socket type. Some sockets (PUB, PUSH) only have send buffers. Some (SUB, PULL, REQ, REP) only have receive buffers. Some (DEALER, ROUTER, PAIR) have both send and receive buffers.

In ØMQ v2.x, the HWM was infinite by default. This was easy but also typically fatal for high-volume publishers. In ØMQ v3.x, it's set to 1,000 by default, which is more sensible. If you're still using ØMQ v2.x, you should always set a HWM on your sockets, be it 1,000 to match ØMQ v3.x or another figure that takes into account your message sizes and expected subscriber performance.

When your socket reaches its HWM, it will either block or drop data depending on the socket type. PUB and ROUTER sockets will drop data if they reach their HWM, while other socket types will block. Over the inproc transport, the sender and receiver share the same buffers, so the real HWM is the sum of the HWM set by both sides.

Lastly, the HWMs are not exact; while you may get up to 1,000 messages by default, the real buffer size may be much lower (as little as half), due to the way libzmq implements its queues.
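A minimal sketch, not from the book, of the behaviour described in that last passage: a PUSH socket with a tiny SNDHWM connected over inproc to a PULL peer that never reads, so sends stop succeeding once both pipes are full.

/* Sketch only: a PUSH socket with a small SNDHWM and a PULL peer that
 * never reads.  The endpoint and HWM value are arbitrary placeholders. */
#include <zmq.h>
#include <stdio.h>

int main (void)
{
    void *ctx  = zmq_ctx_new ();
    void *push = zmq_socket (ctx, ZMQ_PUSH);
    void *pull = zmq_socket (ctx, ZMQ_PULL);

    int hwm = 10;                                   /* tiny pipe on purpose */
    zmq_setsockopt (push, ZMQ_SNDHWM, &hwm, sizeof hwm);
    zmq_setsockopt (pull, ZMQ_RCVHWM, &hwm, sizeof hwm);

    zmq_bind    (pull, "inproc://hwm-demo");
    zmq_connect (push, "inproc://hwm-demo");

    /* The PULL side never calls zmq_recv(), so once the shared inproc
     * buffers are full (SNDHWM + RCVHWM) the non-blocking send starts
     * failing with EAGAIN; a blocking send would simply stop here.      */
    int sent = 0;
    while (zmq_send (push, "x", 1, ZMQ_DONTWAIT) == 1)
        sent++;
    printf ("queued %d messages before hitting the HWM\n", sent);

    zmq_close (push);
    zmq_close (pull);
    zmq_ctx_term (ctx);
    return 0;
}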
SOLUTION
Experiment with adjusting your RCVHWM / SNDHWM settings and the other low-level I/O-thread and API parameters, so that your testing setup remains memory-footprint feasible, stable, and performing well, in accord with the incompressible "hydraulics" of your I/O resources and data flows.
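As a hedged illustration of the kind of experimentation meant here (the helper functions and values below are placeholders, not part of majordomo, and not recommendations):

/* Sketch only: the knobs one would sweep while experimenting.
 * All values are placeholders, not recommended settings.           */
#include <zmq.h>

void *tuned_context (int io_threads)
{
    void *ctx = zmq_ctx_new ();
    /* Must be set before the first socket is created on the context. */
    zmq_ctx_set (ctx, ZMQ_IO_THREADS, io_threads);
    return ctx;
}

void sweep_hwm (void *socket, int snd_hwm, int rcv_hwm)
{
    /* Re-run the load test with different pipe capacities (applied
     * before bind/connect) and watch where messages start to be
     * dropped (ROUTER) or where sends start to block.               */
    zmq_setsockopt (socket, ZMQ_SNDHWM, &snd_hwm, sizeof snd_hwm);
    zmq_setsockopt (socket, ZMQ_RCVHWM, &rcv_hwm, sizeof rcv_hwm);
}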
Source: https://stackoverflow.com/questions/28120776/majordomo-broker-handling-large-number-of-connections