Storm latency caused by ack

问题

I was using kafka-storm to connect kafka and storm. I have 3 servers running zookeeper, kafka and storm. There is a topic 'test' in kafka that has 9 partitions.

In the storm topology, the number of KafkaSpout executor is 9 and by default, the number of tasks should be 9 as well. And the 'extract' bolt is the only bolt connected to KafkaSpout, the 'log' spout.

From the UI, there is a huge rate of failure in the spout. However, he number of executed message in bolt = the number of emitted message - the number of failed mesage in bolt. This equation is almost matched when the failed message is empty at the beginning.

Based on my understanding, this means that the bolt did receive the message from spout but the ack signals are suspended in flight. That's the reason why the number of acks in spout are so small.

This problem might be solved by increase the timeout seconds and spout pending message number. But this will cause more memory usage and I cannot increase it to infinite.

I was wandering if there is a way to force storm ignore the ack in some spout/bolt, so that it will not waiting for that signal until time out. This should increase the throughout significantly but not guarantee for message processing.

回答1:

if you set the number of ackers to 0 then storm will automatically ack every sample.

config.setNumAckers(0);

please note that the UI only measures and shows 5% of the data flow. unless you set

config.setStatsSampleRate(1.0d);

try increasing the bolt's timeout and reducing the amount of topology.max.spout.pending.

also, make sure the spout's nextTuple() method is non blocking and optimized.

i would also recommend profiling the code, maybe your storm Queues are being filled and you need to increase their sizes.

    config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE,32);
    config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE,16384);
    config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE,16384);

回答2:

Your capacity numbers are a bit high, leading me to believe that you're really maximizing the use of system resources (CPU, memory). In other words, the system seems to be bogged down a bit and that's probably why tuples are timing out. You might try using the topology.max.spout.pending config property to limit the number of inflight tuples from the spout. If you can reduce the number just enough, the topology should be able to efficiently handle the load without tuples timing out.

来源：https://stackoverflow.com/questions/36193046/storm-latency-caused-by-ack

标签

apache-storm