Storm-Kafka multiple spouts, how to share the load?

Submitted by 余生长醉 on 2019-12-03 02:55:54
mithunsatheesh

I came across a discussion in storm-user that covers something similar.

Read "Relationship between Spout parallelism and number of kafka partitions".


Two things to note when using the Kafka spout for Storm:

  1. The maximum useful parallelism for a KafkaSpout is the number of partitions.
  2. We can split the load across multiple Kafka topics and have a separate spout instance for each, i.e. each spout handles a separate topic.

So suppose the Kafka partitions per host is configured as 1 and the number of hosts is 2. Even if we set the spout parallelism to 10, the maximum value that is respected will only be 2, which is the number of partitions.
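The arithmetic behind that example can be sketched in plain Java. This is only an illustration of the cap described above: the class `SpoutParallelism` and its methods are hypothetical names, not part of storm-kafka, and the sketch assumes the total partition count is simply hosts multiplied by partitions per host.

```java
// Hypothetical helper for illustration only; it is not part of storm-kafka.
final class SpoutParallelism {
    private SpoutParallelism() {}

    // Total partitions, assuming every host carries the same number of partitions.
    static int totalPartitions(int hostCount, int partitionsPerHost) {
        if (hostCount < 0 || partitionsPerHost < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return hostCount * partitionsPerHost;
    }

    // The parallelism that is actually useful: the hint, capped by the partition count.
    static int effectiveParallelism(int parallelismHint, int hostCount, int partitionsPerHost) {
        if (parallelismHint < 1) {
            throw new IllegalArgumentException("parallelism hint must be at least 1");
        }
        return Math.min(parallelismHint, totalPartitions(hostCount, partitionsPerHost));
    }

    public static void main(String[] args) {
        // 2 hosts, 1 partition per host, parallelism hint of 10.
        System.out.println(effectiveParallelism(10, 2, 1)); // prints 2
    }
}
```

With the numbers from the example (hint 10, 2 hosts, 1 partition per host) the result is 2, so raising the hint further gains nothing until more partitions exist.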


How to specify the number of partitions in the Kafka spout?

List<HostPort> hosts = new ArrayList<HostPort>();
hosts.add(new HostPort("localhost", 9092));
// The second argument of StaticHosts is the number of partitions per host (4 here).
SpoutConfig objConfig = new SpoutConfig(new KafkaConfig.StaticHosts(hosts, 4), "spoutCaliber", "/kafkastorm", "discovery");

As you can see, brokers are added with hosts.add, and the number of partitions per host is specified as 4 in the new KafkaConfig.StaticHosts(hosts, 4) call.


How to specify the parallelism hint for the Kafka spout?

builder.setSpout("spout", spout, 4);

You specify it when adding the spout to the topology with the setSpout method. Here 4 is the parallelism hint.
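To see how the parallelism hint and the partition count interact, here is a small plain-Java sketch of one plausible way partitions could be divided among spout tasks. It assumes a simple round-robin scheme (task i takes partitions i, i + tasks, i + 2 × tasks, ...); whether a given storm-kafka version assigns partitions exactly this way is an assumption, and `PartitionAssignment` and `partitionsForTask` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the round-robin scheme is an assumption, not storm-kafka's confirmed logic.
final class PartitionAssignment {
    private PartitionAssignment() {}

    // Partitions owned by one task: taskIndex, taskIndex + totalTasks, and so on.
    static List<Integer> partitionsForTask(int taskIndex, int totalTasks, int totalPartitions) {
        if (totalTasks < 1 || taskIndex < 0 || taskIndex >= totalTasks || totalPartitions < 0) {
            throw new IllegalArgumentException("invalid task or partition counts");
        }
        List<Integer> owned = new ArrayList<Integer>();
        for (int p = taskIndex; p < totalPartitions; p += totalTasks) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        int totalPartitions = 4; // 1 host with 4 partitions per host
        int parallelismHint = 4; // as passed to setSpout
        for (int task = 0; task < parallelismHint; task++) {
            System.out.println("task " + task + " -> "
                    + partitionsForTask(task, parallelismHint, totalPartitions));
        }
    }
}
```

Under this scheme a hint of 4 with 4 partitions gives each task exactly one partition, while a hint of 10 with 2 partitions leaves tasks 2 through 9 with an empty list, which is why a hint above the partition count adds nothing.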


More links that might help

Understanding-the-parallelism-of-a-Storm-topology

what-is-the-task-in-twitter-storm-parallelism


Disclaimer: I am new to both Storm and Java, so please edit or add to this wherever required.
