Is it possible to add tasks dynamically at runtime in Apache Storm, not just rebalance executors?

Posted by 随声附和 on 2019-12-12 02:09:42

Question


I need functionality in Storm that, as far as I can tell from the docs, has not been implemented yet. I need to add more tasks at runtime without having to start with a large number of tasks, because a large initial number might cause performance issues. Running more than one task per executor does not increase the level of parallelism: an executor always has one thread that it uses for all of its tasks, which means that tasks run serially on an executor.

I know that the rebalance command can be used to add executors and worker processes at runtime, and that there is a rule that #executors <= #tasks, which means the number of tasks must stay fixed at runtime. Still, I'm curious how hard it would be (if not impossible) to add this feature to Storm.

Is there a way to implement this functionality in Storm, or can it not be done at all? If there is a way, please give me a clue how to do it.


Answer 1:


Not sure what you mean by "since those extra tasks run serially".

Tasks in Storm are used to exploit data parallelism. In theory it's possible to add code to change the number of tasks at runtime, but it would be a huge change and AFAIK there are no plans to add this feature.

Compare http://storm.apache.org/releases/1.0.3/Understanding-the-parallelism-of-a-Storm-topology.html

Because keys are assigned to tasks by hash, changing the number of tasks would require rehashing all keys to new tasks. If an operator builds up key-based internal state, this state would also need to be partitioned by key and redistributed accordingly.
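
To illustrate the rehashing problem, here is a minimal sketch in plain Java. It is a simplified model, not Storm's actual routing code: the class `TaskRehashSketch` and the methods `taskFor` and `countMoved` are hypothetical names, and it assumes a key is routed to the task at index "hash of the key modulo the number of tasks".

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of hash-based key-to-task routing (an assumption for
// illustration; this is not Storm's internal implementation).
class TaskRehashSketch {

    // Map a key to a task index in [0, numTasks) using its hash code.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Count the keys whose owning task changes when the task count changes.
    static int countMoved(List<String> keys, int oldTasks, int newTasks) {
        int moved = 0;
        for (String key : keys) {
            if (taskFor(key, oldTasks) != taskFor(key, newTasks)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d");
        for (String key : keys) {
            System.out.println(key + ": task " + taskFor(key, 4)
                    + " -> task " + taskFor(key, 5));
        }
        System.out.println("keys moved: " + countMoved(keys, 4, 5));
    }
}
```

With these four keys, going from 4 to 5 tasks moves three of them ("a", "b" and "c") to a different task. Any per-key state held by the old task would have to move along with the key, which is the redistribution cost described above.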



Source: https://stackoverflow.com/questions/42487777/is-it-possible-to-add-tasks-dynamically-at-runtime-in-apache-storm-not-just-reba
