Can kafka connect - mongo source run as cluster (tasks.max > 1)

Submitted by 孤人 on 2021-01-07 07:35:05

Question


I'm using the mongo-source connector, which is supported by kafka-connect. I found that one of the configuration options of the mongo source is tasks.max.

This means I can give the connector a tasks.max greater than 1, but I fail to understand what it will do behind the scenes.

If it creates multiple connectors that listen to the MongoDB change stream, then I will end up with duplicate messages. So, does mongo-source really have parallelism and work as a cluster? What does it do if tasks.max is more than 1?


Answer 1:


Mongo-source doesn't support tasks.max > 1. Even if you set it greater than 1, only one task will be pulling data from Mongo to Kafka.

How many tasks are created depends on the particular connector. The method List<Map<String, String>> Connector::taskConfigs(int maxTasks), which you override when implementing a connector, returns a list whose size determines the number of tasks. If you check the mongo-kafka source connector, you will see that it returns a singletonList.

https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/MongoSourceConnector.java#L47
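To illustrate why the size of the returned list matters, here is a minimal sketch in plain Java. It does not extend the real Kafka Connect Connector class and is not the mongo-kafka source code. The class name TaskConfigsSketch, the extra settings parameter, and the placeholder connection string are assumptions made for illustration. It only mimics a taskConfigs method that ignores maxTasks and returns a single-element list.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not the real Kafka Connect API or the mongo-kafka code.
public class TaskConfigsSketch {

    // Mimics Connector::taskConfigs(int maxTasks). The settings map is passed in
    // here only to keep the sketch self-contained; a real connector would keep
    // its settings from start(). maxTasks is deliberately ignored.
    public static List<Map<String, String>> taskConfigs(Map<String, String> settings, int maxTasks) {
        // One element in the list means one task, whatever tasks.max says.
        return Collections.singletonList(settings);
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("connection.uri", "mongodb://localhost:27017"); // placeholder value
        settings.put("tasks.max", "4");

        List<Map<String, String>> configs = taskConfigs(settings, 4);
        System.out.println("tasks.max requested: 4, task configs returned: " + configs.size());
    }
}
```

Because the framework starts one task per entry in that list, a connector written this way runs a single task even when tasks.max is 4, so only one reader is attached to the change stream.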



Source: https://stackoverflow.com/questions/59389861/can-kafka-connect-mongo-source-run-as-cluster-max-tasks-1
