apache-kafka-connect

Write a custom Kafka Connect single message transform

Submitted by 五迷三道 on 2021-01-24 11:47:45
Question: I have configured a Kafka Connect MongoDB sink and I want to transform the message by implementing some custom logic. Is Kafka Connect limited to its built-in SMTs, or is it possible to write a custom SMT? If not, how can I achieve this? Through Streams?

Answer 1: "Is Kafka Connect limited to in-built SMTs?" No, it is not. You can create your own and add them to your plugin path. Transformations are compiled as JARs and are made available to Kafka Connect via the plugin.path specified in the Connect worker configuration.
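A minimal sketch of what such a custom transform can look like, assuming Kafka Connect's standard Transformation interface; the package and class names here are invented for illustration:

```java
package com.example.smt; // hypothetical package name

import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Minimal pass-through SMT; real per-record logic would go in apply().
public class MyCustomTransform<R extends ConnectRecord<R>> implements Transformation<R> {

    @Override
    public void configure(Map<String, ?> configs) {
        // read any transform-specific settings here
    }

    @Override
    public R apply(R record) {
        // custom message logic goes here; this sketch returns the record unchanged
        return record;
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void close() {
        // release resources acquired in configure(), if any
    }
}
```

Packaged as a JAR and placed under plugin.path, the transform is then referenced from the connector configuration, e.g. transforms=mine and transforms.mine.type=com.example.smt.MyCustomTransform (names hypothetical).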

Kafka Connect S3 Connector OutOfMemory errors with TimeBasedPartitioner

Submitted by 邮差的信 on 2021-01-21 03:51:19
Question: I'm currently working with the Kafka Connect S3 Sink Connector 3.3.1 to copy Kafka messages over to S3, and I get OutOfMemory errors when processing late data. I know it looks like a long question, but I tried my best to make it clear and simple to understand. I highly appreciate your help. High-level info: the connector does a simple byte-for-byte copy of the Kafka messages and adds the length of the message at the beginning of the byte array (for decompression purposes). This is the role of …
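The framing the question describes amounts to something like the sketch below; this illustrates the idea only and is not the connector's actual code:

```java
import java.nio.ByteBuffer;

// Prepend the 4-byte length of the raw Kafka message to the payload so the
// records can be split apart again after they are downloaded from S3.
public final class LengthPrefixFramer {

    public static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(Integer.BYTES + payload.length)
                .putInt(payload.length) // length header
                .put(payload)           // raw message bytes
                .array();
    }
}
```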

Can a Kafka Connect Mongo source run as a cluster (tasks.max > 1)?

Submitted by 孤人 on 2021-01-07 07:35:05
Question: I'm using the following mongo-source, which is supported by kafka-connect. I found that one of the configurations of the mongo source (from here) is tasks.max. This means I can give the connector a tasks.max greater than 1, but I fail to understand what it will do behind the scenes. If it creates multiple connectors listening to the MongoDB change stream, I will end up with duplicate messages. So, does mongo-source really have parallelism and work as a cluster? What does it do if it has …
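For context, tasks.max is only an upper bound: a connector's taskConfigs() decides how many tasks actually run, and a connector whose work cannot be split (such as one reading a single change stream) can simply return one task configuration. The skeleton below is illustrative only and is not the MongoDB connector's real code:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.kafka.connect.source.SourceConnector;

// Illustrative skeleton: tasks.max is just an upper bound. A connector that
// cannot split its work can ignore it and run a single task, avoiding
// duplicate events. The remaining SourceConnector methods are omitted here.
public abstract class SingleTaskSourceConnector extends SourceConnector {

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        // Deliberately ignore maxTasks: exactly one task configuration.
        return Collections.singletonList(taskConfig());
    }

    // Hypothetical helper building the single task's configuration.
    protected abstract Map<String, String> taskConfig();
}
```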

Kafka Connect consumer referencing offset and storing in message

Submitted by 瘦欲@ on 2021-01-04 06:41:49
Question: If I am using kafka-connect to consume messages and store them in S3 (using the kafka-connect S3 connector), is there any way I can store the message offset along with the event payload? I would like to have this data to put some order on the messages, and also to check whether there are any gaps or duplicates in the messages I have received (e.g. if my consumer offsets get accidentally clobbered and I restart kafka-connect). Is this possible, or should I write a custom …
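One option worth checking first is Connect's built-in InsertField transform, which exposes an offset.field setting for sink connectors. Failing that, a custom sink-side SMT can copy SinkRecord.kafkaOffset() into the value before it is written to S3. The sketch below assumes schemaless (Map) values; Struct values with schemas would need extra handling, and the class and field names are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Rough sketch: adds the record's Kafka offset to schemaless (Map) values.
public class InsertOffset implements Transformation<SinkRecord> {

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed in this sketch
    }

    @Override
    @SuppressWarnings("unchecked")
    public SinkRecord apply(SinkRecord record) {
        if (!(record.value() instanceof Map)) {
            return record; // only schemaless map values are handled here
        }
        Map<String, Object> value = new HashMap<>((Map<String, Object>) record.value());
        value.put("kafka_offset", record.kafkaOffset());
        return record.newRecord(record.topic(), record.kafkaPartition(),
                record.keySchema(), record.key(),
                record.valueSchema(), value,
                record.timestamp());
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef();
    }

    @Override
    public void close() {
        // nothing to release
    }
}
```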