apache-kafka-connect

How to connect Kafka with Elasticsearch?

拥有回忆 submitted on 2019-12-07 01:06:04
Question: I am new to Kafka. I use Kafka to collect NetFlow data through Logstash (that part works), and I want to send the data from Kafka to Elasticsearch, but there are some problems. My question is: how can I connect Kafka with Elasticsearch?

NetFlow-to-Kafka Logstash config:

    input {
      udp {
        host => "120.127.XXX.XX"
        port => 5556
        codec => netflow
      }
    }
    filter {
    }
    output {
      kafka {
        bootstrap_servers => "localhost:9092"
        topic_id => "test"
      }
      stdout { codec => rubydebug }
    }

Kafka-to-Elasticsearch Logstash config:

    input {
      kafka {
      }
    }
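A minimal sketch of the Kafka-to-Elasticsearch side, assuming Logstash 5+ (where the kafka input uses bootstrap_servers and topics), a local Elasticsearch on port 9200, and a hypothetical index name:

    input {
      kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["test"]
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "netflow-%{+YYYY.MM.dd}"
      }
      stdout { codec => rubydebug }
    }

Alternatively, the Kafka Connect Elasticsearch sink connector can move the same data without a second Logstash pipeline.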

How to use Kafka Connect for Cassandra without Confluent

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-07 00:01:05
Question: How can we use Kafka Connect with Cassandra without using the Confluent frameworks?

Answer 1: Kafka Connect is the framework. Confluent only offers connectors. If you don't want to use Confluent Open Source (but why wouldn't you?), you can use all those connectors with vanilla Apache Kafka, too. There are multiple Cassandra connectors available: https://www.confluent.io/product/connectors/ Btw: none of the listed Cassandra connectors is maintained by Confluent. Of course, you could also write your own.
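A minimal sketch of running such a connector with vanilla Apache Kafka in standalone mode; the plugin directory, connector class, and topic below are placeholders for whatever the chosen Cassandra connector documents:

    # config/connect-standalone.properties (worker config): point plugin.path at the downloaded jar(s)
    plugin.path=/opt/connect-plugins

    # cassandra-sink.properties (connector config):
    name=cassandra-sink
    connector.class=<class name documented by the connector>
    topics=my_topic

    # start the worker shipped with plain Apache Kafka
    bin/connect-standalone.sh config/connect-standalone.properties cassandra-sink.properties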

What is a simple, effective way to debug custom Kafka connectors?

流过昼夜 submitted on 2019-12-06 23:49:04
Question: I'm working on a couple of Kafka connectors and I don't see any errors in their creation/deployment in the console output; however, I am not getting the result that I'm looking for (no results whatsoever for that matter, desired or otherwise). I made these connectors based on Kafka's example FileStream connectors, so my debug technique was based on the use of the SLF4J Logger that is used in the example. I've searched for the log messages that I thought would be produced in the console output,
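One low-friction approach (a sketch, not the only option): raise the log level for your connector's package in the worker's Log4j configuration and run the worker in the foreground so the SLF4J output is visible. The package name below is a placeholder:

    # config/connect-log4j.properties
    log4j.logger.com.example.myconnector=DEBUG

    # run the worker in the foreground to see the logging output
    bin/connect-standalone.sh config/connect-standalone.properties my-connector.properties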

kafka connector debezium mongodb CDC update/$set message without filter(_id value)

送分小仙女□ submitted on 2019-12-06 14:04:12
Question: I am trying to set up syncing from MongoDB to Kudu with the Debezium MongoDB connector, but as the Debezium docs state, and as I also found when I tried it myself, there is no filter (_id value) in the Debezium MongoDB CDC update/$set message:

    {
      "after": null,
      "patch": "{\"$v\" : 1,\"$set\" : {\"_upts_ratio_average_points\" : {\"$numberLong\" : \"1564645156749\"},\"updatets\" : {\"$numberLong\" : \"1564645156749\"}}}",
      "source": {
        "version": "0.9.5.Final",
        "connector": "mongodb",
        "name": "promongodbdeb05",
        "rs":
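One detail worth noting (hedged, based on how Debezium describes its MongoDB change events): the document's _id is not part of the update's value payload, but it is carried in the Kafka message key, roughly of the form below, so a consumer can still tell which document the $set patch applies to:

    { "id" : "1004" }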

How to change the “kafka connect” component port?

…衆ロ難τιáo~ submitted on 2019-12-06 11:12:09
On port 8083 I am running InfluxDB, and I can even reach its GUI at http://localhost:8083. Now to Kafka: I am following the setup described at https://kafka.apache.org/quickstart.

Start ZooKeeper, which lives in the folder /opt/zookeeper-3.4.10, with the command:

    bin/zkServer.sh start

So ZooKeeper is started. Now start Kafka under the /opt/kafka_2.11-1.1.0 folder:

    bin/kafka-server-start.sh config/server.properties

Create a topic named "test" with a single partition and only one replica:

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic
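For reference, the Kafka Connect worker's REST port also defaults to 8083, which is the usual source of this conflict; a minimal sketch of moving it, assuming the worker properties shipped with the same Kafka installation:

    # config/connect-distributed.properties (or connect-standalone.properties)
    rest.port=8084

    # newer Kafka versions can use the listeners property instead:
    # listeners=http://0.0.0.0:8084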

Packaging a custom Java `partitioner.class` plugin for Kafka Connect in Confluent 4.1 + Kafka 1.1?

家住魔仙堡 submitted on 2019-12-06 11:09:19
Question: I've been successfully using a simple custom Partitioner class written in Java for a Kafka Connect sink on Confluent 3.2.x (Kafka 0.10.x). I want to upgrade to Confluent 4.1 (Kafka 1.1) and am experiencing errors. Kafka Connect's plugin loading mechanism seems to have been changed in CP 3.3.0. Previously, there was just the CLASSPATH option, but with CP 3.3.0+ there is a newer and recommended plugin.path mechanism. If I try to keep using the legacy CLASSPATH plugin mechanism, when I try to
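A sketch of the newer plugin.path layout, under the assumption (worth verifying for your setup) that a custom partitioner.class must sit inside the same plugin directory as the sink connector that references it, since each plugin gets an isolated classloader; the paths and jar name below are hypothetical:

    # worker configuration (e.g. etc/kafka/connect-distributed.properties)
    plugin.path=/opt/connect-plugins

    # place the partitioner jar next to the sink connector's jars, e.g.:
    #   /opt/connect-plugins/kafka-connect-s3/my-custom-partitioner.jar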

Kafka sink connector: No tasks assigned, even after restart

只谈情不闲聊 submitted on 2019-12-06 07:14:00
Question: I am using Confluent 3.2 in a set of Docker containers, one of which is running a kafka-connect worker. For reasons yet unclear to me, two of my four connectors - to be specific, hpgraphsl's MongoDB sink connector - stopped working. I was able to identify the main problem: the connectors did not have any tasks assigned, as could be seen by calling GET /connectors/{my_connector}/status. The other two connectors (of the same type) were not affected and were happily producing output. I tried
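For reference, a sketch of the REST calls usually tried in this situation, assuming the worker's REST API on localhost:8083 and a connector named my_connector:

    # check connector and task status
    curl -s http://localhost:8083/connectors/my_connector/status

    # restart the connector instance (this does not restart its tasks)
    curl -X POST http://localhost:8083/connectors/my_connector/restart

    # restart an individual task, e.g. task 0 (only possible if the task exists)
    curl -X POST http://localhost:8083/connectors/my_connector/tasks/0/restart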

Kafka Connect : How to fetch nested fields from Struct

爱⌒轻易说出口 submitted on 2019-12-06 04:41:44
I am using Kafka Connect to implement a Kafka-Elasticsearch connector. The producer sends a complex JSON to a Kafka topic, and my connector code uses it to persist the data to Elasticsearch. The connector receives the data in the form of a Struct (https://kafka.apache.org/0100/javadoc/org/apache/kafka/connect/data/Struct.html). I am able to get the field values of the Struct at the top level of the JSON, but not from the nested JSONs.

    {
      "after": {
        "test.test.employee.Value": {
          "id": 5671111,
          "name": {
            "string": "abc"
          }
        }
      },
      "op": "u",
      "ts_ms": {
        "long": 1474892835943
      }
    }

I am able to parse "op", but not "test
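A minimal Java sketch of walking nested fields with the Connect Struct API. Whether the union wrapper test.test.employee.Value appears as an extra Struct level depends on the converter in use, so the field names below are assumptions to adapt:

    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.sink.SinkRecord;

    public class NestedFieldSketch {
        // Hypothetical helper: pull nested values out of a record like the one above.
        static void extract(SinkRecord record) {
            Struct value = (Struct) record.value();   // top-level Struct
            Struct after = value.getStruct("after");  // nested Struct; may be null on deletes
            if (after != null) {
                Object id = after.get("id");              // avoids assuming int32 vs int64
                Struct name = after.getStruct("name");    // another nested Struct
                String nameStr = (name != null) ? name.getString("string") : null;
                System.out.println("id=" + id + ", name=" + nameStr);
            }
        }
    }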

Restarting Kafka Connect S3 Sink Task Loses Position, Completely Rewrites everything

不问归期 submitted on 2019-12-06 00:26:16
Question: After restarting a Kafka Connect S3 sink task, it restarted writing all the way from the beginning of the topic and wrote duplicate copies of older records. In other words, Kafka Connect seemed to lose its place. So, I imagine that Kafka Connect stores current offset position information in the internal connect-offsets topic. That topic is empty, which I presume is part of the problem. The other two internal topics, connect-statuses and connect-configs, are not empty. connect-statuses has 52
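One detail worth checking here (hedged): the connect-offsets topic is only used by source connectors, so it being empty is expected when only sink connectors are running; a sink connector's position is committed as regular consumer offsets under a consumer group named connect-<connector name>. A sketch of inspecting it, assuming a broker on localhost:9092 and a connector named my-s3-sink:

    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --describe --group connect-my-s3-sink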

Debezium flush timeout and OutOfMemoryError errors with MySQL

心已入冬 submitted on 2019-12-05 18:37:19
I am using Debezium 0.7 to read from MySQL but am getting flush timeout and OutOfMemoryError errors in the initial snapshot phase. Looking at the logs below, it seems like the connector is trying to write too many messages in one go:

    WorkerSourceTask{id=accounts-connector-0} flushing 143706 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
    WorkerSourceTask{id=accounts-connector-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
    Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
    WorkerSourceTask{id
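A hedged sketch of the knobs usually adjusted for this symptom; the values below are assumptions to tune against the actual snapshot volume:

    # Debezium connector config (JSON sent to the Connect REST API) - smaller batches and queue:
    #   "max.batch.size": "2048",
    #   "max.queue.size": "8192"

    # Connect worker config (e.g. config/connect-distributed.properties) - more time to flush offsets:
    offset.flush.timeout.ms=60000

    # more heap for the Connect worker, exported before starting it:
    export KAFKA_HEAP_OPTS="-Xms1g -Xmx4g"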