apache-kafka-connect

How to connect Kafka with Elasticsearch?

拥有回忆 submitted on 2019-12-07 01:06:04
Question: I am new to Kafka. I use Kafka to collect NetFlow data through Logstash (that part works), and I want to send the data from Kafka to Elasticsearch, but there are some problems. My question is: how can I connect Kafka with Elasticsearch?

NetFlow-to-Kafka Logstash config:

    input {
      udp {
        host => "120.127.XXX.XX"
        port => 5556
        codec => netflow
      }
    }
    filter {
    }
    output {
      kafka {
        bootstrap_servers => "localhost:9092"
        topic_id => "test"
      }
      stdout { codec => rubydebug }
    }

Kafka-to-Elasticsearch Logstash config:

    input {
      kafka {
      }
    }
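A minimal sketch of the Kafka-to-Elasticsearch side, assuming Logstash 5+ (where the kafka input uses bootstrap_servers and topics), a local Elasticsearch on port 9200, and a hypothetical index name:

    input {
      kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["test"]
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "netflow-%{+YYYY.MM.dd}"
      }
      stdout { codec => rubydebug }
    }

Alternatively, the Kafka Connect Elasticsearch sink connector can move the same data without a second Logstash pipeline.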

How to use Kafka Connect for Cassandra without Confluent

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-07 00:01:05
Question: How can we use Kafka Connect with Cassandra without using the Confluent frameworks?

Answer 1: Kafka Connect is the framework. Confluent only offers connectors. If you don't want to use Confluent Open Source (but why wouldn't you?), you can use all those connectors with vanilla Apache Kafka, too. There are multiple Cassandra connectors available: https://www.confluent.io/product/connectors/ Btw: none of the listed Cassandra connectors is maintained by Confluent. Of course, you could also write your own.
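A minimal sketch of running such a connector with vanilla Apache Kafka in standalone mode; the plugin directory, connector class, and topic below are placeholders for whatever the chosen Cassandra connector documents:

    # config/connect-standalone.properties (worker config): point plugin.path at the downloaded jar(s)
    plugin.path=/opt/connect-plugins

    # cassandra-sink.properties (connector config):
    name=cassandra-sink
    connector.class=<class name documented by the connector>
    topics=my_topic

    # start the worker shipped with plain Apache Kafka
    bin/connect-standalone.sh config/connect-standalone.properties cassandra-sink.properties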

What is a simple, effective way to debug custom Kafka connectors?

流过昼夜 submitted on 2019-12-06 23:49:04
Question: I'm working on a couple of Kafka connectors and I don't see any errors in their creation/deployment in the console output; however, I am not getting the result that I'm looking for (no results whatsoever for that matter, desired or otherwise). I made these connectors based on Kafka's example FileStream connectors, so my debug technique was based on the use of the SLF4J Logger that is used in the example. I've searched for the log messages that I thought would be produced in the console output,
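One low-friction approach (a sketch, not the only option): raise the log level for your connector's package in the worker's Log4j configuration and run the worker in the foreground so the SLF4J output is visible. The package name below is a placeholder:

    # config/connect-log4j.properties
    log4j.logger.com.example.myconnector=DEBUG

    # run the worker in the foreground to see the logging output
    bin/connect-standalone.sh config/connect-standalone.properties my-connector.properties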

kafka connector debezium mongodb CDC update/$set message without filter(_id value)

送分小仙女□ submitted on 2019-12-06 14:04:12
Question: I am trying to set up syncing from MongoDB to Kudu with the Debezium MongoDB connector, but as the Debezium docs state, and as I also found when I tried it myself, there is no filter (_id value) in the Debezium MongoDB CDC update/$set message:

    {
      "after": null,
      "patch": "{\"$v\" : 1,\"$set\" : {\"_upts_ratio_average_points\" : {\"$numberLong\" : \"1564645156749\"},\"updatets\" : {\"$numberLong\" : \"1564645156749\"}}}",
      "source": {
        "version": "0.9.5.Final",
        "connector": "mongodb",
        "name": "promongodbdeb05",
        "rs":
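One detail worth noting (hedged, based on how Debezium describes its MongoDB change events): the document's _id is not part of the update's value payload, but it is carried in the Kafka message key, roughly of the form below, so a consumer can still tell which document the $set patch applies to:

    { "id" : "1004" }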

How to change the “kafka connect” component port?

…衆ロ難τιáo~ submitted on 2019-12-06 11:12:09
On port 8083 I am running InfluxDB, and I can even reach its GUI at http://localhost:8083. Now to Kafka: I am following the setup described at https://kafka.apache.org/quickstart.

Start ZooKeeper, which lives in the folder /opt/zookeeper-3.4.10, with the command:

    bin/zkServer.sh start

So ZooKeeper is started. Now start Kafka under the /opt/kafka_2.11-1.1.0 folder:

    bin/kafka-server-start.sh config/server.properties

Create a topic named "test" with a single partition and only one replica:

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic
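For reference, the Kafka Connect worker's REST port also defaults to 8083, which is the usual source of this conflict; a minimal sketch of moving it, assuming the worker properties shipped with the same Kafka installation:

    # config/connect-distributed.properties (or connect-standalone.properties)
    rest.port=8084

    # newer Kafka versions can use the listeners property instead:
    # listeners=http://0.0.0.0:8084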

Packaging a custom Java `partitioner.class` plugin for Kafka Connect in Confluent 4.1 + Kafka 1.1?

家住魔仙堡 submitted on 2019-12-06 11:09:19
Question: I've been successfully using a simple custom Partitioner class written in Java for a Kafka Connect sink on Confluent 3.2.x (Kafka 0.10.x). I want to upgrade to Confluent 4.1 (Kafka 1.1) and am experiencing errors. Kafka Connect's plugin loading mechanism seems to have been changed in CP 3.3.0. Previously, there was just the CLASSPATH option, but with CP 3.3.0+ there is a newer and recommended plugin.path mechanism. If I try to keep using the legacy CLASSPATH plugin mechanism, when I try to
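A sketch of the newer plugin.path layout, under the assumption (worth verifying for your setup) that a custom partitioner.class must sit inside the same plugin directory as the sink connector that references it, since each plugin gets an isolated classloader; the paths and jar name below are hypothetical:

    # worker configuration (e.g. etc/kafka/connect-distributed.properties)
    plugin.path=/opt/connect-plugins

    # place the partitioner jar next to the sink connector's jars, e.g.:
    #   /opt/connect-plugins/kafka-connect-s3/my-custom-partitioner.jar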

Kafka sink connector: No tasks assigned, even after restart

只谈情不闲聊 submitted on 2019-12-06 07:14:00
Question: I am using Confluent 3.2 in a set of Docker containers, one of which is running a kafka-connect worker. For reasons yet unclear to me, two of my four connectors - to be specific, hpgraphsl's MongoDB sink connector - stopped working. I was able to identify the main problem: the connectors did not have any tasks assigned, as could be seen by calling GET /connectors/{my_connector}/status. The other two connectors (of the same type) were not affected and were happily producing output. I tried
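For reference, a sketch of the REST calls usually tried in this situation, assuming the worker's REST API on localhost:8083 and a connector named my_connector:

    # check connector and task status
    curl -s http://localhost:8083/connectors/my_connector/status

    # restart the connector instance (this does not restart its tasks)
    curl -X POST http://localhost:8083/connectors/my_connector/restart

    # restart an individual task, e.g. task 0 (only possible if the task exists)
    curl -X POST http://localhost:8083/connectors/my_connector/tasks/0/restart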

Kafka Connect : How to fetch nested fields from Struct

爱⌒轻易说出口 submitted on 2019-12-06 04:41:44
I am using Kafka Connect to implement a Kafka-Elasticsearch connector. The producer sends a complex JSON to a Kafka topic, and my connector code uses it to persist the data to Elasticsearch. The connector receives the data in the form of a Struct (https://kafka.apache.org/0100/javadoc/org/apache/kafka/connect/data/Struct.html). I am able to get the field values of the Struct at the top level of the JSON, but not from the nested JSONs.

    {
      "after": {
        "test.test.employee.Value": {
          "id": 5671111,
          "name": {
            "string": "abc"
          }
        }
      },
      "op": "u",
      "ts_ms": {
        "long": 1474892835943
      }
    }

I am able to parse "op", but not "test
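A minimal Java sketch of walking nested fields with the Connect Struct API. Whether the union wrapper test.test.employee.Value appears as an extra Struct level depends on the converter in use, so the field names below are assumptions to adapt:

    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.sink.SinkRecord;

    public class NestedFieldSketch {
        // Hypothetical helper: pull nested values out of a record like the one above.
        static void extract(SinkRecord record) {
            Struct value = (Struct) record.value();   // top-level Struct
            Struct after = value.getStruct("after");  // nested Struct; may be null on deletes
            if (after != null) {
                Object id = after.get("id");              // avoids assuming int32 vs int64
                Struct name = after.getStruct("name");    // another nested Struct
                String nameStr = (name != null) ? name.getString("string") : null;
                System.out.println("id=" + id + ", name=" + nameStr);
            }
        }
    }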

Restarting Kafka Connect S3 Sink Task Loses Position, Completely Rewrites everything

不问归期 submitted on 2019-12-06 00:26:16
Question: After restarting a Kafka Connect S3 sink task, it restarted writing all the way from the beginning of the topic and wrote duplicate copies of older records. In other words, Kafka Connect seemed to lose its place. So, I imagine that Kafka Connect stores current offset position information in the internal connect-offsets topic. That topic is empty, which I presume is part of the problem. The other two internal topics, connect-statuses and connect-configs, are not empty. connect-statuses has 52
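One detail worth checking here (hedged): the connect-offsets topic is only used by source connectors, so it being empty is expected when only sink connectors are running; a sink connector's position is committed as regular consumer offsets under a consumer group named connect-<connector name>. A sketch of inspecting it, assuming a broker on localhost:9092 and a connector named my-s3-sink:

    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --describe --group connect-my-s3-sink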

Debezium flush timeout and OutOfMemoryError errors with MySQL

心已入冬 submitted on 2019-12-05 18:37:19
I am using Debezium 0.7 to read from MySQL but am getting flush timeout and OutOfMemoryError errors in the initial snapshot phase. Looking at the logs below, it seems like the connector is trying to write too many messages in one go:

    WorkerSourceTask{id=accounts-connector-0} flushing 143706 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
    WorkerSourceTask{id=accounts-connector-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
    Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
    WorkerSourceTask{id
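A hedged sketch of the knobs usually adjusted for this symptom; the values below are assumptions to tune against the actual snapshot volume:

    # Debezium connector config (JSON sent to the Connect REST API) - smaller batches and queue:
    #   "max.batch.size": "2048",
    #   "max.queue.size": "8192"

    # Connect worker config (e.g. config/connect-distributed.properties) - more time to flush offsets:
    offset.flush.timeout.ms=60000

    # more heap for the Connect worker, exported before starting it:
    export KAFKA_HEAP_OPTS="-Xms1g -Xmx4g"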