apache-kafka-connect

Is there a way to configure Kafka Connect JMX metrics to be captured using jmx_exporter/Prometheus?

蓝咒 submitted on 2020-01-06 06:42:29
Question: I'm setting up monitoring for Kafka Connect in our Kafka ecosystem. I have enabled the JMX exporter for the Kafka brokers and it is working fine. Now I am trying to enable the JMX exporter for Kafka Connect, but it is a bit unclear where to start; I can only modify connect-distributed.sh to enable the change. Any pointers would be a great addition. kafka-run-class.sh was modified to enable jmx_exporter to emit JMX metrics on http://<host>:9304/metrics. I expect Kafka Connect to emit metrics on http://
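A common way to do this without editing the scripts is to attach the Prometheus JMX exporter as a Java agent through KAFKA_OPTS, which connect-distributed.sh picks up via kafka-run-class.sh. A minimal sketch follows; the agent jar path, config file, and port 9305 are assumptions (the port just needs to differ from the one the broker already uses on the same host).

# Load the JMX exporter agent into the Connect worker JVM; paths and port are placeholders.
export KAFKA_OPTS="-javaagent:/opt/jmx_exporter/jmx_prometheus_javaagent.jar=9305:/opt/jmx_exporter/kafka-connect.yml"
# connect-distributed.sh delegates to kafka-run-class.sh, which honours KAFKA_OPTS,
# so no script changes are needed.
bin/connect-distributed.sh config/connect-distributed.properties
# If this works, the worker's metrics should be scrapeable at http://<host>:9305/metrics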

Can Kafka connectors be configured via environment variables passed when launching Docker, or is curl the only way?

浪尽此生 submitted on 2020-01-05 05:50:10
Question: This is the Docker image we use to host Kafka Connect with the plugins:

FROM confluentinc/cp-kafka-connect:5.3.1
ENV CONNECT_PLUGIN_PATH=/usr/share/java

# JDBC-MariaDB
RUN wget -nv -P /usr/share/java/kafka-connect-jdbc/ https://downloads.mariadb.com/Connectors/java/connector-java-2.4.4/mariadb-java-client-2.4.4.jar

# SNMP Source
RUN wget -nv -P /tmp/ https://github.com/name/kafka-connect-snmp/releases/download/0.0.1.11/kafka-connect-snmp-0.0.1.11.tar.gz
RUN mkdir /tmp/kafka-connect-snmp &&
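For context: in the cp-kafka-connect image, CONNECT_* environment variables configure the worker (they are turned into connect-distributed.properties settings), while connector instances are still created through the Connect REST API. A hedged sketch follows; host names, topics, and connection details are placeholders, and a real deployment needs more CONNECT_* variables than shown.

# Worker-level settings come from environment variables.
docker run -d --name connect -p 8083:8083 \
-e CONNECT_BOOTSTRAP_SERVERS=kafka:9092 \
-e CONNECT_GROUP_ID=connect-cluster \
-e CONNECT_REST_ADVERTISED_HOST_NAME=connect \
my-custom-connect-image

# Connector instances are registered against the running worker with curl.
curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "mariadb-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mariadb://dbhost:3306/mydb",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "mariadb-"
  }
}'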

Can a Kafka Connector load its own name?

て烟熏妆下的殇ゞ submitted on 2020-01-05 05:17:10
Question: According to the Kafka documentation, connector configurations are simple key-value mappings. For standalone mode these are defined in a properties file and passed to the Connect process on the command line. Most configurations are connector-dependent, so they can't be outlined here. However, there are a few common options: name - a unique name for the connector; attempting to register again with the same name will fail. I have 10 connectors running in standalone mode like this: bin/connect
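For reference, each standalone connector is defined by its own properties file, and the name key in that file is part of the configuration the worker hands to the connector when it starts, so the connector should be able to read its own name back from that config map. Below is a minimal sketch using the stock FileStreamSource connector; file names, topics, and paths are placeholders.

# Each connector gets its own properties file; "name" is just another key in it.
cat > my-connector-1.properties <<'EOF'
name=my-connector-1
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/tmp/input-1.txt
topic=topic-1
EOF

# One standalone worker can run several connectors: list one properties file per connector.
bin/connect-standalone.sh config/connect-standalone.properties \
my-connector-1.properties my-connector-2.properties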

Reset the JDBC Kafka Connector to start pulling rows from the beginning of time?

天大地大妈咪最大 submitted on 2020-01-03 09:09:22
Question: The JDBC Kafka connector can make use of a primary key and a timestamp to determine which rows need to be processed. I'm looking for a way to reset the connector so that it will process from the beginning of time. Answer 1: Because the requirement is to run in distributed mode, the easiest thing to do is to update the connector name to a new value. This prompts a new entry to be made in the connect-offsets topic, since the worker treats it as a totally new connector. Then the connector should start reading
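A sketch of that rename approach via the Connect REST API; the host, connector names, and JDBC settings below are placeholders, not the asker's actual configuration.

# Optionally remove the old connector first.
curl -s -X DELETE http://localhost:8083/connectors/jdbc-source-v1

# Re-register the same configuration under a new name.
curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "jdbc-source-v2",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:sqlserver://dbhost:1433;databaseName=MYDB",
    "table.whitelist": "my_table",
    "mode": "timestamp+incrementing",
    "timestamp.column.name": "updated_at",
    "incrementing.column.name": "id"
  }
}'
# "jdbc-source-v2" has no entry in connect-offsets yet, so it is treated as a brand-new
# connector and starts pulling rows from the beginning.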

Kafka Streams table transformations

梦想的初衷 submitted on 2020-01-03 01:59:07
Question: I've got a table in SQL Server that I'd like to stream to a Kafka topic; the structure is as follows: (UserID, ReportID). This table is going to be continuously changed (records added and inserted, no updates). I'd like to transform it into this kind of structure and put it into Elasticsearch: { "UserID": 1, "Reports": [1, 2, 3, 4, 5, 6] } The examples I've seen so far are logs or click-streams and do not work in my case. Is this kind of use case possible at all? I could always just look at UserID

When does Kafka Leader Election happen?

喜你入骨 submitted on 2020-01-02 02:42:10
Question: When and how often does the Kafka high-level producer elect a leader? Does it do so before sending each message, or only once when the connection is created? Answer 1: Every broker has information about the list of topics (and partitions) and their leaders, which is kept up to date via ZooKeeper whenever a new leader is elected or the number of partitions changes. Thus, when the producer makes a call to one of the brokers, the broker responds with this metadata. Once the producer
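Not part of the original answer, but a quick way to see that per-partition leader metadata yourself; the topic name and bootstrap address are placeholders, and on older Kafka versions the tool takes --zookeeper instead of --bootstrap-server.

bin/kafka-topics.sh --describe --topic my-topic --bootstrap-server localhost:9092
# Illustrative output: each partition line names its current leader broker id, e.g.
# Topic: my-topic  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3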

Kafka Streams - the first WordCount example doesn't count correctly on the first pass

╄→гoц情女王★ submitted on 2019-12-25 09:09:02
Question: I'm studying Kafka Streams and I have a problem with the first WordCount example in Java 8, taken from the documentation, using the latest available versions of Kafka Streams, Kafka Connect, and the WordCount lambda-expressions example. I follow these steps: I create an input topic in Kafka and an output one, start the streaming app, and then feed the input topic by inserting some words from a .txt file. On the first count, in the output topic I see the words grouped correctly, but
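For reference, the documentation's quickstart reads the output topic with the console consumer configured with a String key deserializer and a Long value deserializer; because the count topic is an update stream, the same word can appear several times with increasing counts, which is easy to mistake for wrong counting on the first run. Topic and host names below follow the documented WordCount example and may differ from the asker's setup.

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic streams-wordcount-output \
--from-beginning \
--property print.key=true \
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
--property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer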

How to transform all timestamp fields when using Kafka Connect?

老子叫甜甜 submitted on 2019-12-25 02:53:55
Question: I am trying to convert all timestamp fields to a string type with the format yyyy-MM-dd HH:mm:ss. To transform multiple fields, I have to create a transform for each one individually:

...
"transforms": "tsFormat1,tsFormat2,...,tsFormatN",
"transforms.tsFormat1.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.tsFormat1.target.type": "string",
"transforms.tsFormat1.field": "ts_col1",
"transforms.tsFormat1.format": "yyyy-MM-dd HH:mm:ss",
"transforms.tsFormat2
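A complete, minimal sketch of that per-field pattern submitted through the REST API; the connector (the stock FileStreamSink here), topic, and column names ts_col1/ts_col2 are placeholders.

curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "demo-sink",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "topics": "demo-topic",
    "file": "/tmp/demo-sink.txt",
    "transforms": "tsFormat1,tsFormat2",
    "transforms.tsFormat1.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
    "transforms.tsFormat1.target.type": "string",
    "transforms.tsFormat1.field": "ts_col1",
    "transforms.tsFormat1.format": "yyyy-MM-dd HH:mm:ss",
    "transforms.tsFormat2.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
    "transforms.tsFormat2.target.type": "string",
    "transforms.tsFormat2.field": "ts_col2",
    "transforms.tsFormat2.format": "yyyy-MM-dd HH:mm:ss"
  }
}'
# The stock TimestampConverter handles one field per transform instance, so each
# timestamp column needs its own entry in the "transforms" list.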

org.apache.kafka.connect.errors.DataException: Invalid JSON for record default value: null

跟風遠走 submitted on 2019-12-25 01:45:49
Question: I have a Kafka Avro topic generated using KafkaAvroSerializer. My standalone properties are as below. I am using Confluent 4.0.0 to run Kafka Connect.

key.converter=io.confluent.connect.avro.AvroConverter
value.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=<schema_registry_hostname>:8081
value.converter.schema.registry.url=<schema_registry_hostname>:8081
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org
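One way to start debugging this kind of error is to pull the registered value schema from the Schema Registry and check its field defaults; the subject name below assumes the default TopicNameStrategy, and <topic-name> is a placeholder.

curl -s http://<schema_registry_hostname>:8081/subjects/<topic-name>-value/versions/latest
# In Avro, a field with "default": null is only valid when "null" is the first branch of
# the field's union type, e.g. ["null","string"]; a schema that violates this is one
# possible cause of "Invalid JSON for record default value: null".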

Cast numeric fields with Kafka Connect and table.whitelist

元气小坏坏 submitted on 2019-12-24 20:47:54
Question: I have created a source and a sink connector for Kafka Connect on Confluent 5.0 to push two SQL Server tables to my data lake. Here is my SQL Server table schema:

CREATE TABLE MYBASE.dbo.TABLE1 (
  id_field int IDENTITY(1,1) NOT NULL,
  my_numericfield numeric(24,6) NULL,
  time_field smalldatetime NULL,
  CONSTRAINT PK_CBMARQ_F_COMPTEGA PRIMARY KEY (id_field)
)
GO

My Cassandra schema:

create table TEST-TABLE1(my_numericfield decimal, id_field int, time_field timestamp, PRIMARY KEY (id_field));

Here is
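One knob worth checking for numeric(24,6) columns with the Confluent JDBC source connector is numeric.mapping, which controls whether NUMERIC columns come through as Connect's Decimal logical type (base64-encoded bytes in JSON) or get mapped to primitive types. A hedged sketch with placeholder host and connection details; very high-precision columns may still fall back to Decimal even with best_fit.

curl -s -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "sqlserver-source-table1",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:sqlserver://dbhost:1433;databaseName=MYBASE",
    "table.whitelist": "TABLE1",
    "mode": "incrementing",
    "incrementing.column.name": "id_field",
    "numeric.mapping": "best_fit"
  }
}'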