apache-kafka-connect

Kafka Connect JDBC vs Debezium CDC

Submitted by 北慕城南 on 2021-02-06 15:56:55
Question: What are the differences between the JDBC Connector and the Debezium SQL Server CDC Connector (or any other relational database connector), and when should I choose one over the other when looking for a solution to sync two relational databases? I'm not sure whether this discussion should be about CDC vs. the JDBC Connector in general rather than the Debezium SQL Server CDC Connector specifically, or even just Debezium; I'm open to editing this later, depending on the answers given (though my case is about a SQL Server sink). Sharing with you …
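
For context, a hedged sketch of how the two approaches are typically configured (hostnames, credentials, table and topic names below are placeholder assumptions, not taken from the question). The JDBC source connector polls tables on an interval and detects changes through an incrementing and/or timestamp column:

    {
      "name": "jdbc-source-sketch",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:sqlserver://localhost:1433;databaseName=inventory",
        "mode": "timestamp+incrementing",
        "timestamp.column.name": "updated_at",
        "incrementing.column.name": "id",
        "poll.interval.ms": "5000",
        "topic.prefix": "jdbc-"
      }
    }

Debezium instead reads the SQL Server transaction log and emits every committed insert, update, and delete:

    {
      "name": "debezium-source-sketch",
      "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": "localhost",
        "database.port": "1433",
        "database.user": "sa",
        "database.password": "********",
        "database.dbname": "inventory",
        "database.server.name": "server1",
        "table.include.list": "dbo.customers",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.inventory"
      }
    }

The practical difference that usually decides the choice: the polling model can miss deletes and intermediate updates that happen between polls, while log-based CDC captures each committed change.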

How to split records from one topic into different streams?

Submitted by ◇◆丶佛笑我妖孽 on 2021-02-05 12:07:18
Question: I have a single source CSV file containing records of different sizes, and every record is pushed into one source topic. I want to split the records from that source topic into different KStreams/KTables. I have a pipeline for one table load, where I push the records from the source topic into stream1 in delimited format and then into another stream in Avro format, which a JDBC sink connector then writes into a MySQL database. The pipeline needs …
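
As a hedged sketch of the splitting step (topic names, the record-type prefixes, and the use of String serdes are assumptions for illustration), the Kafka Streams branch() API can route each record from the source topic into its own KStream, and each branch can then be written to a separate topic feeding the Avro/JDBC-sink pipeline:

    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;

    public class SplitByRecordType {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> source = builder.stream("source-topic");

            // branch() sends each record to the first predicate it matches;
            // here the delimited record's leading field names the target table
            @SuppressWarnings("unchecked")
            KStream<String, String>[] branches = source.branch(
                    (key, value) -> value.startsWith("TABLE1|"),
                    (key, value) -> value.startsWith("TABLE2|"),
                    (key, value) -> true   // catch-all for unrecognized records
            );

            branches[0].to("table1-records");
            branches[1].to("table2-records");
            branches[2].to("unmatched-records");

            // builder.build() would then be handed to a KafkaStreams instance
            // together with the usual StreamsConfig properties
        }
    }

From Kafka 2.8 onward, split() with Branched is the non-deprecated equivalent of branch().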

Kafka Connect topic.key.ignore does not work as expected

Submitted by 倖福魔咒の on 2021-01-29 15:55:13
Question: As I understand it from the Kafka Connect documentation, this configuration should ignore the keys for the metricbeat and filebeat topics but not for alarms. However, Kafka Connect does not ignore any key. This is the full JSON config that I push to Kafka Connect over REST: { "auto.create.indices.at.start": false, "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "connection.url": "http://elasticsearch:9200", "connection.timeout.ms": 5000, "read.timeout.ms": 5000, …
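
For reference, a minimal sketch of how these two settings are meant to interact in the Elasticsearch sink (topic names are shortened stand-ins, not the question's real topics): key.ignore sets the connector-wide default, and topic.key.ignore lists the topics for which that default is overridden to true, so keys would be ignored for metricbeat and filebeat but honored for alarms. The entries must match the consumed topic names exactly.

    {
      "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
      "topics": "metricbeat,filebeat,alarms",
      "key.ignore": "false",
      "topic.key.ignore": "metricbeat,filebeat"
    }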

Kafka Connect SMT to add Kafka header fields

Submitted by 一曲冷凌霜 on 2021-01-29 13:59:01
Question: I need to find or write an SMT that will add header fields to a record. The record is missing some type fields and I want to add them. How exactly do you add a header within an SMT? All I have seen are record transforms like the one below, but what if it's the header I want to change or add a field to?

    private R applySchemaless(R record) {
        final Map<String, Object> value = requireMap(operatingValue(record), PURPOSE);
        // record.headers.add(Header) but how do I define the header
        // or record …
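
A minimal sketch of a header-adding SMT, assuming a hard-coded header name and value for illustration (a real transform would read them from its ConfigDef). The Headers object returned by record.headers() is mutable and offers typed add methods, so the record does not need to be rebuilt:

    import java.util.Map;
    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.transforms.Transformation;

    public class AddTypeHeader<R extends ConnectRecord<R>> implements Transformation<R> {

        @Override
        public void configure(Map<String, ?> configs) {
            // no configuration in this sketch
        }

        @Override
        public R apply(R record) {
            // appends a STRING-typed header; key and value stay untouched
            record.headers().addString("type", "my-type-value");
            return record;
        }

        @Override
        public ConfigDef config() {
            return new ConfigDef();
        }

        @Override
        public void close() { }
    }

If mutating in place feels wrong, ConnectRecord.newRecord(...) has an overload taking an Iterable<Header>, so a copy can be built instead; Kafka 3.0 later added a built-in InsertHeader transform for this exact case.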

ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone) java.lang.NoClassDefFoundError: io/debezium/util/IoUtil

Submitted by 风流意气都作罢 on 2021-01-29 13:53:34
Question: I'm trying to connect my SQL Server database to Kafka on Windows. I downloaded the Debezium jar files; they are in a folder named debezium-connector-sqlserver (screenshots: Kafka folder, Debezium folder). I also added the line plugin.path=C:\\current_kafka_2.12-2.0.0\\debezium-connector-sqlserver to connect-standalone.properties; that folder is where I put all the jar files I downloaded from Debezium. I also created a file named connect-mssql.properties and put this in it: name=inventory …
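
A hedged sketch of the usual fix (C:\connect-plugins is a hypothetical folder name): io.debezium.util.IoUtil lives in the debezium-core jar, and this NoClassDefFoundError typically means plugin.path points directly at the folder of jars, so Connect loads each jar as a separate isolated plugin. Pointing plugin.path at the parent directory lets all jars in the debezium-connector-sqlserver folder share one classloader:

    # connect-standalone.properties: name the directory that CONTAINS the
    # connector folder, not the folder holding the jars themselves
    plugin.path=C:\\connect-plugins
    # assumed layout: C:\connect-plugins\debezium-connector-sqlserver\*.jar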

Failed to find any class that implements Connector and which name matches io.confluent.connect.mqtt.MqttSourceConnector

Submitted by 对着背影说爱祢 on 2021-01-29 08:33:55
Question:

    curl -s -X POST -H 'Content-Type: application/json' http://localhost:8083/connectors -d '{
      "name" : "mqtt-source",
      "config" : {
        "connector.class" : "io.confluent.connect.mqtt.MqttSourceConnector",
        "tasks.max" : "1",
        "mqtt.server.uri" : "tcp://10.1.78.100:1883",
        "mqtt.topics" : "Essen/IMU/IMU01",
        "kafka.topics" : "Essen.IMU.IMU01"
      }
    }'

    {"error_code":500,"message":"Failed to find any class that implements Connector and which name matches io.confluent.connect.mqtt …
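
The 500 response means the worker cannot see the MQTT connector class on its plugin.path, i.e. the connector is not installed. A hedged sketch of the usual remedy, assuming the Confluent Hub client is available on the worker host:

    # install the connector into the worker's plugin path
    confluent-hub install confluentinc/kafka-connect-mqtt:latest

    # restart the Connect worker, then confirm the class is now visible
    curl -s http://localhost:8083/connector-plugins | grep -i mqtt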

How to set Kafka Connect auto.offset.reset with the REST API

Submitted by 佐手、 on 2021-01-29 08:30:58
Question: I have created a Kafka Connect sink that converts data to another storage system. I want auto.offset.reset to be latest when a new connector is created with the Kafka Connect REST API, so I set consumer.auto.offset.reset: latest in the config:

    { "name": "test_v14", "config": { "name": "test_v14", "consumer.auto.offset.reset": "latest", "connector.class": "...", ... } }

But when the task starts, the Kafka consumer still polls records from the earliest offset. Is there any other way to set auto.offset.reset to latest?

Answer 1: …
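
A hedged sketch of what normally makes this work: a plain consumer.auto.offset.reset prefix is only honored in the worker configuration, where it applies to every sink connector. Per-connector overrides need the consumer.override. prefix (Kafka 2.3+) and a worker that permits them:

    # worker config, e.g. connect-distributed.properties
    connector.client.config.override.policy=All

    {
      "name": "test_v14",
      "config": {
        "connector.class": "...",
        "consumer.override.auto.offset.reset": "latest"
      }
    }

Note that auto.offset.reset only takes effect when the connector's consumer group has no committed offsets; reusing an old connector name resumes from the stored offsets regardless of this setting.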

kafka-connect file-pulse connector standalone could not start (class not found exception on offset manager)

Submitted by 笑着哭i on 2021-01-29 06:59:46
Question: I want to use the FilePulse connector to load XML files into Kafka. Below is my environment: Win10 WSL with Ubuntu installed; downloaded Confluent Platform 5.5.1 (see "https://www.confluent.io/download/") and unpacked it; downloaded the version 1.5.2 zip file from GitHub (https://github.com/streamthoughts/kafka-connect-file-pulse/releases) and unzipped it; modified "connect-standalone.properties" under the Confluent path (etc/kafka/connect-standalone.properties) to include the path "/home/min/streamthoughts …
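
Without the full stack trace this is a guess, but startup class-not-found errors with FilePulse usually come down to classloader isolation: plugin.path must name the directory that contains the unzipped connector folder, so that all of the plugin's jars, including the one holding the offset-manager class, load together. A sketch with /home/min/plugins as a hypothetical location:

    # etc/kafka/connect-standalone.properties
    plugin.path=/home/min/plugins
    # assumed layout:
    #   /home/min/plugins/streamthoughts-kafka-connect-file-pulse-1.5.2/lib/*.jar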

Kafka Connect FileStreamSource ignores appended lines

Submitted by 回眸只為那壹抹淺笑 on 2021-01-29 05:50:43
Question: I'm working on an application to process logs with Spark, and I thought of using Kafka as a way to stream the data from the log file. Basically, I have a single log file (on the local file system) that is continuously updated with new logs, and Kafka Connect seems to be the perfect solution to get the data from the file along with the newly appended lines. I'm starting the servers with their default configurations using the following commands: ZooKeeper server: zookeeper-server-start.sh config …
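
For reference, a minimal standalone setup for this use case (file path and topic name are placeholders). The bundled FileStreamSource connector tails one local file into a topic and stores its byte offset, so after a restart it continues from where it stopped instead of re-reading the file:

    # connect-file-source.properties
    name=local-file-source
    connector.class=FileStreamSource
    tasks.max=1
    file=/path/to/app.log
    topic=connect-logs

started alongside the worker with:

    connect-standalone.sh config/connect-standalone.properties connect-file-source.properties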