apache-kafka-connect

Kafka Connector - distributed - load balancing tasks

狂风中的少年 submitted on 2019-12-11 18:04:18
Question: I am running a development environment for Confluent Kafka, Community edition, on Windows, version 3.0.1-2.11. I am trying to achieve load balancing of tasks between 2 instances of a connector. I am running Kafka ZooKeeper, the broker, REST services, and 2 instances of Connect in distributed mode on the same machine. The only difference between the properties files for the connectors is the REST port, since they are running on the same machine. I don't create topics for connector offsets, config, and status. Should I? I have custom
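For two distributed workers to join the same group and rebalance tasks between them, both worker properties files must share the same group.id and the same three internal topics. A minimal sketch of the shared settings (topic names and ports here are hypothetical examples, not taken from the question); note that on a release as old as Confluent 3.0.1 these internal topics generally had to be created manually as compacted topics, while newer Connect versions can create them automatically:

# shared by both worker properties files
bootstrap.servers=localhost:9092
group.id=connect-cluster-1
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
# the only per-instance difference when both run on one machine
rest.port=8083   # use e.g. 8084 in the second worker's file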

Stop the kafka connector

不想你离开。 submitted on 2019-12-11 17:48:24
Question: In a Kafka Connect SinkTask implementation, if there is an unavoidable exception and I invoke the stop() method from my code, does the connector stop altogether? Answer 1: Only the task that encountered the exception will stop. The Connect cluster can have multiple connectors, and those won't stop; and depending on your configuration, a single connector can have other tasks that would be assigned different, clean data they could keep processing. Source: https://stackoverflow.com
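To illustrate the answer: the idiomatic way to fail a task is not to call stop() yourself but to throw from put(). A minimal sketch, where writeToExternalSystem and isTransient are hypothetical placeholders:

import java.util.Collection;
import java.util.Map;
import org.apache.kafka.connect.errors.ConnectException;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class MySinkTask extends SinkTask {
    @Override
    public void start(Map<String, String> props) { /* open resources */ }

    @Override
    public void put(Collection<SinkRecord> records) {
        try {
            writeToExternalSystem(records);
        } catch (Exception e) {
            if (isTransient(e)) {
                // the framework retries the batch; the task keeps running
                throw new RetriableException(e);
            }
            // fails THIS task only; other tasks and connectors keep running
            throw new ConnectException(e);
        }
    }

    @Override
    public void stop() { /* close resources */ }

    @Override
    public String version() { return "0.0.1"; }

    private void writeToExternalSystem(Collection<SinkRecord> records) throws Exception {
        // hypothetical: deliver the batch to the target system
    }

    private boolean isTransient(Exception e) {
        return false; // hypothetical classification of retriable failures
    }
}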

Kafka Connect cassandra source -error for Decimal data type

白昼怎懂夜的黑 submitted on 2019-12-11 17:47:10
Question: I am using the Kafka Connect Cassandra source connector, version 1.0. I have a decimal data type column (price) in a Cassandra table, and the source connector writes it to the Kafka topic as JSON. It writes the decimal value in a string format like "price":"AA==". This now causes an error in my Spark streaming job when converting to float: "number format exception". Please suggest what may have gone wrong while writing the value to the Kafka topic. Thanks in advance. Answer 1: It looks like the known
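The "AA==" string is consistent with how Connect's JSON converter renders the logical Decimal type: the decimal's unscaled bytes are base64-encoded, so "AA==" is a single 0x00 byte, i.e. the value 0 at whatever scale the schema declares. A minimal sketch of decoding it on the consumer side, assuming that encoding (the scale of 2 is an invented example):

import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Base64;

public class DecimalDecode {
    public static void main(String[] args) {
        String encoded = "AA==";  // base64 for the single byte 0x00
        int scale = 2;            // assumption: the scale declared in the schema
        byte[] unscaled = Base64.getDecoder().decode(encoded);
        BigDecimal value = new BigDecimal(new BigInteger(unscaled), scale);
        System.out.println(value); // prints 0.00
    }
}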

kafka jdbc sink connector standalone error

江枫思渺然 submitted on 2019-12-11 17:41:59
Question: I am trying to insert data into a Postgres database from a topic in Kafka. I am using the following command to load it:

./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/sink-quickstart-mysql.properties

The sink-quickstart-mysql.properties file is as follows:

name=test-sink-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=third_topic
connection.url=jdbc:postgres://localhost:5432/postgres
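The question is truncated before the actual error, but one likely problem is already visible in the quoted file: PostgreSQL's JDBC driver registers the scheme jdbc:postgresql://, not jdbc:postgres://. A hedged correction (credentials are hypothetical):

connection.url=jdbc:postgresql://localhost:5432/postgres
connection.user=postgres      # hypothetical
connection.password=postgres  # hypothetical
auto.create=true              # let the sink create the target table if missing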

Adding 'for' loop with delay in kafka streams

孤者浪人 submitted on 2019-12-11 16:54:30
Question: Below is the code written to get the output as follows. My input will be in the form of JSON arrays; to separate the JSON arrays into JSON objects I wrote the following code:

KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> textlines = builder.stream("INTOPIC");
KStream<String, String> mstream = textlines
    .mapValues(value -> value.replace("[", ""))
    .mapValues(value -> value.replace("]", ""))
    .mapValues(value -> value.replaceAll("\\},\\{", "\\}\\},\\{\\{"))
    .flatMapValues
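The quoted chain is cut off at flatMapValues. The replaceAll step doubles the braces at each object boundary ("},{" becomes "}},{{") precisely so that a later split on "},{" leaves each fragment with balanced braces. A sketch of the assumed completion (not the asker's original code; OUTTOPIC is hypothetical):

import java.util.Arrays;

KStream<String, String> objects = mstream
        .flatMapValues(value -> Arrays.asList(value.split("\\},\\{")));
objects.to("OUTTOPIC"); // e.g. "[{a},{b}]" becomes two records: "{a}" and "{b}"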

Kafka Connect and Kafka Broker version compatibility

心不动则不痛 submitted on 2019-12-11 16:41:38
Question: We have a Kerberized Kafka cluster running brokers on Apache Kafka 0.11. This cluster is managed by a different team and we don't have any control over it. We are now trying to install a Kafka Connect cluster on our own K8S cluster. We were following this compatibility matrix: https://docs.confluent.io/current/installation/versions-interoperability.html According to it, we had to stick with Confluent Platform 3.3.3 images for the Schema Registry and Kafka Connect pods, since the Brokers

Kafka: Does Confluent’s HDFS connector support Snappy compression?

此生再无相见时 submitted on 2019-12-11 16:20:07
Question: I don't see any configuration options for compression in the HDFS connector docs: https://docs.confluent.io/current/connect/connect-hdfs/docs/configuration_options.html Does it support compression? If yes, what do I need to add to the properties file? Answer 1: Snappy compression was recently added to the HDFS Connector for Avro. To enable it you'll need to set the property avro.codec to snappy. With Parquet it has been available since the beginning, and it is the codec used when exporting Parquet files.
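Following the answer, a minimal sketch of an HDFS sink configuration with Snappy enabled for Avro output (the topic name and HDFS URL are hypothetical, and the format.class path may vary by connector version):

name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=my_topic
hdfs.url=hdfs://namenode:8020
format.class=io.confluent.connect.hdfs.avro.AvroFormat
avro.codec=snappy   # the property named in the answer
flush.size=3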

Kafka connect internals - how connectors and tasks got deployed around the connect cluster

橙三吉。 submitted on 2019-12-11 15:27:59
Question: I use Kafka Connect for different purposes and it's working fine. This is more of a curiosity question. Trying to figure it out by reading the code might take some time, so I'm asking here (but I'll try to read the Kafka code anyway). I know the Connector is the one responsible for giving/updating the configurations of the tasks, but what is it exactly? Is it a piece of code that runs on the Connect cluster? If yes, I imagine a worker initiated it, but does it pick one worker JVM arbitrarily?
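A Connector is indeed Java code: the framework instantiates it on one worker JVM in the cluster, and its main job is taskConfigs(), which fans the work out into per-task configuration maps that the workers then distribute and run. A minimal sketch, where the "task.partition" key and the do-nothing task are hypothetical:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class MySourceConnector extends SourceConnector {
    private Map<String, String> config;

    @Override
    public void start(Map<String, String> props) {
        // runs on exactly one worker JVM; it only records configuration,
        // while the actual data movement happens in the tasks
        this.config = props;
    }

    @Override
    public Class<? extends Task> taskClass() { return MySourceTask.class; }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        // each returned map becomes one task; the framework spreads
        // those tasks across the workers in the Connect cluster
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            Map<String, String> taskConfig = new HashMap<>(config);
            taskConfig.put("task.partition", String.valueOf(i));
            configs.add(taskConfig);
        }
        return configs;
    }

    @Override
    public void stop() { }

    @Override
    public ConfigDef config() { return new ConfigDef(); }

    @Override
    public String version() { return "0.0.1"; }

    public static class MySourceTask extends SourceTask {
        @Override public void start(Map<String, String> props) { }
        @Override public List<SourceRecord> poll() { return null; } // null = no data yet
        @Override public void stop() { }
        @Override public String version() { return "0.0.1"; }
    }
}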

PubSub Kafka Connect node connection end of file exception

半腔热情 submitted on 2019-12-11 15:17:49
Question: While running the PubSub Kafka connector using the command

.\bin\windows\connect-standalone.bat .\etc\kafka\WorkerConfig.properties .\etc\kafka\configSink.properties .\etc\kafka\configSource.properties

I get this error:

Sending metadata request {topics=[test]} to node -1
could not scan file META-INF/MANIFEST.MF in url file:/C:/confluent-3.3.0/bin/../share/java/kafka-serde-tools/commons-compress-1.8.1.jar with scanner SubTypesScanner
could not scan file META-INF/MANIFEST.MF in url file:/C:

Kafka connector and Schema Registry - Error Retrieving Avro Schema - Subject not found

会有一股神秘感。 submitted on 2019-12-11 11:24:18
Question: I have a topic that will eventually have lots of different schemas on it. For now it just has one. I've created a connect job via REST like this:

{
  "name": "com.mycompany.sinks.GcsSinkConnector-auth2",
  "config": {
    "connector.class": "com.mycompany.sinks.GcsSinkConnector",
    "topics": "auth.events",
    "flush.size": 3,
    "my.setting": "bar",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.deserializer": "org.apache.kafka.common.serialization.StringDerserializer",
    "value
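Given the title's "Subject not found" error, a plausible cause worth noting: with the default TopicNameStrategy, the Avro converter looks up the subject <topic>-value (here auth.events-value), and the lookup fails if nothing was registered under that name. When one topic carries many schemas, producers and the connector typically switch to RecordNameStrategy. A hedged sketch of the relevant converter settings (the registry URL is hypothetical, and exact support varies by version):

"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"value.converter.value.subject.name.strategy": "io.confluent.kafka.serializers.subject.RecordNameStrategy"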