apache-kafka

Unable to synchronise Kafka and MQ transactions using ChainedKafkaTransaction

自作多情 posted on 2021-02-11 17:50:43
Question: We have a Spring Boot application which consumes messages from IBM MQ, does some transformation, and publishes the result to a Kafka topic. We use https://spring.io/projects/spring-kafka for this. I am aware that Kafka does not support XA; however, in the documentation I found some inputs about using a ChainedKafkaTransactionManager to chain multiple transaction managers and synchronise the transactions. The same documentation also provides an example of how to synchronise Kafka and
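
For context (not quoted from the question), a minimal sketch of the kind of wiring the spring-kafka documentation describes; the bean names, generics, and the assumption that Spring Boot already supplies the JMS ConnectionFactory and a transactional ProducerFactory are illustrative, not taken from the question:

    import javax.jms.ConnectionFactory;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jms.connection.JmsTransactionManager;
    import org.springframework.kafka.core.ProducerFactory;
    import org.springframework.kafka.transaction.ChainedKafkaTransactionManager;
    import org.springframework.kafka.transaction.KafkaTransactionManager;

    @Configuration
    public class TxConfig {

        @Bean
        public KafkaTransactionManager<String, String> kafkaTransactionManager(
                ProducerFactory<String, String> producerFactory) {
            // requires a transactional producer factory (transaction-id-prefix configured)
            return new KafkaTransactionManager<>(producerFactory);
        }

        @Bean
        public JmsTransactionManager jmsTransactionManager(ConnectionFactory connectionFactory) {
            return new JmsTransactionManager(connectionFactory);
        }

        @Bean
        public ChainedKafkaTransactionManager<String, String> chainedTransactionManager(
                JmsTransactionManager jmsTm, KafkaTransactionManager<String, String> kafkaTm) {
            // transactions are started in the given order and committed in reverse order;
            // this is best-effort synchronisation, not a true XA transaction
            return new ChainedKafkaTransactionManager<>(jmsTm, kafkaTm);
        }
    }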

OutOfMemoryError when restarting my Kafka Streams application

假如想象 posted on 2021-02-11 16:53:30
Question: I have a Kafka Streams app (Kafka Streams 2.1 + Kafka broker 2.0) which does an aggregation based on TimeWindows, and I use the suppress operator to suppress the result's output. Everything works well until I restart my app: it resets the offset of KTABLE-SUPPRESS-STATE-STORE to 0 to restore the suppression state, as expected. But each time I restart it, it throws an OutOfMemoryError. I thought maybe the heap size was not enough, so I used a larger Xmx/Xms; it works for one or two restarts,
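
For context (a sketch, not code from the question; topic names and window sizes are assumptions), this is the shape of a windowed aggregation with suppress. The point to note is that untilWindowCloses with BufferConfig.unbounded() keeps every still-open window in heap memory, and that in-memory buffer is what gets rebuilt from the KTABLE-SUPPRESS-STATE-STORE changelog on every restart:

    import java.time.Duration;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.Topology;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Suppressed;
    import org.apache.kafka.streams.kstream.TimeWindows;
    import org.apache.kafka.streams.kstream.Windowed;

    public class SuppressTopology {

        // Builds a windowed count whose results are held back until the window closes.
        public static Topology build() {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> events = builder.stream("input-topic"); // illustrative topic

            KTable<Windowed<String>, Long> counts = events
                    .groupByKey()
                    .windowedBy(TimeWindows.of(Duration.ofMinutes(5)).grace(Duration.ofMinutes(1)))
                    .count()
                    // untilWindowCloses with an unbounded buffer keeps every pending window
                    // in heap memory; this buffer is restored from the
                    // KTABLE-SUPPRESS-STATE-STORE changelog on restart
                    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));

            counts.toStream().foreach((windowedKey, count) ->
                    System.out.println(windowedKey + " -> " + count)); // placeholder sink
            return builder.build();
        }
    }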

Calculate delta Offsets Kafka Java

。_饼干妹妹 posted on 2021-02-11 16:39:09
Question: In a Spring project I used Kafka, and now I want to write a method which takes "TopicName" and "GroupId" as parameters and calculates the difference between the last offsets of the topic partitions and the offsets consumed by the group. I can already get the last offsets; now I need to get the consumed offsets to calculate the difference. public ResponseEntity<Offsets> deltaoffsets (@RequestParam( name = "groupId") String groupId, @RequestParam( name = "topic") String topic) { Map<String,Object>
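
One way to get the missing piece (not from the question; class, method, and parameter names are illustrative) is to combine AdminClient.listConsumerGroupOffsets, which returns the offsets committed by the group, with KafkaConsumer.endOffsets, which returns the last offsets of the topic partitions:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutionException;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class LagCalculator {

        // Per-partition lag = (last offset of the partition) - (offset committed by the group).
        public static Map<TopicPartition, Long> delta(String bootstrapServers, String groupId, String topic)
                throws ExecutionException, InterruptedException {
            Properties adminProps = new Properties();
            adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);

            try (AdminClient admin = AdminClient.create(adminProps)) {
                // offsets committed by the consumer group
                Map<TopicPartition, OffsetAndMetadata> committed =
                        admin.listConsumerGroupOffsets(groupId).partitionsToOffsetAndMetadata().get();

                Properties consumerProps = new Properties();
                consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
                consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
                    List<TopicPartition> partitions = new ArrayList<>();
                    consumer.partitionsFor(topic)
                            .forEach(p -> partitions.add(new TopicPartition(topic, p.partition())));

                    // last offsets (log end offsets) of the topic partitions
                    Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);

                    Map<TopicPartition, Long> lag = new HashMap<>();
                    for (TopicPartition tp : partitions) {
                        long consumed = committed.containsKey(tp) ? committed.get(tp).offset() : 0L;
                        lag.put(tp, endOffsets.getOrDefault(tp, 0L) - consumed);
                    }
                    return lag;
                }
            }
        }
    }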

Kafka messages getting lost when consumer goes down

人盡茶涼 posted on 2021-02-11 15:47:23
Question: Hello, I am writing a Kafka consumer-producer using Spring Cloud Stream. Inside my consumer I save my data to a database; if the database goes down I exit the application manually. After restarting the application, if the database is still down, the application gets stopped again. Now if I restart the application for the third time, the messages received in the middle interval (between the two failures) are lost; the Kafka consumer takes the latest message, and it also skips the message on which
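
The excerpt does not say whether a consumer group is configured; as an assumption worth checking, an anonymous group (no group set) starts from the latest offset on every restart, which would produce exactly this skipping behaviour. A minimal sketch of the binder properties that pin a durable group and an explicit initial offset (the binding name "input" and the topic are illustrative):

    # application.properties - sketch only; binding name and topic are assumptions
    spring.cloud.stream.bindings.input.destination=my-topic
    spring.cloud.stream.bindings.input.group=my-consumer-group
    # startOffset only applies the first time the group is created
    spring.cloud.stream.kafka.bindings.input.consumer.startOffset=earliest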

Error when connecting Spark Structured Streaming + Kafka

别说谁变了你拦得住时间么 posted on 2021-02-11 15:45:49
Question: I'm trying to connect my Structured Streaming Spark 2.4.5 with Kafka, but every time I try, this Data Source Provider error appears. Here are my Scala code and my sbt build: import org.apache.spark.sql._ import org.apache.spark.sql.types._ import org.apache.spark.sql.functions._ import org.apache.spark.sql.streaming.Trigger object streaming_app_demo { def main(args: Array[String]): Unit = { println("Spark Structured Streaming with Kafka Demo Application Started ...") val KAFKA_TOPIC

Kafka consumer.poll returns no data

余生颓废 posted on 2021-02-11 15:19:31
Question: I have two Kafka (2.11-0.11.0.1) brokers. The default replication factor of topics is set to 2. Producers write data only to partition zero. I have a scheduled executor which runs the task periodically. When it consumes a topic with a small number of records per minute (100 per minute), it works like a charm. But for huge topics (10K per minute) the poll method returns no data. The task is: import org.apache.kafka.clients.consumer.Consumer; import org.apache.kafka.clients.consumer
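
Not from the question, but for reference a minimal sketch of a polling task (group, topic, timeouts, and the class name are illustrative assumptions). A single short poll right after subscribing often returns nothing because the first poll only joins the group and starts fetching, so this sketch keeps polling within a small time budget:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class PollTask implements Runnable {

        // not closed here: the sketch assumes one long-lived consumer per scheduled task
        private final KafkaConsumer<String, String> consumer;

        public PollTask(String bootstrapServers, String groupId, String topic) {
            Properties props = new Properties();
            props.put("bootstrap.servers", bootstrapServers);
            props.put("group.id", groupId);
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            props.put("max.poll.records", "500");   // cap the batch size per poll
            consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList(topic));
        }

        @Override
        public void run() {
            long deadline = System.currentTimeMillis() + 10_000;   // overall budget for this run
            while (System.currentTimeMillis() < deadline) {
                // keep polling until data arrives or the budget is used up
                ConsumerRecords<String, String> records = consumer.poll(1000);
                if (!records.isEmpty()) {
                    records.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
                    consumer.commitSync();
                    break;
                }
            }
        }
    }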

IIDR CDC with Transaction Details to Kafka

末鹿安然 posted on 2021-02-11 15:16:00
Question: We are working on a POC to keep our DB data in sync, from our in-house DB2 to an external MS-SQL database. We are using QREP for replication/CDC. For the POC we are only using simple tables for now. Each table sends its messages to its respective topic in Kafka (we are receiving them). Apart from these messages, we also need to capture transaction details (DML records: insert/update/delete). Documentation - https://www.ibm.com/support/knowledgecenter/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/tasks

Kafka JDBC sink with delete=true option: do I have to use record_key?

醉酒当歌 posted on 2021-02-11 15:02:34
Question: I'd like to read from multiple topics produced by Debezium CDC from a source Postgres database, using the key of the Kafka message, which holds the primary keys. Then the connector performs ETL operations in the source database. When I set delete.enabled to true I cannot use the Kafka primary keys; it says I have to specify record_key and pk_fields. My idea is: set a regex to read the multiple desired topics, get the table name from the topic name, and use the primary keys held by the Kafka topic which is currently being read. name
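
Not quoted from the question, but for reference a sketch of the JDBC sink settings being discussed; the connector name, connection URL, and topic/regex values are placeholder assumptions. With delete.enabled=true the connector requires pk.mode=record_key (pk.fields can usually be left unset so all fields of the record key are used, though that is worth verifying), and a RegexRouter transform is one way to derive the table name from the Debezium topic name:

    # sketch of a JDBC sink configuration (properties format); all values are placeholders
    name=jdbc-sink-demo
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    # read several Debezium topics at once
    topics.regex=dbserver1.public.*
    connection.url=jdbc:postgresql://target-host:5432/targetdb
    insert.mode=upsert
    # delete.enabled=true requires pk.mode=record_key
    delete.enabled=true
    pk.mode=record_key
    # derive the table name from the topic name by stripping the Debezium prefix
    transforms=route
    transforms.route.type=org.apache.kafka.connect.transforms.RegexRouter
    transforms.route.regex=dbserver1.public.(.*)
    transforms.route.replacement=$1
    table.name.format=${topic}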