apache-kafka

How to increase the number of messages consumed by Spring Kafka Consumer in each batch?

Submitted by 衆ロ難τιáo~ on 2021-02-18 17:42:13
Question: I am building a Kafka consumer application that consumes messages from a Kafka topic and performs a database update task. The messages are produced in one large batch once per day, so the topic receives about 1 million messages within 10 minutes. The topic has 8 partitions. The Spring Kafka consumer (annotated with @KafkaListener and using a ConcurrentKafkaListenerContainerFactory) is triggered in very short batches; the batch size is sometimes just 1 or 2 messages. It would help performance …
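A common way to make each poll return larger batches is to raise `max.poll.records` and `fetch.min.bytes` on the consumer and enable batch listening on the container factory. The sketch below is illustrative, not the asker's actual configuration: it assumes spring-kafka and kafka-clients on the classpath, and the bean names, bootstrap address, and tuning values are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class BatchConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // Upper bound on the number of records a single poll() may return
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);
        // Broker holds the fetch until this many bytes accumulate (or fetch.max.wait.ms elapses)
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024 * 1024);
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        // Deliver the whole poll result to the @KafkaListener as a List in one call
        factory.setBatchListener(true);
        return factory;
    }
}
```

A matching listener method would then take `List<String> records`; whether larger batches actually help depends on how the downstream database writes are grouped.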

After kafka crashed, the offsets are lost

Submitted by 拥有回忆 on 2021-02-18 16:56:17
Question: Our Kafka system crashed because no disk space was available. The consumers are Spring Boot applications that use the Kafka Streams API. Now every consumer application shows the following error: java.io.FileNotFoundException: /tmp/kafka-streams/908a79bc-92e7-4f9c-a63a-5030cf4d3555/streams.device-identification-parser/0_48/.checkpoint.tmp (No such file or directory). This exception occurred exactly after the Kafka server was restarted. If we restart the application, the service starts at …
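One contributing factor here is that Kafka Streams defaults its state directory to `/tmp`, which the operating system may clean out. A minimal sketch of pointing the state store at a durable path instead; the directory and bootstrap address are illustrative, the application id is taken from the error path above, and kafka-streams is assumed on the classpath:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsStateDirConfig {
    public static Properties streamsProperties() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams.device-identification-parser");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Keep RocksDB state and .checkpoint files out of /tmp so they survive tmp cleanup
        props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");
        return props;
    }
}
```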

Kafka Streams: Punctuate vs Process

Submitted by 懵懂的女人 on 2021-02-18 06:56:19
Question: Within a single task in the stream app, do the following two methods run independently (that is, while process is handling an incoming message from the upstream source, can punctuate also run in parallel on the specified schedule with WALL_CLOCK_TIME as the PunctuationType)? Or do they share the same thread, so only one runs at a given time? If so, would punctuate never get invoked if process keeps continuously getting messages from …
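In the Processor API, process() and any registered punctuators run on the same stream thread, so within a task they never execute concurrently; a punctuation fires between processing cycles once its interval has elapsed. A minimal sketch of registering a wall-clock punctuator (the class is illustrative and assumes kafka-streams on the classpath):

```java
import java.time.Duration;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

public class CountingProcessor implements Processor<String, String, String, String> {
    private long count = 0;

    @Override
    public void init(ProcessorContext<String, String> context) {
        // The punctuator runs on the same stream thread as process(), never in parallel with it
        context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME,
                timestamp -> System.out.println("records seen so far: " + count));
    }

    @Override
    public void process(Record<String, String> record) {
        count++;
    }
}
```

Because both share the thread, a long-running process() call delays the punctuation rather than being interrupted by it.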

How Logstash is different than Kafka

Submitted by 馋奶兔 on 2021-02-18 04:42:30
Question: How is Logstash different from Kafka? And if both are the same, which is better, and how? I found that both are pipelines into which one can push data for further processing. Answer 1: Kafka is much more powerful than Logstash. For syncing data from a source such as PostgreSQL to Elasticsearch, Kafka connectors can do similar work to Logstash. One key difference is: Kafka is a cluster, while Logstash is basically a single instance. You could run multiple Logstash instances, but these Logstash instances are …

Kafka: Delete idle consumer group id

Submitted by 天涯浪子 on 2021-02-17 04:48:34
Question: In some cases, I use Kafka Streams to model a small in-memory (hashmap) projection of a topic. The K,V cache does require some manipulation, so it is not a good case for a GlobalKTable. In such a "caching" scenario, I want all my sibling instances to have the same cache, so I need to bypass the consumer-group mechanism. To enable this, I normally simply start my apps with a randomly generated application id, so each app will reload the topic each time it restarts. The only caveat to that is …
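Randomly generated application ids leave behind idle consumer groups on the broker. Once a group has no active members, it can be deleted programmatically with the AdminClient; the sketch below uses an illustrative group id and bootstrap address and assumes kafka-clients on the classpath.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class DeleteIdleGroups {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Deletion only succeeds when the group has no live members;
            // otherwise the future completes with a GroupNotEmptyException
            admin.deleteConsumerGroups(Collections.singletonList("my-app-1f3a9c"))
                 .all().get();
        }
    }
}
```

The same operation is available from the command line via `kafka-consumer-groups.sh --delete --group <id>`.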

Send two Serialized Java objects under one Kafka Topic

Submitted by 浪子不回头ぞ on 2021-02-16 15:26:04
Question: I want to implement a Kafka consumer and producer which send and receive Java objects. I tried this producer:

```java
@Configuration
public class KafkaProducerConfig {

    @Value(value = "${kafka.bootstrapAddress}")
    private String bootstrapAddress;

    @Bean
    public ProducerFactory<String, SaleRequestFactory> saleRequestFactoryProducerFactory() {
        Map<String, Object> configProps = new HashMap<>();
        configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        configProps.put(…
```
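One common way to carry two object types on a single topic is one ProducerFactory<String, Object> with Spring's JsonSerializer, which writes a type header the consuming side can use to pick the target class. This is a sketch, not the asker's code: the SaleRequestFactory name comes from the question, everything else (class name, topic, addresses) is illustrative, and spring-kafka is assumed on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.support.serializer.JsonSerializer;

public class TwoTypesOneTopic {

    public static KafkaTemplate<String, Object> template(String bootstrapAddress) {
        Map<String, Object> configProps = new HashMap<>();
        configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // JsonSerializer adds a __TypeId__ header naming the concrete class,
        // so both request and response objects can share one topic
        configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        ProducerFactory<String, Object> pf = new DefaultKafkaProducerFactory<>(configProps);
        return new KafkaTemplate<>(pf);
    }
}
// Usage: template.send("sale-topic", saleRequest);
//        template.send("sale-topic", saleResponse);
```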

How to get Kafka messages based on timestamp

Submitted by 爷,独闯天下 on 2021-02-15 07:42:15
Question: I am working on an application in which I am using Kafka, and the tech is Scala. My Kafka consumer code is as follows:

```scala
val props = new Properties()
props.put("group.id", "test")
props.put("bootstrap.servers", "localhost:9092")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("auto.offset.reset", "earliest")
props.put("group.id", "consumer-group")
val …
```
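Timestamp-based reads go through the consumer's offsetsForTimes call, which maps each partition to the earliest offset whose record timestamp is at or after the given time; the consumer can then seek there and poll. A Java sketch (topic name, partition, and the 24-hour target are illustrative; assumes kafka-clients on the classpath):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

public class SeekByTimestamp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "consumer-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));

            // Ask the broker for the first offset at or after this wall-clock time
            long target = Instant.now().minus(Duration.ofHours(24)).toEpochMilli();
            Map<TopicPartition, Long> query = new HashMap<>();
            query.put(tp, target);

            Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(query);
            OffsetAndTimestamp oat = offsets.get(tp);
            if (oat != null) {
                consumer.seek(tp, oat.offset()); // null means no record at/after the target time
            }
        }
    }
}
```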

Does Kafka distinguish between consumed offset and committed offset?

Submitted by 走远了吗. on 2021-02-15 07:06:22
Question: From what I understand, a consumer reads messages off a particular topic, and the consumer client periodically commits the offset. So if for some reason the consumer fails on a particular message, that offset won't be committed and you can then go back and reprocess the message. Is there anything that tracks the offset you just consumed and the offset you then commit? Answer 1: Does Kafka distinguish between consumed offset and committed offset? Yes, there is a big difference. The consumed offset is …
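Both offsets are visible on the consumer itself: position() reports the consumed position (the next offset poll() will return), while committed() reports the last offset stored for the group, which only moves when a commit happens. A sketch with illustrative topic and group names, assuming kafka-clients on the classpath:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ConsumedVsCommitted {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "offset-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));
            consumer.poll(Duration.ofMillis(500));

            // Consumed position: advances as poll() hands records to the application
            long consumed = consumer.position(tp);
            // Committed offset: advances only on commitSync()/commitAsync() or auto-commit
            Map<TopicPartition, OffsetAndMetadata> committed =
                    consumer.committed(Collections.singleton(tp));
            System.out.println("consumed=" + consumed + " committed=" + committed.get(tp));
        }
    }
}
```

On failure, the group restarts from the committed offset, so everything between committed and consumed is redelivered; that gap is exactly the at-least-once reprocessing window the question describes.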