Read Kafka topic in a Spark batch job

梦谈多话 2020-12-18 01:48

I'm writing a Spark (v1.6.0) batch job which reads from a Kafka topic.
For this I can use org.apache.spark.streaming.kafka.KafkaUtils#createRDD however, I

1 Answer
  • 2020-12-18 02:22

createRDD is the right approach for reading a batch from Kafka.

To query for info about the latest / earliest available offsets, look at the KafkaCluster.scala methods getLatestLeaderOffsets and getEarliestLeaderOffsets. That class used to be private, but should be public in the latest versions of Spark.
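    A minimal sketch of the batch read with createRDD, assuming a hypothetical broker address, topic name, and hard-coded offsets (in practice you would fetch the from/until offsets via the KafkaCluster methods above):

    ```scala
    import kafka.serializer.StringDecoder
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

    object KafkaBatchRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("kafka-batch-read"))

        // "broker1:9092" and "my-topic" are placeholders for illustration only.
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

        // One OffsetRange per topic-partition. The offsets here are hard-coded;
        // a real job would obtain them from getEarliestLeaderOffsets /
        // getLatestLeaderOffsets so the batch covers exactly the unread range.
        val offsetRanges = Array(
          OffsetRange("my-topic", 0, fromOffset = 0L, untilOffset = 100L)
        )

        // createRDD returns an RDD[(key, value)] covering the given ranges.
        val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
          sc, kafkaParams, offsetRanges)

        rdd.map(_._2).take(10).foreach(println)
        sc.stop()
      }
    }
    ```

    Because the offset ranges are fixed up front, the resulting RDD is deterministic: re-running the job re-reads exactly the same records, which is what makes this suitable for batch processing.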
