Read Kafka topic in a Spark batch job

梦谈多话 2020-12-18 01:48

I'm writing a Spark (v1.6.0) batch job which reads from a Kafka topic.
For this I can use org.apache.spark.streaming.kafka.KafkaUtils#createRDD however, I

1 Answer
  • 2020-12-18 02:22

createRDD is the right approach for reading a batch from Kafka.

To query for info about the latest / earliest available offsets, look at the KafkaCluster.scala methods getLatestLeaderOffsets and getEarliestLeaderOffsets. That class used to be private, but should be public in the latest versions of Spark.
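    A minimal sketch of the batch read with createRDD, assuming a hypothetical broker address, topic name, and hard-coded offsets (in practice you would fetch the from/until offsets via the KafkaCluster methods above):

    ```scala
    import kafka.serializer.StringDecoder
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

    object KafkaBatchRead {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("kafka-batch-read"))

        // "broker1:9092" and "my-topic" are placeholders for illustration only.
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

        // One OffsetRange per topic-partition. The offsets here are hard-coded;
        // a real job would obtain them from getEarliestLeaderOffsets /
        // getLatestLeaderOffsets so the batch covers exactly the unread range.
        val offsetRanges = Array(
          OffsetRange("my-topic", 0, fromOffset = 0L, untilOffset = 100L)
        )

        // createRDD returns an RDD[(key, value)] covering the given ranges.
        val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
          sc, kafkaParams, offsetRanges)

        rdd.map(_._2).take(10).foreach(println)
        sc.stop()
      }
    }
    ```

    Because the offset ranges are fixed up front, the resulting RDD is deterministic: re-running the job re-reads exactly the same records, which is what makes this suitable for batch processing.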
