How do I upsert into HDFS with Spark?

一生所求 2020-12-19 05:19

I have partitioned data in HDFS. At some point I decide to update it. The algorithm is:

  • Read the new data from a Kafka topic.
  • Find out new data's
1 answer
  • 2020-12-19 05:33

    In the end I decided to delete that "green" subset of partitions from HDFS and write with SaveMode.Append instead. I think this is a bug in Spark.
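
The workaround in this answer (delete the affected partitions, then append) could look roughly like the PySpark sketch below. It is not the answerer's code. The names `partition_paths` and `replace_partitions` are hypothetical, as are the Parquet format and the plain `col=value` directory layout (no escaping of special characters, no null partition values). The HDFS delete goes through PySpark's internal `_jvm` and `_jsc` handles to reach the Hadoop `FileSystem` API, which is a common but unofficial route.

```python
from typing import Iterable, List, Mapping, Sequence


def partition_paths(base_path: str,
                    partition_cols: Sequence[str],
                    partition_values: Iterable[Mapping[str, object]]) -> List[str]:
    """Build Hive-style partition directories (col=value/...) under base_path.

    Simplification: values are formatted with str(); Spark's escaping of
    special characters and null partitions is not reproduced here.
    """
    paths = set()
    for values in partition_values:
        suffix = "/".join(f"{col}={values[col]}" for col in partition_cols)
        paths.add(f"{base_path.rstrip('/')}/{suffix}")
    return sorted(paths)


def replace_partitions(spark, merged_df, base_path, partition_cols):
    """Delete every partition that merged_df touches, then append merged_df.

    Assumption: merged_df no longer depends lazily on the files being
    deleted (for example, it was first written to a temporary location
    and read back). Otherwise the delete removes its own input.
    """
    # Distinct partition keys present in the data about to be written.
    keys = [row.asDict()
            for row in merged_df.select(*partition_cols).distinct().collect()]

    # Hadoop FileSystem access through PySpark's internal JVM gateway.
    jvm = spark._jvm
    hadoop_conf = spark._jsc.hadoopConfiguration()
    for path in partition_paths(base_path, partition_cols, keys):
        hadoop_path = jvm.org.apache.hadoop.fs.Path(path)
        fs = hadoop_path.getFileSystem(hadoop_conf)
        if fs.exists(hadoop_path):
            fs.delete(hadoop_path, True)  # True = recursive

    # Append mode leaves all other partitions untouched.
    merged_df.write.mode("append").partitionBy(*partition_cols).parquet(base_path)
```

A call might look like `replace_partitions(spark, merged_df, "/data/events", ["year", "month"])`, where the path and column names are placeholders. Between the delete and the append the affected partitions are briefly missing, so concurrent readers can see incomplete data.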
