Spark Structured Streaming Multiple WriteStreams to Same Sink

Submitted by 岁酱吖の on 2020-12-29 13:55:40

Question


Two writeStream calls to the same database sink do not execute in sequence in Spark Structured Streaming 2.2.1. Please suggest how to make them execute in sequence.

val deleteSink = ds1.writeStream
  .outputMode("update")
  .foreach(mydbsink)
  .start()

val UpsertSink = ds2.writeStream
  .outputMode("update")
  .foreach(mydbsink)
  .start()

deleteSink.awaitTermination()
UpsertSink.awaitTermination()

Using the above code, deleteSink is executed after UpsertSink.


Answer 1:


If you want to have two streams running in parallel, you have to use

sparkSession.streams.awaitAnyTermination()

instead of

deleteSink.awaitTermination()
UpsertSink.awaitTermination()

In your case, UpsertSink will never start unless deleteSink is stopped or an exception is thrown, as the scaladoc for awaitTermination says:

Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown. If the query has terminated, then all subsequent calls to this method will either return immediately (if the query was terminated by stop()), or throw the exception immediately (if the query has terminated with exception).
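Putting the answer together, here is a minimal self-contained sketch of the parallel pattern. The rate source, the local SparkSession, and the trivial ForeachWriter are stand-ins of my own (the question's mydbsink, ds1, and ds2 are not shown in full); a real mydbsink would open a database connection in open() and write in process().

```scala
import org.apache.spark.sql.{ForeachWriter, Row, SparkSession}

object TwoSinksDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("two-sinks-demo")
      .master("local[2]")
      .getOrCreate()

    // Hypothetical stand-in for mydbsink: prints each row instead of
    // writing to a database.
    val mydbsink = new ForeachWriter[Row] {
      def open(partitionId: Long, epochId: Long): Boolean = true
      def process(row: Row): Unit = println(row)
      def close(errorOrNull: Throwable): Unit = ()
    }

    // Two independent streaming sources standing in for ds1 / ds2.
    val ds1 = spark.readStream.format("rate").option("rowsPerSecond", 1).load()
    val ds2 = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

    ds1.writeStream.outputMode("update").foreach(mydbsink).start()
    ds2.writeStream.outputMode("update").foreach(mydbsink).start()

    // Returns as soon as ANY query on this session terminates, so the
    // driver is not blocked waiting on the first query alone.
    spark.streams.awaitAnyTermination()
  }
}
```

Unlike chaining deleteSink.awaitTermination() followed by UpsertSink.awaitTermination(), this single call watches both queries at once and surfaces whichever exception or stop happens first.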



Source: https://stackoverflow.com/questions/50791975/spark-structured-streaming-multiple-writestreams-to-same-sink
