Exactly-once semantics in Dataflow stateful processing

Submitted by 房东的猫 on 2021-02-08 11:43:22

Question


We are trying to cover the following scenario in a streaming setting:

  • calculate an aggregate (let’s say a count) of user events since the start of the job
  • The number of user events is unbounded (hence only using local state is not an option)

I'll discuss three options we are considering: the first two are prone to data loss, and the guarantees of the final one are unclear. We'd like more insight into that final one. Alternative approaches are of course welcome too.

Thanks!


Approach 1: Session windows, datastore and Idempotency

  1. Sliding windows of x seconds
  2. Group by userid
  3. Update the datastore

Update datastore would mean:

  1. Start transaction
  2. Datastore read for this user
  3. Merge in new info
  4. Datastore write
  5. End transaction

The datastore entry contains an idempotency id equal to the sliding-window timestamp.
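The transactional update above can be sketched in plain Python (a dict stands in for the datastore, and the function body stands in for a single transaction; all names here are hypothetical illustrations, not the actual implementation):

```python
# Minimal sketch of Approach 1's update-datastore transaction.
# The idempotency id is the sliding-window timestamp.

datastore = {}  # user_id -> {"count": ..., "idempotency_id": ...}

def update_datastore(user_id, window_ts, window_count):
    # Steps "Start trx" .. "End trx" collapse into this function here.
    entry = datastore.get(user_id, {"count": 0, "idempotency_id": None})
    if entry["idempotency_id"] == window_ts:
        return  # this window was already merged; skip (idempotency)
    entry["count"] += window_count        # merge in new info
    entry["idempotency_id"] = window_ts   # remember the last window merged
    datastore[user_id] = entry            # datastore write

update_datastore("alice", 1000, 3)
update_datastore("alice", 1000, 3)  # re-delivery of the same window: no-op
update_datastore("alice", 2000, 2)  # count is now 5
```

Note how this already hints at the problem below: the entry only remembers the *last* window merged, so if window 2000 merges first and window 1000 is then redelivered, the equality check does not recognize it as a duplicate.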

Problem:

Windows can fire concurrently and can hence be processed out of order, leading to data loss (confirmed by Google).

Approach 2: Session windows, datastore and state

  1. Sliding windows of x seconds
  2. Group by userid
  3. Update the datastore

Update datastore would mean:

  1. Pre-check: if the state for this key-window is already true, skip the following steps
  2. Start transaction
  3. Datastore read for this user
  4. Merge in new info
  5. Datastore write
  6. End transaction
  7. Record in state that this key-window has been processed (true)

Re-execution will hence skip duplicate updates.
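The crash window in the steps above can be made concrete with a sketch (again, dict/set stand-ins and hypothetical names, not the real Beam state API):

```python
# Sketch of Approach 2: a per key-window "processed" flag guards the
# transactional update, but the flag is written *after* the transaction
# commits, leaving a gap where a crash causes double counting.

datastore = {}     # user_id -> running count
processed = set()  # (user_id, window_ts) flags (stand-in for Beam state)

def update_with_state(user_id, window_ts, count, crash_before_state=False):
    key = (user_id, window_ts)
    if key in processed:       # 1. pre-check: skip if already done
        return
    # 2-6. transaction: read, merge, write, commit
    datastore[user_id] = datastore.get(user_id, 0) + count
    if crash_before_state:     # failure between the commit and step 7
        raise RuntimeError("crashed after commit, before state write")
    processed.add(key)         # 7. mark this key-window as processed

update_with_state("alice", 1000, 3)
try:
    update_with_state("alice", 2000, 2, crash_before_state=True)
except RuntimeError:
    pass
update_with_state("alice", 2000, 2)  # re-execution: 3 + 2 + 2 = 7, not 5
```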

Problem:

A failure between steps 5 and 7 will leave the local state unwritten, causing re-execution and potentially counting elements twice. We could circumvent this by using multiple states, but then we could still drop data.

Approach 3: Global window, timers and state

Based on the article Timely (and Stateful) Processing with Apache Beam, we would create:

  1. A global window
  2. Group by userid
  3. Buffer/count all incoming events in a stateful DoFn
  4. Flush x time after the first event.

A flush would mean the same datastore update as in Approach 1.
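The buffer-and-flush logic above can be sketched in plain Python. In actual Beam this would be a stateful DoFn with a bag state and a timer set x seconds after the first element; here a dict and an explicit flush call stand in, and all names are hypothetical:

```python
# Sketch of Approach 3: buffer incoming events per user, set a flush
# "timer" on the first event, and emit the aggregate when it fires.

FLUSH_DELAY = 60  # flush x seconds after the first buffered event

buffers = {}  # user_id -> {"events": [...], "timer": flush_timestamp}

def process_element(user_id, event, event_ts):
    buf = buffers.setdefault(user_id, {"events": [], "timer": None})
    if buf["timer"] is None:
        buf["timer"] = event_ts + FLUSH_DELAY  # set timer on first event
    buf["events"].append(event)

def on_timer(user_id):
    # Flush: the datastore update here would be Approach 1's transaction.
    buf = buffers.pop(user_id)
    return len(buf["events"])  # the aggregate (a count) to merge

process_element("alice", "click", 10)
process_element("alice", "view", 40)
count = on_timer("alice")  # flushes a count of 2 and clears the buffer
```

The open question below is precisely whether the `buffers` equivalent (Beam state) and the emitted flush are kept consistent when a bundle is retried.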

Problem:

The exactly-once guarantees for processing and state are unclear. What happens if an element is written to state and the bundle is then re-executed? Is state restored to its value from before that bundle?

Any links to documentation in this regard would be very much appreciated. E.g. how does fault-tolerance work with timers?


Answer 1:


From your Approaches 1 and 2 it is unclear whether the concern is out-of-order merging or loss of data. I can think of the following.

Approach 1: Don't merge the session-window aggregates immediately, because of the out-of-order problem. Instead, store them separately and, after a sufficient amount of time, merge the intermediate results in timestamp order.
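That suggestion could look roughly like this (a hedged sketch with hypothetical names; a dict keyed by window timestamp stands in for the separately stored intermediates):

```python
# Store each window's aggregate under its timestamp instead of merging
# immediately; later, merge the intermediates in timestamp order, which
# makes arrival order (and redelivery) irrelevant.

pending = {}  # user_id -> {window_ts: count}

def store_intermediate(user_id, window_ts, count):
    # Overwriting the same key is idempotent against redelivery.
    pending.setdefault(user_id, {})[window_ts] = count

def merge_in_order(user_id):
    windows = pending.pop(user_id, {})
    total = 0
    for ts in sorted(windows):  # timestamp order, not arrival order
        total += windows[ts]
    return total

store_intermediate("alice", 2000, 2)  # windows may arrive out of order
store_intermediate("alice", 1000, 3)
store_intermediate("alice", 1000, 3)  # duplicate delivery is harmless
total = merge_in_order("alice")       # merges 1000 then 2000: total 5
```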

Approach 2: Move the state into the transaction. This way, any temporary failure will prevent the transaction from completing and merging the data, and subsequent successful processing of the session-window aggregates will not result in double counting.
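In sketch form (hypothetical names; the paired dict mutations below stand in for a single datastore transaction, which is what makes the marker and the merge atomic in the real system):

```python
# Keep the "processed" marker inside the same transaction as the merge,
# so a crash rolls back both, and a retry can never double-count.

datastore = {"counts": {}, "processed": set()}

def transactional_update(user_id, window_ts, window_count):
    key = (user_id, window_ts)
    if key in datastore["processed"]:
        return  # already merged by a committed transaction
    # In the real datastore both writes commit (or fail) together.
    datastore["counts"][user_id] = (
        datastore["counts"].get(user_id, 0) + window_count
    )
    datastore["processed"].add(key)

transactional_update("alice", 1000, 3)
transactional_update("alice", 1000, 3)  # retry after failure: no-op
```

This removes Approach 2's gap between steps 6 and 7, since there is no longer a separate state write that can be lost after the commit.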



Source: https://stackoverflow.com/questions/58401303/exactly-once-semantics-in-dataflow-stateful-processing
