Benefits of Dataflow over Cloud Functions when moving data?

Submitted by 拈花ヽ惹草 on 2019-12-19 08:08:08

Question


I'm relatively new to GCP and just starting to set up and evaluate my organization's architecture on GCP.

Scenario:
Data will flow into a Pub/Sub topic (high frequency, low amount of data). The goal is to move that data into Bigtable. From my understanding, you can do that either by having a Cloud Function trigger on the topic or with Dataflow.

I have previous experience with Cloud Functions, which I am satisfied with, so that would be my pick.

I fail to see the benefit of choosing one over the other. So my question is: when should I choose which of these products?

Thanks


Answer 1:


Both solutions could work. Dataflow will scale better if your Pub/Sub traffic grows to large amounts of data, but Cloud Functions should work fine for low amounts of data. I would look at this page (especially the rate-limit section) to make sure that your workload fits within the Cloud Functions limits: https://cloud.google.com/functions/quotas

Another thing to consider is that Dataflow can guarantee exactly-once processing of your data, so that no duplicates end up in Bigtable. Cloud Functions will not do this for you out of the box. If you go with a functions approach, make sure that the content of each Pub/Sub message deterministically decides which Bigtable cell is written to; that way, if the function is retried several times, the same data will simply overwrite the same Bigtable cell.
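As a rough illustration of that idea, here is a minimal Python sketch of an idempotent write. It is not the Bigtable client API: a plain dict stands in for the table, and the payload fields (device_id, event_time_ms, value), the column name metrics:value, and all function names are hypothetical. The one thing it relies on from the platform is that a Pub/Sub-triggered background function receives the message body base64-encoded under event["data"]. The point is only that row key, column, and cell timestamp are all derived from the message itself, never from the time the function happens to run.

```python
import base64
import json

# In-memory stand-in for a Bigtable table (assumption, not the real client):
# (row_key, column, timestamp_micros) -> value
fake_table = {}


def cell_for_message(event):
    """Derive a deterministic cell address and value from a Pub/Sub-style event."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    # Hypothetical payload fields; the key depends only on message content.
    row_key = "{}#{}".format(payload["device_id"], payload["event_time_ms"])
    # Use the event time as the cell timestamp, not the processing time,
    # so a retry addresses exactly the same cell version.
    timestamp_micros = payload["event_time_ms"] * 1000
    return row_key, "metrics:value", timestamp_micros, str(payload["value"])


def handle_message(event, table=fake_table):
    """Write the cell; a redelivered message overwrites instead of duplicating."""
    row_key, column, timestamp_micros, value = cell_for_message(event)
    table[(row_key, column, timestamp_micros)] = value


def make_event(payload):
    """Helper for the demo: wrap a dict the way Pub/Sub delivers message data."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"data": data}


if __name__ == "__main__":
    event = make_event(
        {"device_id": "sensor-1", "event_time_ms": 1530000000000, "value": 21.5}
    )
    handle_message(event)
    handle_message(event)  # simulated retry of the same message
    print(len(fake_table))
```

If the row key or timestamp came from the wall clock at processing time instead, each retry would land in a new cell, which is exactly the duplication the answer warns about.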




Answer 2:


Your needs sound relatively straightforward, and Dataflow may be overkill for what you're trying to do. If Cloud Functions do what you need, then maybe stick with that. I often find that simplicity is key when it comes to maintainability.

However, when you need to perform transformations, such as merging these events by user before storing them in Bigtable, that's where Dataflow really shines:

https://beam.apache.org/documentation/programming-guide/#groupbykey
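To make "merging events by user" concrete, here is a plain-Python sketch of what a group-by-key step does conceptually. It is not Beam code and does not use the Beam API: in Beam, GroupByKey works on a collection of key/value pairs, and for an unbounded source such as Pub/Sub it also needs windowing or triggers, which this sketch ignores. The event fields (user_id, amount) and the function names are hypothetical.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect all values that share a key: [(k, v), ...] -> {k: [v, ...]}."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def merge_events_by_user(events):
    """Merge a batch of events into one summary record per user."""
    # Step 1: key each event by user (hypothetical field name).
    pairs = [(event["user_id"], event) for event in events]
    # Step 2: group, then reduce each group to a single record.
    merged = {}
    for user_id, user_events in group_by_key(pairs).items():
        merged[user_id] = {
            "event_count": len(user_events),
            "total_amount": sum(e["amount"] for e in user_events),
        }
    return merged


if __name__ == "__main__":
    batch = [
        {"user_id": "alice", "amount": 3},
        {"user_id": "bob", "amount": 5},
        {"user_id": "alice", "amount": 4},
    ]
    print(merge_events_by_user(batch))
```

A Cloud Function sees one message per invocation, so it has no natural place to hold such a batch; doing this kind of merge there would mean managing external state yourself, which is the work Dataflow takes over.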



Source: https://stackoverflow.com/questions/51197653/benefits-with-dataflow-over-cloud-functions-when-moving-data
