Flink with Ceph as the persistent storage

Submitted by 我的未来我决定 on 2019-12-10 17:52:17

Question


The Flink documentation suggests that Ceph can be used as persistent storage for state: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/checkpointing.html

Considering that Ceph is a transactional database, wouldn't it have an adverse effect on Flink's performance?


Answer 1:


Ceph describes itself as a "unified, distributed storage system" and provides a network file system API. As such, it should work seamlessly with Flink's state backends that persist checkpoints to a remote file system.
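To make the "remote file system" point concrete: if the Ceph file system is mounted at the same path on every Flink node, the checkpoint location is just a `file://` URI under that mount. The sketch below is only a pre-flight helper, not part of Flink. The function name `checkpoint_uri`, the `flink-checkpoints` subdirectory, and the `/mnt/cephfs` mount point in the usage note are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def checkpoint_uri(mount_point: str, job_name: str) -> str:
    """Return a file:// URI for a per-job checkpoint directory.

    Hypothetical helper: creates <mount_point>/flink-checkpoints/<job_name>
    if it does not exist, verifies that the directory is writable by
    creating and deleting a probe file, and returns the directory as a URI.
    """
    target = Path(mount_point).resolve() / "flink-checkpoints" / job_name
    target.mkdir(parents=True, exist_ok=True)

    # Writing a probe file fails early (with OSError) if the mount is
    # read-only or missing permissions; the file is removed on close.
    with tempfile.NamedTemporaryFile(dir=target, prefix=".probe-"):
        pass

    return target.as_uri()
```

For example, `checkpoint_uri("/mnt/cephfs", "my-job")` would yield a URI ending in `/flink-checkpoints/my-job`, which is the kind of value one would hand to a file-system state backend as its checkpoint directory. Running the check on each node before starting a job catches a missing or read-only mount early.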

I'm not aware of anyone using Ceph with Flink (HDFS and S3 are more commonly used), and I have no information about its performance. Note, however, that Flink can write checkpoints asynchronously, so the performance of the storage system does not affect the processing speed of a Flink application. It might, however, constrain the interval at which checkpoints can be taken.
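The interval constraint can be estimated with simple arithmetic: a checkpoint cannot complete faster than the time needed to write the state to storage, so the checkpoint interval should comfortably exceed that time. The following is a back-of-the-envelope sketch, not a model of Flink internals. It assumes full (non-incremental) snapshots and a steady write throughput, and the function names and the `headroom` factor are made up for illustration.

```python
def estimated_checkpoint_seconds(state_bytes: int, write_bytes_per_sec: float) -> float:
    """Lower bound on the time to persist one full snapshot of the state."""
    if write_bytes_per_sec <= 0:
        raise ValueError("write throughput must be positive")
    return state_bytes / write_bytes_per_sec


def min_checkpoint_interval_seconds(
    state_bytes: int, write_bytes_per_sec: float, headroom: float = 2.0
) -> float:
    """Suggest an interval that leaves `headroom` times the write duration.

    The headroom factor is an arbitrary safety margin (an assumption),
    meant to keep one checkpoint from running into the next.
    """
    return headroom * estimated_checkpoint_seconds(state_bytes, write_bytes_per_sec)


if __name__ == "__main__":
    state = 10 * 1024**3        # 10 GiB of state (illustrative figure)
    throughput = 200 * 1024**2  # 200 MiB/s sustained writes (illustrative figure)
    print(estimated_checkpoint_seconds(state, throughput))
    print(min_checkpoint_interval_seconds(state, throughput))
```

With these illustrative figures, one snapshot takes about 51 seconds to write, so an interval of a few seconds would not be sustainable even though the stream processing itself is not slowed down.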

Update (Feb. 2018): I noticed that multiple users have reported on Flink's user mailing list that they are using Ceph with Flink.



Source: https://stackoverflow.com/questions/47652024/flink-with-ceph-as-the-persistent-storage
