How to deploy a new job without downtime

Question

I have an Apache Flink application that reads from a single Kafka topic. I would like to update the application from time to time without experiencing downtime. For now, the Flink application runs some simple operators such as map, plus some synchronous I/O to external systems via HTTP REST APIs.

I have tried to use the stop command, but I get "Job termination (STOP) failed: This job is not stoppable." I understand that the Kafka connector does not support the stop behavior. A simple solution would be to cancel with a savepoint and redeploy the new jar from that savepoint, but then we get downtime. Another solution would be to control the deployment from the outside, for example by switching to a new topic.

What would be a good practice?

Answer 1:

If you don't need exactly-once output (i.e., you can tolerate some duplicates), you can take a savepoint without cancelling the running job. Once the savepoint has completed, you start a second job from it. The second job could write to a different topic, but it doesn't have to. When the second job is up, you can cancel the first job.
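
The handover described above can be sketched as an ordered list of Flink CLI invocations. This is only a rough illustration: the helper name `handover_commands`, the job ID, the savepoint locations, and the jar name are all hypothetical placeholders, and it assumes the `flink` CLI is on the PATH. The sketch only builds and prints the commands; in practice the savepoint path passed to step 2 is whatever the savepoint command reports when it finishes.

```python
import shlex


def handover_commands(old_job_id, savepoint_dir, savepoint_path, new_jar):
    """Return the Flink CLI commands, in order, for a savepoint-based handover.

    All arguments are placeholders supplied by the caller; nothing is executed.
    """
    return [
        # 1. Trigger a savepoint. The old job keeps running while it is taken.
        ["flink", "savepoint", old_job_id, savepoint_dir],
        # 2. Start the new jar from the completed savepoint, in detached mode.
        ["flink", "run", "-d", "-s", savepoint_path, new_jar],
        # 3. Only after the new job is up, cancel the old job.
        ["flink", "cancel", old_job_id],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    plan = handover_commands(
        old_job_id="<old-job-id>",
        savepoint_dir="/tmp/savepoints",
        savepoint_path="/tmp/savepoints/<savepoint-name>",
        new_jar="new-job.jar",
    )
    for command in plan:
        print(shlex.join(command))
```

Between steps 2 and 3 both jobs consume the same topic, so the same records may be processed twice; that overlap is the source of the duplicates mentioned above.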

Source: https://stackoverflow.com/questions/56476638/how-to-deploy-a-new-job-without-downtime

Tags

apache-flink
