Question
I have an Apache Flink application that reads from a single Kafka topic. I would like to update the application from time to time without experiencing downtime. For now, the Flink application executes some simple operators such as map, plus some synchronous I/O to external systems via HTTP REST APIs.
I have tried to use the stop command, but I get "Job termination (STOP) failed: This job is not stoppable." I understand that the Kafka connector does not support the stop behavior. A simple solution would be to cancel with a savepoint and redeploy the new jar from that savepoint, but then we get downtime. Another solution would be to control the deployment from the outside, for example by switching to a new topic.
What would be a good practice?
Answer 1:
If you don't need exactly-once output (i.e., you can tolerate some duplicates), you can take a savepoint without cancelling the running job. Once the savepoint has completed, you start a second job from it. The second job could write to a different topic, but it doesn't have to. Once the second job is up, you can cancel the first job. While both jobs run, they process the same records, which is where the duplicates come from.
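The sequence described above can be sketched as a small Python helper that only builds the Flink CLI invocations (`flink savepoint`, `flink run -s`, `flink cancel`) in the order they would be issued. This is a rough sketch, not a tested deployment script: the job ID, savepoint directory, savepoint path, and jar name are hypothetical placeholders, and the helper function names are made up for illustration. It also assumes the `flink` binary is on the PATH and that the new jar's state is compatible with the savepoint.

```python
import shlex
from typing import List


def savepoint_command(job_id: str, target_dir: str) -> List[str]:
    """Step 1: trigger a savepoint while the old job keeps running."""
    return ["flink", "savepoint", job_id, target_dir]


def handoff_commands(savepoint_path: str, new_jar: str, old_job_id: str) -> List[List[str]]:
    """Steps 2 and 3: start the new job from the savepoint, then cancel the old one.

    savepoint_path is the path reported by the savepoint command in step 1.
    """
    return [
        # -d submits in detached mode, -s restores state from the savepoint.
        ["flink", "run", "-d", "-s", savepoint_path, new_jar],
        # Cancel the old job only after the new one is confirmed to be running.
        ["flink", "cancel", old_job_id],
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    old_job = "<old-job-id>"
    print(shlex.join(savepoint_command(old_job, "hdfs:///flink/savepoints")))
    for cmd in handoff_commands("hdfs:///flink/savepoints/<savepoint-dir>", "my-app-v2.jar", old_job):
        print(shlex.join(cmd))
```

In practice, the gap between the second and third command is the overlap window: check that the new job has reached the running state (for example in the Flink web UI) before cancelling the old one.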
Source: https://stackoverflow.com/questions/56476638/how-to-deploy-a-new-job-without-downtime