问题
I need to create an Airflow job that needs to run absolutely before 9h.
I currently have a job that starts at 7h, with retries=8 with 15 minutes interval (8*15m=2h) unfortunately, my job takes more time, and due to this, the task fails after 9h that is the hard deadline.
How can I make it do retry every 15 minutes but fail if it's after 9h so a human can take a look at the issue ?
Thanks for your help
回答1:
You could use the execution_timeout argument when creating the task to control how long it'll run before timing out. So if you run your task at 7AM, and want it to end at 9AM, then set the timeout to 2 hours. Below is info from Airflow documentation
aggregate_db_message_job = BashOperator(
task_id='aggregate_db_message_job',
execution_timeout=timedelta(hours=2),
pool='ep_data_pipeline_db_msg_agg',
bash_command=aggregate_db_message_job_cmd,
dag=dag)
aggregate_db_message_job.set_upstream(wait_for_empty_queue)
来源:https://stackoverflow.com/questions/54810074/airflow-retry-up-to-a-specific-time