How to get the last two successful execution dates of an Airflow job?

余生分开走 2020-12-21 11:31

I need to get the last two successful execution dates of an Airflow job to use in my current run. Example:

Execution date    Job status
2020-05-03        success
2020-05-04        fail

1 answer
  • 2020-12-21 11:58

    You can use SQLAlchemy to retrieve the execution_dates of the last n successful runs:

    from datetime import datetime
    from typing import List, Optional

    from sqlalchemy.orm import Session

    from airflow.models.taskinstance import TaskInstance
    # Airflow 1.10 import path; in Airflow 2.x provide_session lives in airflow.utils.session
    from airflow.utils.db import provide_session
    from airflow.utils.state import State


    @provide_session
    def last_n_execution_dates(dag_id: str,
                               task_id: str,
                               n: int,
                               session: Optional[Session] = None) -> List[datetime]:
        """Return the execution dates of the last n successful runs of a task, newest first."""
        # provide_session injects a session when the caller does not pass one
        task_instances: List[TaskInstance] = (session
                                              .query(TaskInstance)
                                              .filter(TaskInstance.dag_id == dag_id,
                                                      TaskInstance.task_id == task_id,
                                                      TaskInstance.state == State.SUCCESS)
                                              .order_by(TaskInstance.execution_date.desc())
                                              .limit(n)
                                              .all())
        # execution_date values are pendulum datetimes, which subclass datetime
        return [ti.execution_date for ti in task_instances]
    

    Note that this snippet is for reference only and is untested.

    I based it on the tree() method in Airflow's views.py.


    Alternatively, you can run this SQL query against Airflow's metadata database to retrieve the execution dates of the last n successful runs:

    SELECT execution_date
    FROM task_instance
    WHERE dag_id = 'my_dag_id'
      AND task_id = 'my_task_id'
      AND state = 'success'
    ORDER BY execution_date DESC
    LIMIT n
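
    To see how the query treats the data from the question, here is a minimal sketch that runs it against an in-memory SQLite table standing in for the metadata database. The table has only the four columns the query touches, the helper name last_n_successful_dates is hypothetical, and the 2020-05-02 row is made up so that two successful runs exist. A real Airflow deployment usually runs on PostgreSQL or MySQL and stores full timestamps, so treat this as an illustration of the filter, ordering, and limit only.

```python
import sqlite3
from typing import List


def last_n_successful_dates(conn: sqlite3.Connection,
                            dag_id: str,
                            task_id: str,
                            n: int) -> List[str]:
    """Return the execution dates of the last n successful runs, newest first."""
    rows = conn.execute(
        """
        SELECT execution_date
        FROM task_instance
        WHERE dag_id = ?
          AND task_id = ?
          AND state = 'success'
        ORDER BY execution_date DESC
        LIMIT ?
        """,
        (dag_id, task_id, n),
    ).fetchall()
    return [row[0] for row in rows]


# Stand-in for the metadata DB: only the columns the query uses (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance "
    "(dag_id TEXT, task_id TEXT, execution_date TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("my_dag_id", "my_task_id", "2020-05-02", "success"),  # hypothetical extra row
        ("my_dag_id", "my_task_id", "2020-05-03", "success"),
        ("my_dag_id", "my_task_id", "2020-05-04", "failed"),
    ],
)

print(last_n_successful_dates(conn, "my_dag_id", "my_task_id", 2))
# → ['2020-05-03', '2020-05-02']
```

    The failed run on 2020-05-04 is skipped, so the two most recent successful dates are returned. Passing dag_id, task_id, and n as bound parameters instead of formatting them into the SQL string avoids quoting problems.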
    