Airflow 1.9.0 ExternalTaskSensor retry_delay=30 yields TypeError: can't pickle _thread.RLock objects

Posted by 故事扮演 on 2021-02-10 18:13:06

Question


As the title says: in Airflow 1.9.0, if you pass retry_delay=30 (or any other number) to the ExternalTaskSensor, the DAG runs just fine until you try to clear the task instances in the Airflow GUI. Clearing then returns the following error: "TypeError: can't pickle _thread.RLock objects" (along with a nice Oops message). If you use retry_delay=timedelta(seconds=30) instead, clearing task instances works fine.

Looking through the relevant method in models.py, the deepcopy should work, so this seems like weird behavior to me. Am I missing something, or is this a bug?

Below you can find a minimal DAG example.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.sensors import ExternalTaskSensor


dag_name = 'soft_fail_example'
schedule_interval = "0 * * * *"  # hourly, on the hour
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2018, 1, 1),
    'email': [],
    'email_on_failure': False,
    'email_on_retry': False,
}

test_dag = DAG(
    dag_name,
    default_args=default_args,
    schedule_interval=schedule_interval,
    catchup=False,
    max_active_runs=1,
)

# retry_delay is deliberately an int here: this reproduces the error when
# clearing task instances. retry_delay=timedelta(seconds=30) clears fine.
ets = ExternalTaskSensor(
    task_id="test_external_task_sensor",
    dag=test_dag,
    soft_fail=False,
    timeout=10,
    retries=0,
    poke_interval=1,
    retry_delay=30,
    external_dag_id="dependent_dag_id",
    external_task_id="dependent_task_id",
)

dummy_task = DummyOperator(task_id="collection_task", dag=test_dag)

# The sensor runs upstream of the dummy task.
dummy_task << ets

Edit: As requested, here is the stack trace:

    Ooops.

                              ____/ (  (    )   )  \___
                             /( (  (  )   _    ))  )   )\
                           ((     (   )(    )  )   (   )  )
                         ((/  ( _(   )   (   _) ) (  () )  )
                        ( (  ( (_)   ((    (   )  .((_ ) .  )_
                       ( (  )    (      (  )    )   ) . ) (   )
                      (  (   (  (   ) (  _  ( _) ).  ) . ) ) ( )
                      ( (  (   ) (  )   (  ))     ) _)(   )  )  )
                     ( (  ( \ ) (    (_  ( ) ( )  )   ) )  )) ( )
                      (  (   (  (   (_ ( ) ( _    )  ) (  )  )   )
                     ( (  ( (  (  )     (_  )  ) )  _)   ) _( ( )
                      ((  (   )(    (     _    )   _) _(_ (  (_ )
                       (_((__(_(__(( ( ( |  ) ) ) )_))__))_)___)
                       ((__)        \\||lll|l||///          \_))
                                (   /(/ (  )  ) )\   )
                              (    ( ( ( | | ) ) )\   )
                               (   /(| / ( )) ) ) )) )
                             (     ( ((((_(|)_)))))     )
                              (      ||\(|(|)|/||     )
                            (        |(||(||)||||        )
                              (     //|/l|||)|\\ \     )
                            (/ / //  /|//||||\\  \ \  \ _)
    -------------------------------------------------------------------------------
    Node: jb-VirtualBox
    -------------------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/app.py", line 1988, in wsgi_app
        response = self.full_dispatch_request()
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/app.py", line 1641, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/app.py", line 1544, in handle_user_exception
        reraise(exc_type, exc_value, tb)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
        raise value
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/app.py", line 1639, in full_dispatch_request
        rv = self.dispatch_request()
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask/app.py", line 1625, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
        return self._run_view(f, *args, **kwargs)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
        return fn(self, *args, **kwargs)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/flask_login.py", line 755, in decorated_view
        return func(*args, **kwargs)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/www/utils.py", line 262, in wrapper
        return f(*args, **kwargs)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/www/utils.py", line 309, in wrapper
        return f(*args, **kwargs)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/www/views.py", line 989, in clear
        include_upstream=upstream)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/models.py", line 3527, in sub_dag
        dag = copy.deepcopy(self)
      File "/usr/lib/python3.6/copy.py", line 161, in deepcopy
        y = copier(memo)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/models.py", line 3512, in __deepcopy__
        setattr(result, k, copy.deepcopy(v, memo))
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
        y[deepcopy(key, memo)] = deepcopy(value, memo)
      File "/usr/lib/python3.6/copy.py", line 161, in deepcopy
        y = copier(memo)
      File "/home/jb/Documents/p3_cdc_data_flow/lib/python3.6/site-packages/airflow/models.py", line 2437, in __deepcopy__
        setattr(result, k, copy.deepcopy(v, memo))
      File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
        y = _reconstruct(x, memo, *rv)
      File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
        state = deepcopy(state, memo)
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
        y[deepcopy(key, memo)] = deepcopy(value, memo)
      File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
        y = _reconstruct(x, memo, *rv)
      File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
        state = deepcopy(state, memo)
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
        y[deepcopy(key, memo)] = deepcopy(value, memo)
      File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
        y = _reconstruct(x, memo, *rv)
      File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
        state = deepcopy(state, memo)
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
        y[deepcopy(key, memo)] = deepcopy(value, memo)
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 215, in _deepcopy_list
        append(deepcopy(a, memo))
      File "/usr/lib/python3.6/copy.py", line 180, in deepcopy
        y = _reconstruct(x, memo, *rv)
      File "/usr/lib/python3.6/copy.py", line 280, in _reconstruct
        state = deepcopy(state, memo)
      File "/usr/lib/python3.6/copy.py", line 150, in deepcopy
        y = copier(x, memo)
      File "/usr/lib/python3.6/copy.py", line 240, in _deepcopy_dict
        y[deepcopy(key, memo)] = deepcopy(value, memo)
      File "/usr/lib/python3.6/copy.py", line 169, in deepcopy
        rv = reductor(4)
    TypeError: can't pickle _thread.RLock objects
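
The traceback ends inside copy.deepcopy, which fails as soon as it reaches an RLock somewhere in the object graph being copied. The sketch below reproduces that failure mode outside Airflow. HoldsLock and try_deepcopy are hypothetical names made up for illustration, and the sketch makes no claim about which attribute of the DAG or operator holds the lock in Airflow 1.9.0.

```python
import copy
import threading


class HoldsLock:
    """Hypothetical stand-in for any object that keeps a lock among its attributes."""

    def __init__(self):
        self.name = "example"
        self.lock = threading.RLock()


def try_deepcopy(obj):
    """Return a deep copy of obj, or the TypeError raised while copying it."""
    try:
        return copy.deepcopy(obj)
    except TypeError as exc:
        return exc


# deepcopy walks every attribute; the RLock cannot be reduced for copying,
# so the whole copy fails with a TypeError that names the RLock.
result = try_deepcopy(HoldsLock())
print(type(result).__name__, result)
```

The exact wording of the message differs between Python versions, but in each case it is a TypeError that mentions _thread.RLock, matching the last line of the traceback above.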

Answer 1:


After looking at this problem again, the documentation clearly states that retry_delay should be a timedelta. So it's just luck that the DAG works at all if you pass an integer instead of a timedelta for retry_delay.

In models.py, BaseOperator:

   :param retry_delay: delay between retries
   :type retry_delay: timedelta
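
One way to follow the documented type is to normalize the value before handing it to the operator. This is a minimal sketch: as_timedelta is a hypothetical helper, not part of Airflow, and it assumes that a bare number is meant as seconds.

```python
from datetime import timedelta


def as_timedelta(value):
    """Return value unchanged if it is already a timedelta, else treat it as seconds."""
    if isinstance(value, timedelta):
        return value
    return timedelta(seconds=value)


# Build the delay once, then pass it as retry_delay=retry_delay to the sensor.
retry_delay = as_timedelta(30)
print(retry_delay)  # 0:00:30
```

Writing retry_delay=timedelta(seconds=30) directly in the ExternalTaskSensor call has the same effect, and the question reports that clearing task instances works in that case.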


Source: https://stackoverflow.com/questions/48933382/airflow-1-9-0-externaltasksensor-retry-delay-30-yields-typeerror-cant-pickle
