airflow

AssertionError: INTERNAL: No default project is specified

≯℡__Kan透↙ submitted on 2019-12-21 17:55:10
Question: New to Airflow. I'm trying to run SQL and store the result in a BigQuery table, but I get the following error. Not sure where to set up the default_project_id. Please help me. Error:
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 28, in <module>
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 585, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py",
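
The traceback above is cut off, but this assertion usually means the BigQuery hook could not find a project id on the connection it was handed. A hedged sketch of the usual remedy (connection name, query, and table below are placeholder assumptions, not from the thread): create a "Google Cloud Platform" connection in Admin -> Connections with its Project Id field filled in, and point the operator at it.

    from datetime import datetime
    from airflow import DAG
    from airflow.contrib.operators.bigquery_operator import BigQueryOperator

    dag = DAG('bq_example', start_date=datetime(2019, 1, 1), schedule_interval=None)

    # Assumes a connection `my_gcp_conn` whose "Project Id" field is set;
    # the BigQuery hook reads its default project from that connection.
    run_query = BigQueryOperator(
        task_id='run_query',
        bql='SELECT 1',  # older releases use `bql`; newer ones accept `sql`
        destination_dataset_table='my_dataset.my_table',
        bigquery_conn_id='my_gcp_conn',
        dag=dag,
    )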

Airflow Jinja Rendered Template

限于喜欢 submitted on 2019-12-21 13:40:54
Question: I've been able to successfully render Jinja templates using render_template, the function within BaseOperator. My question is: does anyone know the requirements for getting rendered strings into the UI under the Rendered or Rendered Template tab? Referring to this tab in the UI (screenshot not included). Any help or guidance here would be appreciated. Answer 1: If you are using templated fields in an operator, the strings created from those templated fields will be shown there. E.g. with a BashOperator: example_task =
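
The answer's code is truncated at "example_task ="; a minimal sketch of what such a task might look like (task id and command are assumptions). bash_command is one of BashOperator's templated fields, so the string produced after Jinja rendering is what the UI shows in the Rendered tab:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG('render_example', start_date=datetime(2019, 1, 1), schedule_interval=None)

    # bash_command is in BashOperator.template_fields, so its rendered
    # value appears under the task's Rendered tab in the UI.
    example_task = BashOperator(
        task_id='example_task',
        bash_command='echo "run date: {{ ds }}"',  # {{ ds }} renders to the execution date
        dag=dag,
    )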

AWS Lambda and Apache Airflow integration

半城伤御伤魂 submitted on 2019-12-21 10:14:11
Question: I wondered if anyone could shed some light on this issue: I'm trying to locate the Airflow REST API URL to trigger a DAG run from an AWS Lambda function. So far, from all the relevant documentation on the Apache Incubator site, the only guidance toward solving the problem is to use this URL structure in the Lambda (Python 3.6) code: Apache Experimental API: https://airflow.apache.org/api.html#endpoints. Based on that link, the syntax should read: http://airflow_hostname/api
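
In Airflow 1.10.x the experimental REST API exposes a trigger endpoint at /api/experimental/dags/<dag_id>/dag_runs. A rough Lambda sketch under that assumption (host and DAG id are placeholders; authentication is not shown):

    import requests  # not in the Lambda runtime by default; bundle it with the deployment package

    AIRFLOW_HOST = "http://airflow_hostname"  # placeholder host from the question

    def lambda_handler(event, context):
        # POST to the experimental API to create a new DAG run.
        url = AIRFLOW_HOST + "/api/experimental/dags/my_dag_id/dag_runs"
        response = requests.post(url, json={"conf": {}})
        response.raise_for_status()
        return response.json()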

Apache Airflow: Control over logging [Disable/Adjust logging level]

可紊 submitted on 2019-12-21 08:23:17
Question: I am using Airflow 1.7.1.3, installed using pip. I would like to limit logging to the ERROR level for the workflows executed by the scheduler. I could not find anything beyond setting the log file location in the settings.py file. Online resources led me to this Google Groups discussion, but there is not much information there either. Any idea how to control logging in Airflow? Answer 1: The logging functionality and its configuration will be changed in version 1.9 with this commit. Answer 2: I tried below
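
On 1.9+ the supported route is the logging_level option under [core] in airflow.cfg; on 1.7.x a common workaround (an assumption here, not the thread's accepted fix) is to raise the root logger level in code, e.g. in the DAG file or settings module:

    import logging

    # Suppress INFO/WARNING chatter from the scheduler and tasks;
    # only ERROR and above will be emitted by the root logger.
    logging.getLogger().setLevel(logging.ERROR)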

airflow initdb fails with ImportError: cannot import name HiveOperator

怎甘沉沦 submitted on 2019-12-21 07:03:24
Question: I recently installed Airflow for my workflows. While creating my project, I executed the following command: airflow initdb, which returned the following error:
[2016-08-15 11:17:00,314] {__init__.py:36} INFO - Using executor SequentialExecutor
DB: sqlite:////Users/mikhilraj/airflow/airflow.db
[2016-08-15 11:17:01,319] {db.py:222} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
ERROR [airflow
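
The error text is cut off above, but this ImportError is commonly reported when the Hive extras are missing from the installation. One frequently suggested fix (an assumption, since the thread's answers are truncated) is to reinstall with the hive extra:

    pip install "airflow[hive]"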

Removing Airflow task logs

浪尽此生 submitted on 2019-12-21 06:51:31
Question: I'm running 5 DAGs which have generated a total of about 6 GB of log data in the base_log_folder over a month. I just added a remote_base_log_folder, but it seems it does not exclude logging to the base_log_folder. Is there any way to automatically remove old log files, rotate them, or force Airflow not to log on disk (base_log_folder) but only to remote storage? Answer 1: Please refer to https://github.com/teamclairvoyant/airflow-maintenance-dags. This plugin has DAGs that can kill halted tasks and
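
Besides the maintenance DAGs linked above, a self-contained sketch of the same idea (path and retention period are assumptions) is to periodically delete log files older than some cutoff, e.g. from a cron job or a PythonOperator:

    import os
    import time

    BASE_LOG_FOLDER = "/path/to/airflow/logs"  # match base_log_folder in airflow.cfg
    MAX_AGE_SECONDS = 30 * 24 * 60 * 60        # keep 30 days of logs

    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(BASE_LOG_FOLDER):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Remove any log file whose last modification is past the cutoff.
            if now - os.path.getmtime(path) > MAX_AGE_SECONDS:
                os.remove(path)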

Scheduler interval and start time behave incorrectly in Apache Airflow

微笑、不失礼 submitted on 2019-12-21 06:26:13
Question: I can't find a solution for the start time of my tasks. I have the code and can't find where I'm wrong. When I ran the DAG on 25.03, 26.03 and 27.03, the tasks completed, but today (28.03) the tasks did not start at 6:48. I have tried cron expressions, pendulum, and datetime, and the result is the same. Local time (UTC+3) and Airflow's time (UTC) are different. I've tried using each time (local, Airflow) in start_date and schedule_interval, with no result. Using Ubuntu, Airflow v1.9.0 and the LocalExecutor. emailname =
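
The code is truncated, but two Airflow 1.9 behaviors usually explain this: start_date and cron schedules are interpreted in UTC, and a run for a given interval only fires once that interval has ended. A hedged sketch (DAG id and dates are assumptions) of a daily 6:48-local (UTC+3) schedule:

    from datetime import datetime
    from airflow import DAG

    dag = DAG(
        dag_id='example_utc_schedule',
        start_date=datetime(2018, 3, 25),  # interpreted as UTC in Airflow 1.9
        schedule_interval='48 3 * * *',    # 3:48 UTC == 6:48 local time at UTC+3
    )
    # Note: the run stamped 2018-03-28 only executes once that interval
    # closes, i.e. at the following 3:48 UTC tick.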

Airflow BashOperator doesn't work but PythonOperator does

自作多情 submitted on 2019-12-21 05:15:41
Question: I seem to have a problem with BashOperator. I'm using Airflow 1.10 installed on CentOS in a Miniconda environment (Python 3.6), using the package from conda-forge. When I run airflow test tutorial pyHi 2018-01-01, the output is "Hello world!" as expected. However, when I run airflow test tutorial print_date 2018-01-01 or airflow test tutorial templated 2018-01-01, nothing happens. This is the Linux shell output: (etl) [root@VIRT02 airflow]# airflow test tutorial sleep 2015-06-01 [2018-09-28 19:56
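
The shell output is truncated, so the accepted resolution isn't visible here; one plausible angle (an assumption, not the thread's answer) is that BashOperator routes the command's stdout through the task logger rather than printing it directly, so what airflow test shows on the console depends on the logging configuration. A minimal sketch mirroring the tutorial's print_date task:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG('tutorial', start_date=datetime(2018, 1, 1))

    # The stdout of `date` is captured line by line and emitted via the
    # task logger, not written straight to the console.
    print_date = BashOperator(
        task_id='print_date',
        bash_command='date',
        dag=dag,
    )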