Question
We are using Airflow as a scheduler. I want to invoke a simple bash operator in a DAG. The bash script needs a password as an argument to do further processing.
How can I store a password securely in Airflow (config/variables/connections) and access it in the DAG definition file?
I am new to Airflow and Python, so a code snippet would be appreciated.
Answer 1:
You can store the password in an Airflow Connection - it will be encrypted at rest as long as you have set up your Fernet key.
Here is how you can create a connection:
from airflow import settings
from airflow.models import Connection

def create_conn(username, password, host=None):
    new_conn = Connection(conn_id=f'{username}_connection',
                          login=username,
                          host=host if host else None)
    new_conn.set_password(password)
    # Persist the connection to the metadata DB, or it is lost.
    session = settings.Session()
    session.add(new_conn)
    session.commit()
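A hypothetical call, where the username, password, and host values are placeholders:

create_conn('alice', 's3cr3t', host='db.example.com')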
This password is then encrypted in the metadata database you set up.
To access this password:
from airflow.hooks.base_hook import BaseHook
connection = BaseHook.get_connection("username_connection")
password = connection.password # This is a getter that returns the unencrypted password.
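To tie this back to the question, here is a minimal sketch that hands the decrypted password to a bash script as an argument; run_script.sh is a placeholder, and a dag object is assumed to exist:

from airflow.hooks.base_hook import BaseHook
from airflow.operators.bash_operator import BashOperator

password = BaseHook.get_connection("username_connection").password

run_script = BashOperator(
    task_id='run_script',
    # The param is rendered into the command before execution.
    bash_command='./run_script.sh {{ params.password }}',
    params={'password': password},
    dag=dag)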
EDIT: There is an easier way to create a connection via the UI: open Admin -> Connections in the Airflow web interface, create a new connection, and fill in the Login and Password fields there.
Answer 2:
You can store the password in Airflow Variables: https://airflow.incubator.apache.org/ui.html#variable-view
- Create a variable with a key and value in the UI, for example mypass:XXX
- Import Variable:
from airflow.models import Variable
- Read the variable: MyPass = Variable.get("mypass")
- Pass MyPass to your bash script:
command = """
echo "{{ params.my_param }}"
"""

task = BashOperator(
    task_id='templated',
    bash_command=command,
    params={'my_param': MyPass},
    dag=dag)
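A variant of the same idea, not in the original answer: Airflow's Jinja context also exposes Variables directly as var.value, so the lookup can happen at render time instead of at DAG-parse time:

task = BashOperator(
    task_id='templated_var',
    # {{ var.value.mypass }} fetches the Variable when the task runs.
    bash_command='echo "{{ var.value.mypass }}"',
    dag=dag)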
Answer 3:
In this case I would use a PythonOperator, from which you can get a hook on your database connection via hook = PostgresHook(postgres_conn_id=postgres_conn_id). You can then call get_connection on this hook, which gives you a Connection object from which you can read the host, login and password for your database connection. Finally, use for example subprocess.call(['your_script.sh', connection_string]), passing the connection details as a parameter.
This method is a bit convoluted, but it does allow you to keep the encryption of database connections in Airflow. You should also be able to pull this strategy into a separate operator class that inherits the base behaviour from PythonOperator and adds the logic for getting the hook and calling the bash script.
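A minimal sketch of this approach, assuming Airflow 1.x import paths; my_postgres, your_script.sh, and the connection-string format are placeholders, and a dag object is assumed to exist:

import subprocess

from airflow.hooks.postgres_hook import PostgresHook
from airflow.operators.python_operator import PythonOperator

def call_script(**kwargs):
    # The hook decrypts the connection stored in Airflow's metadata DB.
    hook = PostgresHook(postgres_conn_id='my_postgres')
    conn = hook.get_connection('my_postgres')
    connection_string = 'postgresql://{}:{}@{}'.format(
        conn.login, conn.password, conn.host)
    # Hand the connection details to the bash script as an argument.
    subprocess.call(['./your_script.sh', connection_string])

call_script_task = PythonOperator(
    task_id='call_script',
    python_callable=call_script,
    dag=dag)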
Answer 4:
This is what I've used.
from airflow import DAG, settings
from airflow.models import Connection
from airflow.operators.python_operator import PythonOperator

def add_slack_token(ds, **kwargs):
    """Add a Slack token as an Airflow connection."""
    session = settings.Session()

    new_conn = Connection(conn_id='slack_token')
    new_conn.set_password(SLACK_LEGACY_TOKEN)  # the token itself, defined elsewhere

    # Only add the connection if it does not exist yet.
    if not (session.query(Connection)
                   .filter(Connection.conn_id == new_conn.conn_id).first()):
        session.add(new_conn)
        session.commit()
    else:
        msg = '\n\tA connection with `conn_id`={conn_id} already exists\n'
        msg = msg.format(conn_id=new_conn.conn_id)
        print(msg)

dag = DAG(
    'add_connections',
    default_args=default_args,
    schedule_interval="@once")

t2 = PythonOperator(
    dag=dag,
    task_id='add_slack_token',
    python_callable=add_slack_token,
    provide_context=True,
)
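Once this DAG has run (note the @once schedule), other DAGs can read the token back with the same pattern as in Answer 1:

from airflow.hooks.base_hook import BaseHook

slack_token = BaseHook.get_connection('slack_token').password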
Source: https://stackoverflow.com/questions/45280650/store-and-access-password-using-apache-airflow