I am new to GCP and Airflow and am trying to run my Python pipelines through a simple pyodbc connection from Python 3. However, I believe I have found what I need to install on t
I was facing the same problem. The first solution that worked for me was building a Docker image that installs the drivers and then runs the code. Initially I tried to find a way to install the drivers on the cluster itself, but after many failures I read in the documentation that the Airflow image in Cloud Composer is curated by Google and no changes affecting the image are allowed. So here is my Dockerfile:
FROM python:3.7-slim-buster
#FROM gcr.io/data-development-254912/gcp_bi_baseimage
#FROM gcp_bi_baseimage
LABEL maintainer=" "
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
# install build tools, unixODBC headers and the Microsoft ODBC Driver 17 for SQL Server (plus nano for convenience)
RUN apt-get update \
&& apt-get install --yes --no-install-recommends \
apt-utils \
apt-transport-https \
curl \
gnupg \
unixodbc-dev \
gcc \
g++ \
nano \
&& curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \
&& curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list \
&& apt-get update \
&& ACCEPT_EULA=Y apt-get install --yes --no-install-recommends msodbcsql17 \
&& apt-get install --yes --no-install-recommends libgssapi-krb5-2 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& rm -rf /tmp/*
RUN pip install -r requirements.txt
CMD ["python","app.py"]
requirements.txt:
pyodbc==4.0.28
google-cloud-bigquery==1.24.0
google-cloud-storage==1.26.0
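For completeness, here is a minimal sketch of the kind of app.py that runs inside this container; the server, database, table and environment variable names are placeholders you would replace with your own:
# app.py -- illustrative only; connection details are read from environment variables
import os
import pyodbc

def main():
    # the driver name must match the driver installed in the image (msodbcsql17)
    conn_str = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={os.environ['MSSQL_HOST']};"
        f"DATABASE={os.environ['MSSQL_DB']};"
        f"UID={os.environ['MSSQL_USER']};"
        f"PWD={os.environ['MSSQL_PASSWORD']}"
    )
    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT TOP 10 * FROM some_table")  # placeholder query
        for row in cursor.fetchall():
            print(row)
    finally:
        conn.close()

if __name__ == "__main__":
    main()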
You should be good from this point on.
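Once the image is built and pushed to a registry your Composer environment can pull from (for example Container Registry), the DAG task that runs it looks roughly like the sketch below. This uses the Airflow 1.10.x-style import that Composer shipped at the time; the dag id, namespace and image path are placeholders:
# sketch of a DAG that runs the container via KubernetesPodOperator
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

with DAG(
    dag_id="run_pyodbc_pipeline",          # placeholder dag id
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    run_pipeline = KubernetesPodOperator(
        task_id="run_pipeline",
        name="run-pipeline",
        namespace="default",
        image="gcr.io/your-project/your-image:latest",   # placeholder image path
        image_pull_policy="Always",
        get_logs=True,
        is_delete_operator_pod=True,       # clean up the pod when the task finishes
    )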
Since then I have managed to set up an Airflow named connection to our SQL Server and am using MsSqlOperator or MsSqlHook. I worked with a cloud engineer to get the networking set up just right. What I found is that the named connection is much easier to use, yet the KubernetesPodOperator is still much more reliable.
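For the named-connection route, the usage is roughly as follows, assuming a connection id (here my_sql_server) created in the Airflow UI; note that MsSqlOperator and MsSqlHook go through Airflow's own MsSql hook rather than the pyodbc driver baked into the image above, and the dag id, database and query are placeholders:
# sketch of a task using the named connection
from datetime import datetime

from airflow import DAG
from airflow.operators.mssql_operator import MsSqlOperator  # Airflow 1.10.x import path

with DAG(
    dag_id="mssql_named_connection_example",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
) as dag:
    row_count = MsSqlOperator(
        task_id="row_count",
        mssql_conn_id="my_sql_server",   # the named connection defined in the Airflow UI
        database="my_database",
        sql="SELECT COUNT(*) FROM some_table",
    )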