AWS Glue - Truncate destination postgres table prior to insert

Submitted by 浪尽此生 on 2019-11-27 18:26:20

Question


I am trying to truncate a PostgreSQL destination table before inserting into it and, more generally, to run external functions using the connections already created in Glue.

Has anyone been able to do so?


Answer 1:


I've tried the DROP/TRUNCATE scenario. I have not been able to do it with the connections already created in Glue, but it does work with a pure-Python PostgreSQL driver, pg8000.

  1. Download the pg8000 tarball from PyPI and extract it
  2. Create an empty __init__.py in the root folder
  3. Zip up the contents and upload the zip file to S3
  4. Reference the zip file in the Python lib path of the job
  5. Set the DB connection details as job parameters (make sure to prepend all key names with --). Tick the "Server-side encryption" box.
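
Steps 2 and 3 can also be scripted. The following is only a minimal sketch using the standard library; the helper name build_library_zip and the paths passed to it are hypothetical, and it assumes the tarball has already been extracted to a local folder. The upload to S3 is not shown.

```python
import os
import zipfile


def build_library_zip(source_dir, zip_path):
    """Zip the contents of source_dir (not the folder itself) into zip_path.

    zip_path should live outside source_dir so the archive does not
    end up containing itself.
    """
    # Step 2: make sure an empty __init__.py exists in the root folder.
    init_file = os.path.join(source_dir, "__init__.py")
    if not os.path.exists(init_file):
        open(init_file, "w").close()

    # Step 3: store every file under a path relative to source_dir,
    # so the package folder sits at the top level of the archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(folder, name)
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

Calling build_library_zip with the extracted folder and an output path such as pg8000.zip (both placeholders) should produce the file that step 4 refers to.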

Then you can simply create a connection and execute SQL.

import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job

import pg8000

args = getResolvedOptions(sys.argv, [
    'JOB_NAME',
    'PW',
    'HOST',
    'USER',
    'DB'
])
# ...
# Create Spark & Glue context
sc = SparkContext()
glueContext = GlueContext(sc)

job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# ...
config_port = 5432
conn = pg8000.connect(
    database=args['DB'], 
    user=args['USER'], 
    password=args['PW'],
    host=args['HOST'],
    port=config_port
)
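# schema and table are not defined in this snippet; set them beforehand
# (for example from additional job parameters).
query = "TRUNCATE TABLE {0};".format(".".join([schema, table]))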
cur = conn.cursor()
cur.execute(query)
conn.commit()
cur.close()
conn.close()
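
Because the schema and table names are formatted straight into the SQL string, it may be worth quoting them first, especially if they come from job parameters. Below is a rough sketch of one way to do that; quote_ident and truncate_table are hypothetical helper names, not part of pg8000 or Glue, and conn is assumed to be any DB-API style connection such as the one returned by pg8000.connect above.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier: {!r}".format(name))
    return '"' + name.replace('"', '""') + '"'


def truncate_table(conn, schema, table):
    """Run TRUNCATE on schema.table and commit; returns the SQL that was sent."""
    query = "TRUNCATE TABLE {0}.{1};".format(quote_ident(schema), quote_ident(table))
    cur = conn.cursor()
    try:
        cur.execute(query)
        conn.commit()
    finally:
        # Close the cursor even if the statement fails.
        cur.close()
    return query
```

Note that quoted identifiers are case-sensitive in PostgreSQL, so the names passed in must match the stored names exactly. The caller is still responsible for closing the connection.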



Answer 2:


After following step 4 of @thenaturalist's answer, the following worked for me on a development endpoint (Zeppelin notebook):

sc.addPyFile("/home/glue/downloads/python/pg8000.zip")

import pg8000

More info: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-libraries.html



Source: https://stackoverflow.com/questions/47081088/aws-glue-truncate-destination-postgres-table-prior-to-insert
