Overwrite MySQL tables with AWS Glue

后端 未结 3 1016
时光取名叫无心
时光取名叫无心 2021-01-12 17:38

I have a lambda process which occasionally polls an API for recent data. This data has unique keys, and I\'d like to use Glue to update the table in MySQL. Is there an optio

3条回答
  •  忘掉有多难
    2021-01-12 18:39

    The workaround I've come up with, which is a little simpler than the alternative posted, is the following:

    • Create a staging table in mysql, and load your new data into this table.
    • Run the command: REPLACE INTO myTable SELECT * FROM myStagingTable;
    • Truncate the staging table

    This can be done with:

    import sys from awsglue.transforms
    import * from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    
    ## @params: [JOB_NAME]
    args = getResolvedOptions(sys.argv, ['JOB_NAME'])
    
    sc = SparkContext()
    glueContext = GlueContext(sc)
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args['JOB_NAME'], args)
    
    import pymysql
    pymysql.install_as_MySQLdb()
    import MySQLdb
    db = MySQLdb.connect("URL", "USERNAME", "PASSWORD", "DATABASE")
    cursor = db.cursor()
    cursor.execute("REPLACE INTO myTable SELECT * FROM myStagingTable")
    cursor.fetchall()
    
    db.close()
    job.commit()
    

提交回复
热议问题