Extract Embedded AWS Glue Connection Credentials Using Scala

Submitted by 我的未来我决定 on 2021-01-29 14:17:51

Question


I have a Glue job that reads directly from Redshift, which requires connection credentials. I have created an embedded Glue connection and can extract its credentials with the following PySpark code. Is there a way to do this in Scala?

import boto3

glue = boto3.client('glue', region_name='us-east-1')

# HidePassword=False returns the stored password in the response
response = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)
    
table = spark.read.format(
    'com.databricks.spark.redshift'
).option(
    'url',
    'jdbc:redshift://prod.us-east-1.redshift.amazonaws.com:5439/db'
).option(
    'user',
    response['Connection']['ConnectionProperties']['USERNAME']
).option(
    'password',
    response['Connection']['ConnectionProperties']['PASSWORD']
).option(
    'dbtable',
    'db.table'
).option(
    'tempdir',
    's3://config/glue/temp/redshift/'
).option(
    'forward_spark_s3_credentials', 'true'
).load()

Answer 1:


AWS does not provide a Scala equivalent for issuing this API call, but you can use the Java SDK from Scala code, as mentioned in this answer.

The Java SDK call to use is getConnection. If you don't want to do that, you can follow this approach instead:

  1. Create an AWS Glue Python shell job and retrieve the connection information.

  2. Once you have the values, start the Scala Glue job from inside the Python shell job, passing them as arguments, as shown below:

import boto3

glue = boto3.client('glue', region_name='us-east-1')

# Retrieve the connection, including the unmasked password
response = glue.get_connection(
    Name='name-of-embedded-connection',
    HidePassword=False
)

props = response['Connection']['ConnectionProperties']

# Start the Scala job, passing the credentials as job arguments
job_run = glue.start_job_run(
    JobName='my_scala_Job',
    Arguments={
        '--username': props['USERNAME'],
        '--password': props['PASSWORD']
    }
)

  3. Then access these parameters inside your Scala job using getResolvedOptions, as shown below:

import com.amazonaws.services.glue.util.GlueArgParser

// sysArgs is the Array[String] received by the job's main method
val args = GlueArgParser.getResolvedOptions(
  sysArgs,
  Array("username", "password")
)
val user = args("username")
val pwd  = args("password")
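
The Python shell job side of this approach could also be split into two small functions, so that the mapping from connection properties to job arguments can be checked without touching AWS. This is only a sketch under assumptions: the function names build_job_arguments and start_scala_job are hypothetical, boto3 is imported inside the function that needs it, and the job and connection names are the placeholders used above.

```python
def build_job_arguments(connection_response):
    """Map a glue.get_connection response to Glue job arguments.

    Hypothetical helper: expects the USERNAME and PASSWORD keys used above.
    """
    props = connection_response["Connection"]["ConnectionProperties"]
    missing = [key for key in ("USERNAME", "PASSWORD") if key not in props]
    if missing:
        raise KeyError("connection is missing properties: " + ", ".join(missing))
    # Glue job arguments are passed as '--name': 'value' pairs
    return {
        "--username": props["USERNAME"],
        "--password": props["PASSWORD"],
    }


def start_scala_job(job_name, connection_name, region_name="us-east-1"):
    """Fetch the connection and start the Scala job with its credentials."""
    import boto3  # imported here so build_job_arguments works without boto3

    glue = boto3.client("glue", region_name=region_name)
    connection = glue.get_connection(Name=connection_name, HidePassword=False)
    run = glue.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(connection),
    )
    return run["JobRunId"]
```

Calling start_scala_job('my_scala_Job', 'name-of-embedded-connection') from the Python shell job would then start the Scala job, which reads the two values with getResolvedOptions as in step 3.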


Source: https://stackoverflow.com/questions/63385665/extract-embedded-aws-glue-connection-credentials-using-scala
