How to connect to Amazon Redshift or other DBs in Apache Spark?

刺人心 2021-01-13 09:44

I'm trying to connect to Amazon Redshift via Spark, so I can join data we have on S3 with data on our RS cluster. I found some very spartan documentation here for the capab…

6 Answers
  •  青春惊慌失措
    2021-01-13 10:07

    It turns out you only need a username/password to access Redshift in Spark, and it is done as follows (using the Python API):

    from pyspark.sql import SQLContext

    # sc is the existing SparkContext (created automatically in the pyspark shell).
    sqlContext = SQLContext(sc)

    # Redshift speaks the PostgreSQL wire protocol, so the standard PostgreSQL JDBC
    # driver works; make sure the driver jar is on Spark's classpath.
    df = sqlContext.read.load(
        format="jdbc",
        url="jdbc:postgresql://host:port/dbname?user=yourusername&password=secret",
        dbtable="schema.table"
    )
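
    Since the question is about joining data on S3 with data in Redshift, here is a minimal sketch of how the two DataFrames could be joined once both are loaded; the S3 bucket/path, the Parquet format, and the join column user_id are placeholders for illustration, not part of the setup above:

    # Load the S3 side (assuming Parquet files at a hypothetical bucket/path).
    s3_df = sqlContext.read.parquet("s3n://your-bucket/path/to/data")

    # Join with the Redshift DataFrame loaded above on a hypothetical shared key.
    joined = df.join(s3_df, df.user_id == s3_df.user_id)
    joined.show()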
    

    Hope this helps someone!
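
    For what it's worth, if you are on Spark 2.x or later, the same read goes through SparkSession instead of SQLContext. A rough equivalent, using the same placeholder host, database, and credentials as above:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("redshift-jdbc").getOrCreate()

    # Same JDBC options as before, passed individually via option().
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://host:port/dbname")
          .option("dbtable", "schema.table")
          .option("user", "yourusername")
          .option("password", "secret")
          .load())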
