Using pyspark to connect to PostgreSQL

逝去的感伤 2020-12-01 04:50

I am trying to connect to a database with pyspark and I am using the following code:

sqlctx = SQLContext(sc)
df = sqlctx.load(
    url = "jdbc:postgresql

10 Answers
  •  情深已故
    2020-12-01 05:19

    Start pyspark with the PostgreSQL JDBC driver jar passed via --jars.

    E.g.: pyspark --jars /path/Downloads/postgresql-42.2.16.jar

    Then create a DataFrame with the JDBC reader, as the other answers suggest.

    E.g.:

    df2 = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://localhost:5432/db")
        .option("dbtable", "yourTableHere")
        .option("user", "postgres")
        .option("password", "postgres")
        .option("driver", "org.postgresql.Driver")
        .load()
    )
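
    If you read more than one table, it may help to keep the connection settings in one place. The following is only a sketch: the helper names postgres_jdbc_options and read_postgres_table are hypothetical, and the host, port, database, table, and credentials are the placeholder values from the example above. It assumes a SparkSession named spark already exists, as it does in a pyspark shell started with --jars.

```python
def postgres_jdbc_options(host, port, database, table, user, password):
    """Build the option dict for Spark's JDBC reader (hypothetical helper)."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def read_postgres_table(spark, options):
    """Load one table into a DataFrame using an existing SparkSession."""
    # DataFrameReader.options accepts the settings as keyword arguments.
    return spark.read.format("jdbc").options(**options).load()


# Placeholder values from the answer above; replace them with your own.
opts = postgres_jdbc_options(
    "localhost", 5432, "db", "yourTableHere", "postgres", "postgres"
)
print(opts["url"])  # jdbc:postgresql://localhost:5432/db
```

    Inside the pyspark shell you would then call df2 = read_postgres_table(spark, opts). Building the dict does not touch the database, so a wrong URL or a missing driver jar only shows up when load() runs.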
    
