Question
I have a SQL query stored in a Python variable, and we use a Snowflake database. So far I have loaded the query result into a pandas DataFrame, but I need to convert it to a Spark DataFrame and then register it with createOrReplaceTempView. I tried:
import pandas as pd
import sf_connectivity  # our own module for establishing a connection to the Snowflake database
emp = 'Select * From Employee'
snowflake_connection = sf_connectivity.collector()  # method that establishes the Snowflake connection
pd_df = pd.read_sql_query(emp, snowflake_connection)
Requirement 1: create a Spark DataFrame (sf_df) from the pandas DataFrame (pd_df)
Requirement 2: sf_df.createOrReplaceTempView('Temp_Employee')
How can I make this work?
Answer 1:
Per my comment on the question above, you would likely be better off loading the data directly into a Spark DataFrame using the Snowflake Spark connector, rather than going through pandas first. This documentation page details how to do that:
https://docs.snowflake.com/en/user-guide/spark-connector-use.html#moving-data-from-snowflake-to-spark
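As a rough illustration of that approach, here is a minimal sketch. It assumes PySpark is installed and that the Snowflake Spark connector and Snowflake JDBC driver jars are available to the Spark session. The helper names (build_sf_options, register_query_as_view, register_pandas_as_view) are hypothetical and not from the original post, and the connection values are placeholders. The last helper shows the pandas route asked about in the question (spark.createDataFrame on an existing pandas DataFrame), which may be adequate for small tables but pulls all rows through the driver.

```python
# Sketch only: the helper names are hypothetical, and every connection
# value shown in the usage comment below is a placeholder.

# Data source name of the Snowflake Spark connector.
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"


def build_sf_options(url, user, password, database, schema, warehouse):
    """Collect the connector's connection options into one dict."""
    return {
        "sfURL": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def register_query_as_view(spark, sf_options, query, view_name):
    """Load the result of `query` from Snowflake as a Spark DataFrame
    and register it as a temporary view."""
    sf_df = (
        spark.read.format(SNOWFLAKE_SOURCE_NAME)
        .options(**sf_options)
        .option("query", query)
        .load()
    )
    sf_df.createOrReplaceTempView(view_name)
    return sf_df


def register_pandas_as_view(spark, pd_df, view_name):
    """Alternative: convert an existing pandas DataFrame to a Spark
    DataFrame and register it as a temporary view."""
    sf_df = spark.createDataFrame(pd_df)
    sf_df.createOrReplaceTempView(view_name)
    return sf_df


# Example usage (needs pyspark and the connector jars on the classpath):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.appName("employee_view").getOrCreate()
# opts = build_sf_options("<account>.snowflakecomputing.com", "<user>",
#                         "<password>", "<database>", "<schema>", "<warehouse>")
# register_query_as_view(spark, opts, "Select * From Employee", "Temp_Employee")
# spark.sql("SELECT COUNT(*) FROM Temp_Employee").show()
```

With either helper, the view name is passed as a string, so later spark.sql queries can refer to Temp_Employee directly.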
Source: https://stackoverflow.com/questions/62177418/how-to-create-a-spark-data-frame-from-pandas-data-frame-using-snow-flake-and-pyt