“resolved attribute(s) missing” when performing join on pySpark

Submitted by こ雲淡風輕ζ on 2019-12-03 05:43:45

You can perform the join this way in PySpark; see if this works for you:

df1 = df_lag_pre.alias("df1")
df2 = df_unmatched.alias("df2")
join_both = df1.join(
    df2,
    (col("df1.name") == col("df2.name"))
    & (col("df1.country") == col("df2.country"))
    & (col("df1.ccy_code") == col("df2.ccy_code"))
    & (col("df1.usd_price") == col("df2.usd_price")),
    'inner'
)

Update: if you get a "name 'col' is not defined" error, add the import below:

from pyspark.sql.functions import col