TypeError: 'Column' object is not callable using withColumn

广开言路 2020-12-17 23:15

I would like to append a new column to the DataFrame "df", computed by the function get_distance:

def get_distance(x, y):
         


        
2 Answers
  •  抹茶落季
    2020-12-17 23:53

    • You cannot apply a plain Python function to Column objects directly unless it is written to operate on Column objects / expressions. You need a udf for that:

      from pyspark.sql.functions import udf

      @udf  # with no argument, the return type defaults to StringType
      def get_distance(x, y):
          ...
      
    • But you cannot use SQLContext inside a udf (or inside a mapper in general).

    • Just join:

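      from pyspark.sql.functions import first
      # One row per (column1, column2) pair, keeping the first column3 value.
      tab = hiveContext.table("tab").groupBy("column1", "column2").agg(first("column3"))
      df.join(tab, ["column1", "column2"])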
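
To illustrate the udf point above, here is a minimal sketch. The body of get_distance is elided in the question, so the absolute-difference body below is purely a hypothetical stand-in, and the helper name add_distance and the column names "x", "y" and "distance" are assumptions as well. It only applies when the function needs nothing from SQLContext; otherwise the join is the way to go.

```python
def get_distance(x, y):
    # Hypothetical body: the real function is not shown in the question.
    # It receives plain Python values (one row at a time), not Column objects.
    if x is None or y is None:
        return None
    return float(abs(x - y))


def add_distance(df, x_col="x", y_col="y"):
    """Return df with an extra "distance" column (assumed column names)."""
    # Imported lazily so get_distance stays usable without a Spark install.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import DoubleType

    # Wrap the plain function; declare the return type explicitly,
    # since udf() would otherwise default to StringType.
    distance_udf = udf(get_distance, DoubleType())
    # The wrapped udf accepts Column expressions, so withColumn works.
    return df.withColumn("distance", distance_udf(col(x_col), col(y_col)))
```

Calling add_distance(df) on a DataFrame with numeric columns "x" and "y" should then return the same rows plus the new column, while get_distance itself can still be tested as ordinary Python.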
      
