Create single row dataframe from list of list PySpark

傲寒 2020-11-27 07:46

I have data like this: data = [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]]. I want to create a single-row PySpark dataframe from it.

I have already tried:

dataframe = SQLCo         


        
3 Answers
  •  庸人自扰
    2020-11-27 08:27

    You should use VectorAssembler. From your code, I guess you are doing this to train a machine learning model, and VectorAssembler works best for that case. You can also add the assembler as a stage in a pipeline.

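    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    # `data` must already be a Spark DataFrame with numeric columns; the parameter is inputCols (plural)
    assemble_feature = VectorAssembler(inputCols=data.columns, outputCol='features')
    pipeline = Pipeline(stages=[assemble_feature])
    pipeline.fit(data).transform(data)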
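
The snippet above assumes `data` is already a DataFrame, while the question starts from a plain list of lists. As a rough sketch of one way to bridge that gap, the helpers below turn the nested list into a single row. The function names, the `flatten` switch, and the `c0, c1, ...` column names are all hypothetical, and `spark` is assumed to be an existing SparkSession that the question does not show.

```python
from itertools import chain

data = [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]]


def single_row(nested, flatten=False):
    """Return a one-element list of rows, i.e. the input for a single-row DataFrame.

    flatten=False keeps each inner list as one array-valued field;
    flatten=True spreads every number into its own numeric field.
    """
    values = list(chain.from_iterable(nested)) if flatten else nested
    return [tuple(values)]


def column_names(row, prefix="c"):
    # Hypothetical naming scheme: c0, c1, ...
    return [f"{prefix}{i}" for i in range(len(row))]


def to_single_row_df(spark, nested, flatten=False):
    # `spark` is an existing SparkSession (assumed, not part of the question)
    rows = single_row(nested, flatten)
    return spark.createDataFrame(rows, column_names(rows[0]))
```

With an active session, `to_single_row_df(spark, data)` should give one row with three array columns, and `to_single_row_df(spark, data, flatten=True)` one row with six numeric columns. The flattened form is likely the one to use with VectorAssembler, since it takes numeric, boolean, or vector input columns rather than array columns.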
    
