How to merge multiple feature vectors in DataFrame?

广开言路 2020-12-08 21:25

Using Spark ML transformers I arrived at a DataFrame where each row looks like this:

Row(object_id, text_features_vector, color_features, type_f         


        
1 Answer
  • 2020-12-08 21:43

    You can use VectorAssembler:

    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.sql.DataFrame
    
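    // Placeholder: substitute the DataFrame produced by your transformers.
    val df: DataFrame = ???
    
    // Input column names must match the columns actually present in df.
    val assembler = new VectorAssembler()
      .setInputCols(Array("text_features", "color_features", "type_features"))
      .setOutputCol("features")
    
    // Appends a "features" column holding the inputs concatenated in order.
    val transformed = assembler.transform(df)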
    

    For a PySpark example, see: Encode and assemble multiple features in PySpark
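To make the effect of the assembler concrete, here is a minimal self-contained sketch. It assumes a local SparkSession and uses made-up toy vectors; the object name `AssembleFeaturesSketch`, the sample values, and the vector sizes are illustrative only and not part of the original question.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; names and data are assumptions for illustration.
object AssembleFeaturesSketch {
  def main(args: Array[String]): Unit = {
    // Local session, for experimentation only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("assemble-features-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy rows with three vector columns of sizes 2, 3 and 2.
    val df = Seq(
      (1L, Vectors.dense(0.1, 0.2), Vectors.dense(1.0, 0.0, 0.0), Vectors.dense(0.0, 1.0)),
      (2L, Vectors.dense(0.3, 0.4), Vectors.dense(0.0, 1.0, 0.0), Vectors.dense(1.0, 0.0))
    ).toDF("object_id", "text_features", "color_features", "type_features")

    // Concatenate the three vector columns into a single "features" column.
    val assembled = new VectorAssembler()
      .setInputCols(Array("text_features", "color_features", "type_features"))
      .setOutputCol("features")
      .transform(df)

    // The output vector holds the inputs in the listed order: 2 + 3 + 2 = 7 entries.
    val first = assembled.select("features").head().getAs[Vector](0)
    assert(first.size == 7)

    assembled.select("object_id", "features").show(truncate = false)
    spark.stop()
  }
}
```

The order of names passed to `setInputCols` determines the order of the values in the assembled vector, so keep it the same between training and prediction.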
