How to transform a categorical variable in Spark into a set of columns coded as {0,1}?

终归单人心 2020-12-29 14:42

I'm trying to perform a logistic regression (LogisticRegressionWithLBFGS) with Spark MLlib (with Scala) on a dataset which contains categorical variables. I discover Spark

4 answers
  •  甜味超标
    2020-12-29 15:07

    A VectorIndexer is coming in Spark 1.4, which might help you with this kind of feature transformation: http://people.apache.org/~pwendell/spark-1.4.0-rc1-docs/api/scala/index.html#org.apache.spark.ml.feature.VectorIndexer

    However, it looks like this will only be available in spark.ml rather than mllib:

    https://issues.apache.org/jira/browse/SPARK-4081
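
Until something like that is available for the RDD-based mllib API, one possible workaround is to dummy-code the column by hand. This is not the VectorIndexer approach above, just a minimal sketch in plain Scala with no Spark dependency. The object `OneHotSketch`, its methods `buildIndex` and `encode`, and the sample colours are all hypothetical names and data made up for illustration.

```scala
// Hypothetical helper, not part of Spark: dummy-code one categorical column.
object OneHotSketch {
  // Assign each distinct category a column index. Sorting first keeps the
  // mapping deterministic across runs.
  def buildIndex(values: Seq[String]): Map[String, Int] =
    values.distinct.sorted.zipWithIndex.toMap

  // Encode one value as a {0,1} array with a single 1.0 at the category's
  // index. A category that is missing from the index encodes as all zeros.
  def encode(index: Map[String, Int], value: String): Array[Double] = {
    val row = Array.fill(index.size)(0.0)
    index.get(value).foreach(i => row(i) = 1.0)
    row
  }

  def main(args: Array[String]): Unit = {
    val colours = Seq("red", "green", "blue", "green") // made-up sample data
    val index = buildIndex(colours)
    colours.foreach { c =>
      println(s"$c -> ${encode(index, c).mkString("[", ", ", "]")}")
    }
  }
}
```

With Spark on the classpath, each encoded array could presumably be combined with the numeric features, wrapped with `Vectors.dense`, and placed in a `LabeledPoint` for LogisticRegressionWithLBFGS. When the model includes an intercept, dropping one of the {0,1} columns avoids perfectly collinear features.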
