How to transform a CSV string into a Spark-ML compatible Dataset<Row> format?
Question

I have a Dataset<Row> df that contains two columns ("key" and "value"), both of type string. df.printSchema(); gives me the following output:

root
 |-- key: string (nullable = true)
 |-- value: string (nullable = true)

The content of the value column is actually a CSV-formatted line (coming from a Kafka topic), with the last entry of that line representing the class label and all the previous entries being the features (the first row is not included in the dataset):

feature0,feature1,label
0
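To make the row layout described above concrete, here is a minimal plain-Java sketch of the per-line split: everything before the last comma-separated field is treated as a feature and the last field as the label. It is only an illustration of the layout, not a Spark pipeline. The class name `CsvLineSplit`, its method names, and the sample line are hypothetical, and the sketch assumes all fields are numeric and contain no quoted commas.

```java
import java.util.Arrays;

public class CsvLineSplit {

    /** Parses every field except the last one as a feature value. */
    public static double[] features(String line) {
        String[] fields = line.split(",");
        double[] out = new double[fields.length - 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = Double.parseDouble(fields[i].trim());
        }
        return out;
    }

    /** Parses the last field as the class label. */
    public static double label(String line) {
        String[] fields = line.split(",");
        return Double.parseDouble(fields[fields.length - 1].trim());
    }

    public static void main(String[] args) {
        // Made-up sample line in the shape feature0,feature1,label.
        String line = "0.5,1.5,1";
        System.out.println(Arrays.toString(features(line)) + " -> " + label(line));
    }
}
```

In Spark itself, the same split would presumably be applied to the value column (for example with `org.apache.spark.sql.functions.split`) before the feature columns are assembled into a vector.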