How to create correct data frame for classification in Spark ML

鱼传尺愫 2020-12-04 10:05

I am trying to run random forest classification using the Spark ML API, but I am having trouble creating the right DataFrame as input to the pipeline.

Here is sample data

3 Answers

天涯浪人 (OP)

2020-12-04 10:32

According to the Spark documentation on MLlib decision trees, it seems to me that you should define the feature map you are using, and each data point should be a LabeledPoint.

This tells the algorithm which column is the prediction target (the label) and which ones are the features.

https://spark.apache.org/docs/latest/mllib-decision-tree.html
