ML.NET: reading in a tab-separated data set

Submitted by 大兔子大兔子 on 2020-01-14 04:26:23

Question


It's my first time using ML.NET 0.8 and I'm having trouble loading my dataset.

var mlContext = new MLContext();

String dataPath = "ML Data 3.txt";
var trainingDataView = mlContext.Data.ReadFromTextFile(
    columns: new TextLoader.Column[]
    {
        new TextLoader.Column("Product", DataKind.Text, 0),
        new TextLoader.Column("Streat", DataKind.R4, 1),
        new TextLoader.Column("Overspray", DataKind.R4, 2),
        new TextLoader.Column("MLS", DataKind.R4, 3),
        new TextLoader.Column("Moisture", DataKind.R4, 4)
    }, path: dataPath);

var data = trainingDataView.Preview();

var pipeline = mlContext.Transforms.Concatenate("Features", "Product", "Streat", "Overspray", "MLS")
    .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(labelColumn: "Moisture", featureColumn: "Features"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedMoisture"));

var model = pipeline.Fit(trainingDataView);

The data preview looks good; however, when the pipeline performs the Fit operation, I receive the following error:

System.InvalidOperationException: 'Column 'Streat' has values of R4which is not the same as earlier observed type of Text.'

I have checked the data and there are no Text elements within the data file, other than the Product column.

Any advice gratefully received.


Answer 1:


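The MulticlassClassification algorithm won't work with text features, only numbers. Concatenate also requires all of its source columns to share one type, which is why the error appears: Product is Text, so the R4 column Streat does not match the type observed first. If Product is some kind of identifier, you should exclude it from the Concatenate call, since it's not a feature: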

mlContext.Transforms.Concatenate("Features", "Streat", "Overspray", "MLS")

If it's some kind of category and should be used as a feature, you can convert it to a numeric representation using one of the transforms, such as OneHotEncoding:

var pipeline = mlContext.Transforms.Categorical.OneHotEncoding("Product")
        .Append(mlContext.Transforms.Concatenate("Features", "Product", "Streat", "Overspray", "MLS"))
        .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(labelColumn: "Moisture", featureColumn: "Features"))
        .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedMoisture"));
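To make it clearer why encoding Product lets it sit alongside the numeric columns, here is a minimal, conceptual sketch of what one-hot encoding does. It is plain C# and is not ML.NET's implementation; the class name OneHotSketch, the method names Encode and BuildFeatures, and the category values "A" and "B" are all hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only: this is NOT how ML.NET implements OneHotEncoding.
public static class OneHotSketch
{
    // Assigns each distinct category a slot index (in order of first appearance),
    // then returns one float vector per row with a single 1 in that row's slot.
    public static float[][] Encode(IReadOnlyList<string> categories)
    {
        var slots = new Dictionary<string, int>();
        foreach (var category in categories)
        {
            if (!slots.ContainsKey(category))
            {
                slots.Add(category, slots.Count);
            }
        }

        return categories
            .Select(category =>
            {
                var vector = new float[slots.Count];
                vector[slots[category]] = 1f;
                return vector;
            })
            .ToArray();
    }

    // Appends a row's numeric values to its encoded category vector.
    // Every element is now a float, so the row forms one uniform feature vector.
    public static float[] BuildFeatures(float[] encoded, params float[] numeric)
    {
        return encoded.Concat(numeric).ToArray();
    }
}
```

Under these assumptions, OneHotSketch.Encode(new[] { "A", "B", "A" }) yields the vectors [1, 0], [0, 1] and [1, 0], and BuildFeatures(new[] { 1f, 0f }, 0.5f, 2f) yields [1, 0, 0.5, 2]. Once the text column is represented as floats, it has the same element type as Streat, Overspray and MLS and can be combined with them.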


Source: https://stackoverflow.com/questions/53961877/ml-net-reading-in-tab-seperated-data-set
