Designing a classification problem for weather data

Submitted by 梦想与她 on 2019-12-08 05:53:25

Question


In a normal two-class or multi-class classification problem, we can use any well-known machine learning algorithm, such as Naive Bayes or SVM, to train and test the model. My problem is that I have been given weather data where the label variable is in a format like "20 % rain, 80 % dry" or "30% cloudy, 70% rain". How should I approach this problem? Will I need to convert the problem into regression somehow? In that case, if there are three labels (rain, dry, cloudy) in the data, what would be the right approach to convert the percentage information to continuous values? Thanks for your time.


Answer 1:


Assuming that the expressions "20 % rain, 80 % dry" and "30% cloudy, 70% rain" represent probabilities, that the classes are mutually exclusive, and that we may ignore a possible ordinal relationship among them (such as "dry > cloudy > rain"), models such as polychotomous (multinomial) logistic regression can be fit to these values as though the data were grouped or replicated.
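To make the "grouped or replicated" idea concrete, here is a minimal sketch using only the Python standard library. It parses a label string into a probability vector and then expands each instance into one weighted row per class with non-zero probability. The class order and the helper names `parse_label` and `expand_replicated` are assumptions made for illustration; they do not come from the question.

```python
import re

# Assumed class order; the question only names these three labels.
CLASSES = ("rain", "dry", "cloudy")


def parse_label(text, classes=CLASSES):
    """Turn a string like '20 % rain, 80 % dry' into a probability vector."""
    probs = dict.fromkeys(classes, 0.0)
    for pct, name in re.findall(r"(\d+(?:\.\d+)?)\s*%\s*([A-Za-z]+)", text):
        name = name.lower()
        if name not in probs:
            raise ValueError(f"unknown class: {name!r}")
        probs[name] = float(pct) / 100.0
    total = sum(probs.values())
    if total <= 0:
        raise ValueError(f"no percentages found in {text!r}")
    # Renormalise so the vector sums to 1 even if the percentages do not.
    return [probs[c] / total for c in classes]


def expand_replicated(features, label_vectors):
    """Replicate each instance once per class, weighted by its probability.

    Returns (feature_vector, class_index, weight) rows, which is the form a
    weighted hard-label classifier would expect.
    """
    rows = []
    for x, p in zip(features, label_vectors):
        for k, w in enumerate(p):
            if w > 0:
                rows.append((x, k, w))
    return rows


if __name__ == "__main__":
    labels = ["20 % rain, 80 % dry", "30% cloudy, 70% rain"]
    vectors = [parse_label(s) for s in labels]
    # Hypothetical feature vectors, e.g. scaled humidity and pressure.
    features = [[0.4, 0.9], [0.8, 0.3]]
    for row in expand_replicated(features, vectors):
        print(row)
```

The resulting rows can then be handed to any classifier that accepts per-sample weights, with the class index as the label and the probability as the weight.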

I suppose other, ad hoc procedures could be employed as well, for example ones that minimize the Kullback–Leibler divergence between the given and the predicted probabilities.
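As a rough illustration of the Kullback–Leibler idea, the sketch below computes the divergence between a given label distribution `p` and a model's predicted distribution `q`. Because KL(p‖q) equals the cross-entropy of p and q minus the entropy of p, and the entropy of p does not depend on the model, minimizing the KL divergence amounts to minimizing cross-entropy against the percentage labels. The function names and the `eps` clipping are assumptions for this example.

```python
import math


def entropy(p):
    """Shannon entropy H(p) in nats; terms with p_i = 0 contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q); q is clipped at eps to avoid log(0)."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), with the same clipping."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    given = [0.2, 0.8, 0.0]      # "20 % rain, 80 % dry" as (rain, dry, cloudy)
    predicted = [0.3, 0.6, 0.1]  # hypothetical model output
    print(kl_divergence(given, predicted))
    print(cross_entropy(given, predicted) - entropy(given))  # same value
```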




Answer 2:


I would recommend a neural network with three outputs, labelled Rain, Dry and Cloudy.

If an instance is labelled "20 % rain", its value for the Rain output will be 0.2. If an instance has no "rain" entry, the Rain output should be "false" (that is, 0). Another approach is to train three separate regression models using the same conversion convention. I think regression would work better.

A neural network is a good choice because it can do all three regression/classification tasks at once, so the outputs can influence each other. Additionally, the training algorithm is straightforward.
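One possible reading of this suggestion, sketched with the standard library only, is a network with three outputs trained directly on the percentage vectors. The version below has no hidden layer and uses a softmax output with a cross-entropy loss. Both choices are assumptions made for this example, since the answer does not specify an architecture or loss, and the toy features and targets are invented.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, b, x):
    """Return class probabilities for one feature vector x."""
    scores = [sum(w * xj for w, xj in zip(wk, x)) + bk for wk, bk in zip(W, b)]
    return softmax(scores)


def mean_cross_entropy(W, b, X, P):
    """Average cross-entropy between soft targets P and the predictions."""
    total = 0.0
    for x, p in zip(X, P):
        q = predict(W, b, x)
        total -= sum(pk * math.log(max(qk, 1e-12)) for pk, qk in zip(p, q) if pk > 0)
    return total / len(X)


def train(X, P, epochs=300, lr=0.3):
    """Plain stochastic gradient descent; weights start at zero."""
    n_features, n_classes = len(X[0]), len(P[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, p in zip(X, P):
            q = predict(W, b, x)
            for k in range(n_classes):
                g = q[k] - p[k]  # gradient of the loss w.r.t. score k
                b[k] -= lr * g
                for j in range(n_features):
                    W[k][j] -= lr * g * x[j]
    return W, b


if __name__ == "__main__":
    # Invented toy data: two scaled features per instance, and targets in
    # the order (rain, dry, cloudy).
    X = [[0.9, 0.2], [0.2, 0.7], [0.5, 0.9]]
    P = [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]
    W, b = train(X, P)
    for x in X:
        print([round(q, 2) for q in predict(W, b, x)])
```

Because the softmax outputs always sum to 1, the three predictions are coupled, which is one way the outputs can "influence each other". Three independent regressors would not enforce that constraint.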



Source: https://stackoverflow.com/questions/5055112/designing-classification-problem-of-weather-data
