Is it possible to add a covariate (control for a variable of no interest) to an SVM model?

Submitted by 不问归期 on 2021-02-08 05:48:24

Question


I'm very new to machine learning and Python, and I'm trying to build a model to predict patients (N=200) vs. controls (N=200) from structural neuroimaging data. After the initial preprocessing, where I reshaped the neuroimaging data into a 2D array, I built the following model:

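import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Linear-kernel SVM; C is tuned by the grid search below
svc = SVC(C=1.0, kernel='linear')

# Candidate values for C, in steps of 0.1 starting at 0.1
k_range = np.arange(0.1, 10, 0.1)
param_grid = dict(C=k_range)

# 10-fold cross-validated grid search, scored by accuracy
grid = GridSearchCV(svc, param_grid, cv=10, scoring='accuracy')
grid.fit(img, labels)  # img: 2D array (subjects x features); labels: patient/control
print(grid.cv_results_['mean_test_score'])  # mean CV accuracy per C value
print(grid.best_score_)
print(grid.best_params_)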

This gives me a decent result, but I'd like to control for the fact that different images were acquired with different scanners (e.g. subjects 1 through 150 were scanned with scanner 1, subjects 101 through 300 were scanned with scanner 2 and subjects 301 through 400 were scanned with scanner 3). Is there any way this could be added to the model above?

I read that performing feature selection beforehand might help. However, I don't want to simply extract meaningful features when those features might be related to the scanner. In fact, I want to classify patients and controls NOT based on the scanner (i.e. controlling for scanner).

Any thoughts on this would be appreciated, thank you.


Answer 1:


For diagnostics, you could look at how your data is distributed per scanner to see whether the direction you're pursuing is promising. Normalization (e.g., of the mean and variance per scanner) can be one option, as someone already suggested. Another option is adding 3 additional dimensions to your feature set as a one-hot encoding of the scanner used (i.e., for each example, you have a 1 in the position of the appropriate scanner and 0 in the others).
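Both options can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the helper names `standardize_per_scanner` and `one_hot_scanner`, the `scanner` label array, and the synthetic `img` data are made up here and are not part of the question's code.

```python
import numpy as np

def standardize_per_scanner(X, scanner):
    """Z-score every feature separately within each scanner group."""
    X = np.asarray(X, dtype=float)
    scanner = np.asarray(scanner)
    out = np.empty_like(X)
    for s in np.unique(scanner):
        rows = scanner == s
        mean = X[rows].mean(axis=0)
        std = X[rows].std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant features
        out[rows] = (X[rows] - mean) / std
    return out

def one_hot_scanner(scanner):
    """Return an (n_samples, n_scanners) 0/1 indicator matrix."""
    scanner = np.asarray(scanner)
    levels = np.unique(scanner)
    return (scanner[:, None] == levels[None, :]).astype(float)

# Synthetic stand-in data (assumption): 12 subjects, 6 features, 3 scanners
rng = np.random.default_rng(0)
scanner = np.repeat([1, 2, 3], [4, 5, 3])
img = rng.normal(size=(12, 6)) + scanner[:, None]  # scanner-dependent offset

# Option 1: remove per-scanner mean and variance differences
img_norm = standardize_per_scanner(img, scanner)

# Option 2: append three one-hot scanner columns to the features
img_aug = np.hstack([img, one_hot_scanner(scanner)])
```

Either `img_norm` or `img_aug` could then be passed to `grid.fit` in place of `img`. Note that computing the per-scanner statistics on all subjects before cross-validation lets the held-out folds influence the normalization; a stricter setup would estimate them on the training folds only.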




Answer 2:


To add it to your model, you can use your normalization parameter for each scanner as a feature and include it in your model.
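One possible reading of this suggestion, sketched below, is to compute a single normalization parameter per scanner (here, the mean intensity over all of that scanner's images, which is an assumption) and append it to every subject's feature vector. The function name `scanner_param_feature` and the toy data are hypothetical.

```python
import numpy as np

def scanner_param_feature(X, scanner):
    """Append each sample's scanner-level mean intensity as an extra column."""
    X = np.asarray(X, dtype=float)
    scanner = np.asarray(scanner)
    param = np.empty(len(X))
    for s in np.unique(scanner):
        rows = scanner == s
        param[rows] = X[rows].mean()  # one scalar per scanner
    return np.hstack([X, param[:, None]])

# Toy data (assumption): 3 subjects, 2 features, 2 scanners
X = np.array([[1.0, 3.0], [2.0, 4.0], [10.0, 20.0]])
scanner = np.array([1, 1, 2])
X_aug = scanner_param_feature(X, scanner)
# Last column holds 2.5 for both scanner-1 subjects and 15.0 for the scanner-2 subject
```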



Source: https://stackoverflow.com/questions/37277647/is-it-possible-to-add-a-covariate-control-for-a-variable-of-no-interest-to-an
