I would like to compute the recall, precision and f-measure of a cross-validation test for different classifiers.
The approach you describe is exactly the functionality of cross_val_score, and it fits your situation well. It seems like the right way to go.
cross_val_score takes the argument n_jobs=, making the evaluation parallelizable. If this is something you need, you should look into replacing your for loop with a parallel loop, using sklearn.externals.joblib.Parallel.
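To make that concrete, here is a minimal sketch, assuming the 0.14-era API this answer targets (sklearn.cross_validation and sklearn.externals.joblib; later releases moved these to sklearn.model_selection and the standalone joblib package) and placeholder data and classifiers that are not from your code:

from sklearn import datasets, svm
from sklearn.cross_validation import cross_val_score
from sklearn.externals.joblib import Parallel, delayed

# placeholder data: restrict iris to two classes so the 'f1' scorer is unambiguous
iris = datasets.load_iris()
mask = iris.target != 2
X, y = iris.data[mask], iris.target[mask]

classifiers = [svm.SVC(kernel='linear'), svm.SVC(kernel='rbf')]

# n_jobs= parallelizes the folds of a single evaluation...
scores = cross_val_score(classifiers[0], X, y, scoring='f1', cv=5, n_jobs=2)

# ...while joblib.Parallel can parallelize the loop over classifiers instead
all_scores = Parallel(n_jobs=2)(
    delayed(cross_val_score)(clf, X, y, scoring='f1', cv=5)
    for clf in classifiers)

Parallelize at one level only; nesting n_jobs inside the Parallel loop tends to oversubscribe your cores.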
On a more general note, there is an ongoing discussion about the problem of multiple scores in the scikit-learn issue tracker; a representative thread can be found here. So while it looks like future versions of scikit-learn will permit scorers with multiple outputs, as of now this is impossible.
A hacky (disclaimer!) way to get around this is to change the code in cross_validation.py ever so slightly, by removing the check on whether your score is a number. However, this suggestion is very version-dependent, so I will present it for version 0.14.
1) In IPython, type from sklearn import cross_validation, followed by cross_validation??. Note the filename that is displayed and open it in an editor (you may need root privileges).
2) In that file you will find the code below; the relevant line is 1066. It says
if not isinstance(score, numbers.Number):
    raise ValueError("scoring must return a number, got %s (%s)"
                     " instead." % (str(score), type(score)))
These lines need to be removed. To keep a record of what was there (in case you ever want to change it back), replace them with the following
if not isinstance(score, numbers.Number):
    pass
    # raise ValueError("scoring must return a number, got %s (%s)"
    #                  " instead." % (str(score), type(score)))
If what your scorer returns doesn't make cross_val_score choke elsewhere, this should resolve your issue. Please let me know whether it does.
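For instance, with the check removed, a scorer along the following lines could return precision, recall and f-measure for every fold at once. This is only a sketch under the same 0.14-era assumptions as above (the scorer name and data are placeholders, not from your code), and it relies on nothing else in cross_val_score insisting on a scalar:

import numpy as np
from sklearn import datasets, svm
from sklearn.cross_validation import cross_val_score
from sklearn.metrics import precision_recall_fscore_support

def prf_scorer(estimator, X, y):
    # a scorer receives the fitted estimator plus the held-out fold
    y_pred = estimator.predict(X)
    p, r, f, _ = precision_recall_fscore_support(y, y_pred, average='weighted')
    return np.array([p, r, f])

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear')

# with the patch applied, each fold yields a length-3 array: (precision, recall, f1)
fold_scores = cross_val_score(clf, iris.data, iris.target, scoring=prf_scorer, cv=5)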