Why does the classifier.predict() method expect the number of features in the test data to be the same as in the training data?
I am trying to build a simple SVM document classifier using scikit-learn, and I am using the following code:

    import os
    import numpy as np
    import scipy.sparse as sp
    from sklearn.metrics import accuracy_score
    from sklearn import svm
    from sklearn.metrics import classification_report
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_extraction.text import TfidfTransformer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn import cross_validation
    from sklearn.datasets import load_svmlight_file

    clf=svm.SVC()
    path="C:\\Python27"
    f1=[]
    f2=[]
    data2=[