I have a dataset for text classification ready to be used in MATLAB. Each document is a vector in this dataset and the dimensionality of this vector is extremely high. In th
You might consider using the independent features technique of Weiss and Kulikowski to quickly eliminate variables which are obviously unimformative:
http://matlabdatamining.blogspot.com/2006/12/feature-selection-phase-1-eliminate.html