I am trying to apply the SelectKBest algorithm to my data to extract the best features from it. To do this, I am first preprocessing my data with DictVectorizer, and the data
The problem is the call to toarray().
DictVectorizer from sklearn (which is designed for vectorizing categorical features, including those with high cardinality) outputs sparse matrices by default.
You are running out of memory because you force the dense representation by calling fit_transform().toarray().
Just use:
quote_data = DV.fit_transform(quote_data)
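SelectKBest accepts scipy sparse input directly, so the dense conversion is unnecessary. A minimal sketch of the full pipeline, using toy data and labels standing in for the original quote_data (the dict keys, values, and labels here are hypothetical):

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Toy stand-in for the original quote_data (hypothetical values)
quote_data = [
    {"city": "London", "temp": 12.0},
    {"city": "Paris", "temp": 18.0},
    {"city": "London", "temp": 15.0},
]
labels = [0, 1, 0]  # hypothetical target values

DV = DictVectorizer()               # sparse=True is the default
X = DV.fit_transform(quote_data)    # scipy.sparse matrix -- no toarray() needed

# SelectKBest works on sparse matrices; chi2 requires nonnegative features
selector = SelectKBest(chi2, k=2)
X_new = selector.fit_transform(X, labels)
print(X_new.shape)  # two best features kept
```

The selected matrix stays sparse end to end, so memory usage is proportional to the number of nonzero entries rather than n_samples * n_features.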