MemoryError in toarray when using DictVectorizer of Scikit Learn

無奈伤痛 2021-01-06 05:47

I am trying to implement the SelectKBest algorithm on my data to get the best features out of it. For this I am first preprocessing my data using DictVectorizer and the data

7 answers
  •  遥遥无期
    2021-01-06 06:39

    @Serendipity I also ran into the memory error when using fit_transform, and removing a column was not an option in my case. So I removed .toarray() and the code worked fine.

    I ran two tests on a smaller dataset, with and without .toarray(), and both produced an identical matrix.

    In short, removing .toarray() did the job!
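
    For anyone who wants to try this, here is a minimal sketch of the idea. The records and labels are made-up toy data, and chi2 is only an assumed score function, since the question does not say which one was used. DictVectorizer returns a SciPy sparse matrix by default, and SelectKBest with chi2 accepts it directly, so the dense copy made by .toarray() is never needed.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy records; the real data would be far larger.
records = [
    {"color": "red", "size": 1.0},
    {"color": "blue", "size": 2.0},
    {"color": "red", "size": 3.0},
    {"color": "blue", "size": 4.0},
]
labels = [0, 1, 0, 1]

# sparse=True is the default, so fit_transform returns a sparse matrix.
vec = DictVectorizer()
X_sparse = vec.fit_transform(records)

# Feed the sparse matrix straight into SelectKBest, without .toarray().
# chi2 is an assumed score function; it supports sparse input.
X_selected = SelectKBest(chi2, k=2).fit_transform(X_sparse, labels)

# On a small dataset, check that the dense path selects the same features.
X_selected_dense = SelectKBest(chi2, k=2).fit_transform(X_sparse.toarray(), labels)
same = np.allclose(X_selected.toarray(), X_selected_dense)
print(sparse.issparse(X_selected), same)
```

    The result stays sparse after selection, so calling .toarray() (if a dense array is needed at all) is much cheaper once only k columns are left.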
