How to make TF-IDF matrix dense?

后端未结


I am using TfidfVectorizer to convert a collection of raw documents to a matrix of TF-IDF features, which I then plan to input into a k-means algorithm (which I will impleme


1 Answer

滥情空心

2020-12-18 22:46
This should be as simple as:
```
dense = X.toarray()
```
TfidfVectorizer.fit_transform() returns a SciPy csr_matrix (Compressed Sparse Row matrix), which has a toarray() method for exactly this purpose. SciPy has several sparse matrix formats, and they all provide a .toarray() method.
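For a fuller picture, here is a minimal end-to-end sketch. The three documents are made up for illustration and the vectorizer uses its default settings, so treat this as an assumption about your setup rather than your actual pipeline:

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, purely illustrative
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)  # SciPy sparse matrix in CSR format

dense = X.toarray()  # regular 2-D NumPy array, one row per document

print(sparse.issparse(X), X.format)
print(type(dense).__name__, dense.shape)
```

The dense array has the same shape as X (documents by vocabulary terms), so it can be indexed row by row in a hand-written k-means loop.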

Note that a dense array stores every zero explicitly, so for a large matrix (many documents and a large vocabulary) it will use far more memory than the sparse version. It is generally a good idea to leave the matrix sparse for as long as possible.
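To get a rough feel for the difference, you can compare the two footprints before converting. This sketch uses a randomly generated sparse matrix as a hypothetical stand-in for a TF-IDF matrix (1,000 documents, 5,000 terms, about 1% non-zero entries); your real numbers will depend on your corpus:

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for a TF-IDF matrix, not real data
X = sparse.random(1000, 5000, density=0.01, format="csr", dtype=np.float64)

# A CSR matrix stores three arrays: the values, their column indices, and row pointers
sparse_bytes = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes

# Size of the array that toarray() would allocate: one value per cell
dense_bytes = X.shape[0] * X.shape[1] * X.dtype.itemsize

print(f"sparse: {sparse_bytes / 1e6:.1f} MB, dense: {dense_bytes / 1e6:.1f} MB")
```

Running the same arithmetic on your own X before calling toarray() tells you whether the dense version will fit in memory.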