ELKI Kmeans clustering Task failed error for high dimensional data

傲寒 2021-01-25 22:35

I have 60000 documents, which I processed in gensim to get a 60000*300 matrix. I exported this as a CSV file. When I import this in ELKI

2 answers

情深已故 (original poster)

2021-01-25 23:21
This sounds strange, but I found a solution to this issue by opening the exported CSV file and using Save As to save it again as a CSV file. The original file is 437 MB, while the re-saved file is 163 MB. I used the numpy function np.savetxt to save the doc2vec vectors, so it seems to be a Python issue rather than an ELKI issue.

Edit: The above solution is not useful. Instead, I exported the doc2vec output, which was created using the gensim library, and set the format of the values explicitly to %1.22e while exporting, i.e. the values are written in exponential notation with 22 digits after the decimal point. The export code is below.
```
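import numpy as np
# model is the trained gensim Doc2Vec model; doctag_syn0 holds its document vectors
textVect = model.docvecs.doctag_syn0
np.savetxt(r'D:\Backup\expo22.csv', textVect, delimiter=',', fmt='%1.22e')  # raw string keeps the backslashes literal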
```
The CSV file created this way loads and runs in ELKI without any issue.
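The thread never pins down what was wrong with the first export, so it may help to sanity-check a CSV before handing it to ELKI. Here is a minimal sketch using only the Python standard library: it checks that every row has the same number of columns and that every field parses as a finite float. The function name `check_numeric_csv` is hypothetical, and the assumption that ragged rows or non-numeric fields are what trips ELKI is mine, not something confirmed above.

```python
import csv
import math


def check_numeric_csv(path, delimiter=","):
    """Return (rows, columns) if every row holds the same number of finite floats.

    Hypothetical helper for illustration; raises ValueError naming the
    first offending line otherwise.
    """
    expected_cols = None
    rows = 0
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for line_no, record in enumerate(reader, start=1):
            # The first row fixes the column count all later rows must match.
            if expected_cols is None:
                expected_cols = len(record)
            elif len(record) != expected_cols:
                raise ValueError(
                    f"line {line_no}: {len(record)} columns, expected {expected_cols}"
                )
            for field in record:
                try:
                    value = float(field)
                except ValueError:
                    raise ValueError(
                        f"line {line_no}: not a number: {field!r}"
                    ) from None
                # Reject nan/inf, which float() parses without complaint.
                if not math.isfinite(value):
                    raise ValueError(f"line {line_no}: non-finite value {field!r}")
            rows += 1
    return rows, expected_cols or 0
```

For the matrix described in the question, a well-formed export should come back as `(60000, 300)`; any other result, or a `ValueError`, points at the line to inspect.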