How to reproduce the default sklearn CountVectorizer tokenization with only regex?

无人共我 2020-12-11 10:11

I don't want to use CountVectorizer, but I am trying to reproduce its way of tokenizing. I know it removes some special characters and lowercases the text. I tried this regex r
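For reference, here is a minimal sketch of what the default settings appear to do, assuming the documented defaults `lowercase=True` and `token_pattern=r"(?u)\b\w\w+\b"` and no stop words or accent stripping. The function name `tokenize` is hypothetical and is not part of scikit-learn.

```python
import re

# Documented default token_pattern of sklearn's CountVectorizer:
# a token is a run of two or more word characters; punctuation is
# never part of a token and only acts as a separator.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def tokenize(text):
    """Hypothetical helper: lowercase first, then extract tokens.

    Assumes the CountVectorizer defaults (lowercase=True, no stop
    words, no accent stripping).
    """
    return TOKEN_PATTERN.findall(text.lower())


print(tokenize("Hello, World! I'm a test-case."))
```

Note that single-character tokens such as "I", "m" and "a" are dropped, because the pattern requires at least two word characters. Characters like the apostrophe and hyphen are not removed by a separate step; they simply never match `\w`, so they split the text into tokens.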
