tm: read in data frame, keep text id's, construct DTM and join to other dataset


半阙折子戏 2020-12-29 11:48

I'm using the tm package.

Say I have a data frame with 2 columns and 500 rows. The first column is an ID, which is randomly generated and contains both letters and digits: "

5 answers

轻奢々 (original poster)

2020-12-29 12:30
qdap 1.2.0 can do both tasks with little code, though not as a one-liner ;-), and not necessarily faster than Ben's approach (key_merge is a convenience wrapper for merge). This uses all of Ben's data from above, which makes my answer look smaller even though it's not that much smaller.
```
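## The code
library(qdap)

# Build a corpus from the text column, using the IDs as document names
mycorpus <- with(df, as.Corpus(txt, ID))

# Word-frequency matrix without English stopwords, filtered to words of
# 3 to 10 characters, then converted to a DocumentTermMatrix
mydtm <- as.dtm(Filter(as.wfm(mycorpus, 
     col1 = "docs", col2 = "text", 
     stopwords = tm::stopwords("english")), 3, 10))

# Turn the DTM into a data frame with an "ID" column and join it to df2
key_merge(matrix2df(mydtm, "ID"), df2, "ID")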
```
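For comparison, here is a rough sketch of the same two steps (build a DTM that keeps the IDs, then join it to a second table) using only tm and base R. It is not the code from Ben's answer. Since the `df` and `df2` used in this thread aren't shown here, the small data frames below are made-up stand-ins, and the column names `txt` and `score` are assumptions. It also assumes tm 0.7-2 or later, where `DataframeSource` expects columns named `doc_id` and `text`.

```r
library(tm)

# Hypothetical stand-in data; replace with your own df and df2
df <- data.frame(
  ID  = c("a1", "b2", "c3"),
  txt = c("the quick brown fox", "lazy dogs sleep", "quick dogs run"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID    = c("a1", "b2", "c3"),
  score = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# DataframeSource reads doc_id and text columns, so rename on the way in;
# the doc_id values become the document names in the corpus
src <- DataframeSource(
  data.frame(doc_id = df$ID, text = df$txt, stringsAsFactors = FALSE)
)
mycorpus <- VCorpus(src)

# Drop English stopwords and keep terms of 3 to 10 characters
mydtm <- DocumentTermMatrix(
  mycorpus,
  control = list(stopwords = TRUE, wordLengths = c(3, 10))
)

# Row names of the DTM are the document IDs; expose them as a column
m <- as.matrix(mydtm)
dtm_df <- data.frame(ID = rownames(m), m,
                     check.names = FALSE, stringsAsFactors = FALSE)

# Plain base-R join on the shared ID column
merged <- merge(dtm_df, df2, by = "ID")
```

Note that `as.matrix()` on a large DTM produces a dense matrix, so for many documents or a big vocabulary this may use a lot of memory.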