How to select only a subset of corpus terms for TermDocumentMatrix creation in tm


无人及你 2021-01-22 09:21

I have a huge corpus, and I'm interested only in the appearance of a handful of terms that I know up front. Is there a way to create a term document matrix from the corpus using th

2 answers

甜味超标 (original poster)

2021-01-22 09:58
Another way to filter a corpus: first store a value in each document's metadata, say under language. Loop over the elements of the corpus with an index i, check whatever condition you need, and tag the matching documents. Then filter the corpus on that metadata attribute.
```
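library(tm)

# Inside the loop over i: tag each document you want to keep.
meta(corpusz[[i]], "language") <- "tur"

# Keep only the documents whose "language" attribute matches the tag.
idx <- meta(corpusz, "language") == "tur"
filtered <- corpusz[idx]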
```
Now filtered contains only the corpus elements we want.
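Note that this filters documents, not terms. For the question as asked, one option that may help is the dictionary entry of the control list accepted by TermDocumentMatrix, which restricts the matrix to a given vector of terms. Below is a minimal sketch that combines both steps. The toy texts, the choice of which documents to tag, and the names corpus and terms are assumptions made up for illustration, not part of the original answer.

```r
library(tm)

# Toy corpus; the texts are invented for illustration only.
texts <- c("apple banana cherry",
           "banana banana date",
           "cherry date elderberry")
corpus <- VCorpus(VectorSource(texts))

# Tag the documents to keep. Picking the first two is an arbitrary,
# hypothetical stand-in for whatever check you actually need.
for (i in 1:2) {
  meta(corpus[[i]], "language") <- "tur"
}

# Filter on the per-document ("local") metadata attribute.
idx <- unlist(meta(corpus, "language", type = "local")) == "tur"
filtered <- corpus[idx]

# Count only the terms known up front by passing them as the dictionary.
terms <- c("banana", "cherry")
tdm <- TermDocumentMatrix(filtered, control = list(dictionary = terms))
inspect(tdm)
```

If only the term restriction is needed, the dictionary option can be applied to the full corpus directly and the metadata filtering skipped.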