Question:

The tm package extends c() so that, given a set of PlainTextDocuments, it automatically creates a Corpus. Unfortunately, it appears that each PlainTextDocument must be passed as a separate argument.

e.g. if I had:

foolist

I'd do this to get a Corpus:

foocorpus

I have a list of lists of PlainTextDocuments that looks like this:

> str(sectioned)
List of 154
 $ :List of 6
  ..$ :Classes 'PlainTextDocument', 'TextDocument', 'character'  atomic [1:1] Developing assessment models   Developing models
  .. .. ..- attr(*, "Author")= chr "John Smith"
  .. .. ..- attr(*, "DateTimeStamp")= POSIXlt[1:1], format: "2013-04-30 12:03:49"
  .. .. ..- attr(*, "Description")= chr(0)
  .. .. ..- attr(*, "Heading")= chr "Research Focus"
  .. .. ..- attr(*, "ID")= chr(0)
  .. .. ..- attr(*, "Language")= chr(0)
  .. .. ..- attr(*, "LocalMetaData")=List of 4
  .. .. .. ..$ foo           : chr "bar"
  .. .. .. ..$ classification: chr "Technician"
  .. .. .. ..$ team          : chr ""
  .. .. .. ..$ supervisor    : chr "Bill Jones"
  .. .. ..- attr(*, "Origin")= chr "Smith-John_e.txt"
#etc., all sublists have 6 elements

So, to get all my PlainTextDocuments into a Corpus, this would work:

sectioned.Corpus

Can anyone suggest an easier way, please?

ETA: foo produces a flat list of PlainTextDocuments, which still leaves me with the problem of feeding a list element by element to c

Answer 1:

I expect that unlist(foolist) will help you. Its recursive argument is TRUE by default, which flattens every level of nesting. unlist(foolist, recursive=FALSE) removes only the outermost level and returns a flat list of the documents, which you can then combine with

do.call(c, unlist(foolist, recursive=FALSE))

do.call simply calls the function c with the elements of the resulting list as its arguments, one argument per element.
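
To make the two steps concrete, here is a minimal sketch in base R. It assumes plain character strings as stand-ins for PlainTextDocuments so that it runs without tm installed; the names flat and combined are made up for the illustration.

```r
# Stand-in data: character strings take the place of PlainTextDocuments.
foolist <- list(
  list("doc a1", "doc a2"),
  list("doc b1", "doc b2", "doc b3")
)

# Strip exactly one level of nesting: 2 sublists become one flat list of 5 elements.
flat <- unlist(foolist, recursive = FALSE)

# do.call(c, flat) is the same call as c(flat[[1]], flat[[2]], ..., flat[[5]]).
combined <- do.call(c, flat)
```

With these stand-ins, combined is an ordinary character vector. With real documents and tm loaded, the same do.call(c, ...) call should reach the c method that tm provides, which is what the question relies on to build the Corpus.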

Answer 2:

Here's a more general solution for when lists are nested multiple times and the amount of nesting differs between elements of the lists:

 flattenlist
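
The body of the function above did not survive on this page, so the following is only a sketch of one way to flatten unevenly nested lists, not necessarily the answerer's code. The name flatten_nested is hypothetical, and the sketch assumes that any element carrying a class attribute (such as a document object) should be kept whole as a leaf.

```r
# Hypothetical helper: flatten a list whose elements are nested to differing depths.
flatten_nested <- function(x) {
  # Only plain lists are descended into; classed objects are treated as leaves.
  is_plain_list <- vapply(x, function(el) is.list(el) && !is.object(el), logical(1))
  if (!any(is_plain_list)) {
    return(x)
  }
  # Wrap every leaf in a list so that stripping one level leaves it unchanged.
  wrapped <- lapply(seq_along(x), function(i) {
    if (is_plain_list[i]) x[[i]] else list(x[[i]])
  })
  # Strip one level, then repeat until no plain lists remain.
  flatten_nested(unlist(wrapped, recursive = FALSE))
}

# Elements sit at depths 1, 2, 3 and 4.
nested <- list("a", list("b", list("c", "d")), list(list(list("e"))))
flat <- flatten_nested(nested)
```

The result keeps the original left-to-right order of the leaves, so it can be passed to do.call(c, ...) in the same way as the output of unlist(foolist, recursive=FALSE).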

Source: how to flatten a list of lists in R
