Converting a Document Term Matrix into a Matrix with lots of data causes overflow

孤独总比滥情好 2020-12-29 09:12

Let's do some text mining.

Here I have a document term matrix (from the tm package):

dtm <- TermDocumentMatrix(
     myCorpus,
          


        
3 Answers
  •  青春惊慌失措
    2020-12-29 09:29

    Based on Joris Meys's answer, I found the solution. The "vector()" documentation says this about the "length" argument:

    ... For a long vector, i.e., length > .Machine$integer.max, it has to be of type "double"...
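To see why the type of the length matters, here is a minimal sketch of the overflow itself. The dimensions (50000 by 50000) are hypothetical values chosen only because their product exceeds .Machine$integer.max; they are not taken from the question.

```r
# Hypothetical dimensions, chosen so that nr * nc exceeds .Machine$integer.max.
nr <- 50000L
nc <- 50000L

# Integer * integer stays integer in R; on overflow the result is NA
# (R also emits a warning, suppressed here to keep the output clean).
int_product <- suppressWarnings(nr * nc)

# Multiplying by the double 1 first promotes the whole product to double.
dbl_product <- 1 * nr * nc

print(is.na(int_product))                  # did the integer product overflow?
print(typeof(dbl_product))                 # storage type of the promoted product
print(dbl_product > .Machine$integer.max)  # is it beyond the integer range?
```

A length computed this way can be passed to vector(), which accepts a double for long vectors as the documentation quoted above describes.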

    So we can make a small fix to as.matrix(), forcing the length to be computed as a double:

    as.big.matrix <- function(x) {
      nr <- x$nrow
      nc <- x$ncol
      # nr and nc are integers and 1 is a double, so 1 * nr * nc is
      # evaluated as a double and cannot overflow the integer range
      y <- matrix(vector(typeof(x$v), 1 * nr * nc), nr, nc)
      y[cbind(x$i, x$j)] <- x$v
      dimnames(y) <- x$dimnames
      y
    }
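As a quick check of the function above, the sketch below runs it on a tiny input. The object toy is a hypothetical stand-in: a plain list carrying the same fields the function reads (i, j, v, nrow, ncol, dimnames), not a real tm or slam object, and its terms and counts are made up. The function is repeated so the example runs on its own.

```r
# Copy of the answer's function, repeated so this sketch is self-contained.
as.big.matrix <- function(x) {
  nr <- x$nrow
  nc <- x$ncol
  y <- matrix(vector(typeof(x$v), 1 * nr * nc), nr, nc)
  y[cbind(x$i, x$j)] <- x$v
  dimnames(y) <- x$dimnames
  y
}

# Hypothetical triplet representation: entry v[k] sits at row i[k], column j[k].
toy <- list(
  i = c(1L, 2L, 3L),
  j = c(1L, 2L, 1L),
  v = c(2, 1, 4),
  nrow = 3L,
  ncol = 2L,
  dimnames = list(Terms = c("apple", "pear", "plum"), Docs = c("d1", "d2"))
)

dense <- as.big.matrix(toy)
print(dense)
```

Cells not listed in the triplets stay at zero. The fix only avoids the integer overflow when computing the length; the dense result still needs memory for all nrow * ncol cells.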
    
