Importing a PDF into R with the “tm” package

Submitted by 柔情痞子 on 2019-12-06 14:57:41

Question


I have a working example that reads a PDF into the R workspace with the "tm" package, but I don't understand how the code works, so I can't adapt it to import the PDF I actually want. The PDF imported in the following code is the "tm" vignette.

The code is:

if(file.exists(Sys.which("pdftotext"))) {
    pdf <- readPDF(PdftotextOptions = "-layout")(elem = list(uri = vignette("tm")$pdf),
                                                 language = "en",
                                                 id = "id1")
    pdf[1:13]
}

Here "tm" refers to the package vignette, whereas the PDF I am trying to import is a different document, named minn. How should I change the code above to bring my own PDF into the workspace?

I tried something like:

if(file.exists(Sys.which("pdftotext"))) {
    pdf <- readPDF(PdftotextOptions = "-layout")(elem = list(uri = vignette("minn")$pdf),
                                                 language = "en",
                                                 id = "id1")
    pdf[1:13]
}

Answer 1:


It seems the problem was with the PDF I was trying to read. The code itself goes as shown below, with the PDF saved locally as AAWE_WP16.pdf. Thanks to Thomas for the lead. The link for the PDF is "http://www.wine-economics.org/workingpapers/AAWE_WP16.pdf".

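# Build a reader that passes the -layout option to pdftotext
tt <- readPDF(PdftotextOptions = "-layout")
# Apply the reader to the local file; uri is a plain file path, not a vignette
rr <- tt(elem = list(uri = "AAWE_WP16.pdf"), language = "en", id = "id1")
# Show the first 15 elements of the extracted text
rr[1:15]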
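
The difference from the attempt in the question is that `uri` here is an ordinary file path, whereas `vignette()` only looks up documentation shipped with an installed package. Below is a minimal sketch of the whole workflow, assuming the PDF first has to be downloaded into the working directory. The helper names `fetch_pdf`, `pdf_elem` and `read_local_pdf` are hypothetical (they are not part of tm). The `readPDF(PdftotextOptions = ...)` call is the older tm interface used above; the `engine`/`control` branch is an assumption about how newer tm releases expect the same option to be passed.

```r
# Hypothetical helper: download the PDF once, in binary mode so the file
# is not corrupted on platforms that translate line endings.
fetch_pdf <- function(url, dest = basename(url)) {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Hypothetical helper: build the `elem` argument for a tm reader,
# failing early with a clear message if the path is wrong.
pdf_elem <- function(path) {
  if (!file.exists(path)) {
    stop("PDF not found: ", path, " (working directory: ", getwd(), ")")
  }
  list(uri = path)
}

# Hypothetical helper: read a local PDF with tm and pdftotext.
read_local_pdf <- function(path, language = "en", id = "id1") {
  stopifnot(requireNamespace("tm", quietly = TRUE),
            nzchar(Sys.which("pdftotext")))
  reader <- if ("PdftotextOptions" %in% names(formals(tm::readPDF))) {
    # Older tm interface, as used in the answer above
    tm::readPDF(PdftotextOptions = "-layout")
  } else {
    # Assumed newer tm interface: engine plus a control list
    tm::readPDF(engine = "xpdf", control = list(text = "-layout"))
  }
  reader(elem = pdf_elem(path), language = language, id = id)
}

# Usage (not run here, since it needs network access, tm and pdftotext):
# pdf_path <- fetch_pdf("http://www.wine-economics.org/workingpapers/AAWE_WP16.pdf")
# doc <- read_local_pdf(pdf_path)
```

Checking the path before calling the reader separates a "file not found" problem from a problem with the PDF itself, which is what turned out to matter here.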


Source: https://stackoverflow.com/questions/17415239/importing-pdf-in-r-through-package-tm
