How to convert docx to PDF in R?

旧时模样 posted on 2019-11-28 06:09:27

Question


I want to ask whether it is possible to convert text files, such as a Word document or a plain-text document, to PDF using R. I thought of converting the file to .Rmd and then to PDF using this code:

require(rmarkdown)
my_text <- readLines("C:/.../track.txt")
cat(my_text, sep="  \n", file = "my_text.Rmd")
render("my_text.Rmd", pdf_document())

But it doesn't work and shows this error:

Error: Failed to compile my_text.tex. In addition: Warning message: running command '"pdflatex" -halt-on-error -interaction=batchmode "my_text.tex"' had status 127

Is there any other solution?


Answer 1:


.txt to .pdf

Install wkhtmltopdf and then from R run the following. Change the first three lines as appropriate depending on where wkhtmltopdf is on your system and depending on the input and output file paths and names.

wkhtmltopdf <- "C:\\Program Files\\wkhtmltopdf\\bin\\wkhtmltopdf.exe"
input <- "in.txt"
output <- "out.pdf"
cmd <- sprintf('"%s" "%s" -o "%s"', wkhtmltopdf, input, output)
shell(cmd)  # shell() exists only on Windows; use system(cmd) elsewhere

.docx to .pdf

Install pandoc, modify the first three lines below as needed and run. How well this works may vary depending on your input.

pandoc <- "C:\\Program Files (x86)\\Pandoc\\pandoc.exe"
input <- "in.docx"
output <- "out.pdf"
cmd <- sprintf('"%s" "%s" -o "%s"', pandoc, input, output)
shell(cmd)  # shell() exists only on Windows; use system(cmd) elsewhere
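
A note on the error in the question: R's system() reports status 127 when the command could not be run at all, so that warning most likely means pdflatex is not installed or not on the PATH. pandoc also uses pdflatex by default when writing PDF, so the same requirement applies to the pandoc route. Below is a minimal sketch of a more portable call, assuming pandoc is on the PATH. The names pandoc_args() and docx_to_pdf_pandoc() are hypothetical helpers made up for this illustration, not part of any package.

```r
# Hypothetical helper: build the argument vector for "pandoc <input> -o <output>".
# shQuote() protects paths that contain spaces.
pandoc_args <- function(input, output) {
  c(shQuote(input), "-o", shQuote(output))
}

# Hypothetical helper: run pandoc and fail loudly if something is missing.
# Assumption: pandoc is on the PATH (otherwise pass its full path).
docx_to_pdf_pandoc <- function(input, output, pandoc = Sys.which("pandoc")) {
  if (!nzchar(pandoc)) {
    stop("pandoc not found; install it or pass its full path")
  }
  # pandoc's default PDF engine is pdflatex, so warn early if it is missing.
  if (!nzchar(Sys.which("pdflatex"))) {
    warning("pdflatex not found on PATH; PDF output will probably fail")
  }
  status <- system2(pandoc, pandoc_args(input, output))
  if (status != 0) {
    stop("pandoc exited with status ", status)
  }
  invisible(output)
}
```

Usage would be docx_to_pdf_pandoc("in.docx", "out.pdf"). Unlike shell(), system2() is available on every platform, so the same call should work on Windows, macOS and Linux.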



Answer 2:


I absolutely have not been able to make the Pandoc method work for me.

I did figure out a way to convert docx to PDF using RDCOMClient, however.

library(RDCOMClient)

file <- "C:/path/to your/doc.docx"

wordApp <- COMCreate("Word.Application")      # create the COM object
wordApp[["Visible"]] <- TRUE                  # show the Word application window
wordApp[["Documents"]]$Open(Filename = file)  # open your .docx in wordApp

# This is the key step: FileFormat = 17 saves the document as PDF
wordApp[["ActiveDocument"]]$SaveAs("C:/path/to your/new.pdf",
                                   FileFormat = 17)

wordApp$Quit()  # quit wordApp

I found the FileFormat=17 value here: https://docs.microsoft.com/en-us/office/vba/api/word.wdexportformat

Hopefully this helps!




Answer 3:


.docx to .pdf with libreoffice

As suggested here by JeanVuda, you can also convert .docx to .pdf with LibreOffice, assuming it is installed on your machine.

The following code converts a .docx file to .pdf using LibreOffice:

# Path to the .docx file you want to convert
docfile <- "X:/path_to_your_docx/yourdocxfile.docx"

# Path to the LibreOffice executable on your machine; this call converts the .docx to .pdf
system(paste("X:/path_to_libreoffice/program/soffice.exe --headless --convert-to pdf", docfile), intern = TRUE)
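
The command above breaks if either path contains a space, and without an output directory LibreOffice writes the PDF to the current working directory. Below is a minimal sketch that quotes the paths and passes LibreOffice's --outdir option, assuming soffice is on the PATH or its full path is supplied. The names soffice_args() and docx_to_pdf_soffice() are hypothetical helpers made up for this illustration.

```r
# Hypothetical helper: argument vector for a headless LibreOffice conversion.
# By default the PDF is written next to the input file.
soffice_args <- function(docfile, outdir = dirname(docfile)) {
  c("--headless", "--convert-to", "pdf",
    "--outdir", shQuote(outdir), shQuote(docfile))
}

# Hypothetical helper: run the conversion and return the expected PDF path.
# Assumption: soffice is on the PATH; otherwise pass the full path to
# soffice (soffice.exe on Windows) as the `soffice` argument.
docx_to_pdf_soffice <- function(docfile, soffice = Sys.which("soffice"),
                                outdir = dirname(docfile)) {
  if (!nzchar(soffice)) {
    stop("soffice not found; pass the full path to the LibreOffice executable")
  }
  status <- system2(soffice, soffice_args(docfile, outdir))
  if (status != 0) {
    stop("soffice exited with status ", status)
  }
  # LibreOffice names the output after the input file, with a .pdf extension.
  file.path(outdir, sub("\\.docx$", ".pdf", basename(docfile), ignore.case = TRUE))
}
```

Usage would be docx_to_pdf_soffice("X:/path_to_your_docx/yourdocxfile.docx", soffice = "X:/path_to_libreoffice/program/soffice.exe"), with both paths adjusted to your machine.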

Feedback on LibreOffice:

  1. Where my pandoc version fails to convert a .docx to .pdf and RDCOMClient is not available for my version of R, LibreOffice provides a fast and direct way to convert a Word document to multiple formats.

  2. Please note that in the .pdf conversion the tables don't render correctly (though they are printed in landscape mode). The most direct workaround I could find is to turn my tables into images with kableExtra::as_image() while knitting the Word document, which may not be appropriate for what you need.

  3. There are previous questions about converting to other formats from the command line here, and I guess the original answer that introduced this method to useRs is that one, in the ReporteR discussion.

Best regards



Source: https://stackoverflow.com/questions/49113503/how-to-convert-docx-to-pdf-in-r
