Question
Overall, I want to extract all the text from R documentation, put it into data frames, and apply text-mining techniques.
- PACKAGE LEVEL: Suppose I'm interested in a package, for instance "utils", and I want to get all of its text data in a vector. This works:
package_d <- packageDescription("utils")
package_d$Description
But this does not:
package_d$Details
- FUNCTION LEVEL: The same problem applies to functions. I tried this without success:
function_d <- ?utils::adist
function_d$Description
- SUB-LEVELS: I would like to extract all the details, argument descriptions, and return values for the functions of a particular package.
Thank you very much for your help!
Answer 1:
I couldn't find a built-in function for this, but after looking at the source of the functions that do most of the work, here is a function that extracts the text from a help page.
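help_text <- function(...) {
  # help() returns an object holding the path of the matching help file,
  # e.g. <library>/utils/help/adist
  file <- help(...)
  # Walk up the path to recover the help directory and the package name
  path <- dirname(file)
  dirpath <- dirname(path)
  pkgname <- basename(dirpath)
  # The package's Rd database sits in the help directory, named after the package
  RdDB <- file.path(path, pkgname)
  # fetchRdDB is internal to tools (hence :::) and may change between R versions
  rd <- tools:::fetchRdDB(RdDB, basename(file))
  # Render the Rd object as plain text and capture the printed lines
  capture.output(tools::Rd2txt(rd, out = "", options = list(underline_titles = FALSE)))
}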
You can use it with the package help pages and function help pages.
h1 <- help_text(utils)
h2 <- help_text(adist)
You'll get a character vector with one element per line of the help page. You can print it with
cat(h1, sep="\n")
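For the sub-level part of the question (descriptions, details, and values for every function in a package), one possible approach is to read the parsed Rd objects with the exported tools::Rd_db() and pick out sections by their "Rd_tag" attribute. The following is only a rough sketch: rd_section and package_help_df are hypothetical helper names, not part of any package, and flattening a section with unlist() discards the Rd markup, so the resulting text is approximate.

```r
# Hypothetical helper: return the text of one top-level Rd section
# (e.g. "\\description"), or NA if the help page has no such section.
rd_section <- function(rd, tag) {
  parts <- unclass(rd)
  tags <- vapply(parts, function(x) {
    t <- attr(x, "Rd_tag")
    if (is.null(t)) "" else t
  }, character(1))
  hit <- parts[tags == tag]
  if (length(hit) == 0) return(NA_character_)
  # Flatten the nested pieces into one string and collapse whitespace
  txt <- paste(unlist(hit), collapse = "")
  trimws(gsub("\\s+", " ", txt))
}

# Hypothetical helper: one row per help page of an installed package
package_help_df <- function(pkg) {
  db <- tools::Rd_db(pkg)
  get <- function(tag) {
    vapply(db, rd_section, character(1), tag = tag, USE.NAMES = FALSE)
  }
  data.frame(
    topic       = sub("\\.Rd$", "", names(db)),
    description = get("\\description"),
    details     = get("\\details"),
    value       = get("\\value"),
    arguments   = get("\\arguments"),
    stringsAsFactors = FALSE
  )
}

docs <- package_help_df("utils")
head(docs[, c("topic", "description")])
```

Pages that lack a section (many have no Details, for example) get NA in that column, so the data frame can be filtered before text mining.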
Source: https://stackoverflow.com/questions/51330090/how-to-get-text-data-from-help-pages-in-r