How to get text data from help pages in R?

旧巷老猫 submitted on 2019-12-25 03:28:01

Question


Overall, I want to extract all the text from R documentation, store it in data frames, and apply text-mining techniques.

  1. PACKAGE LEVEL: Suppose I'm interested in a package, for instance "utils", and I want to get all of its text data in a vector. This works:

package_d <- packageDescription("utils")
package_d$Description

But this does not: package_d$Details

  2. FUNCTION LEVEL: Same problem, but for functions. I tried this without success:

    function_d <- ?utils::adist
    function_d$Description

  3. SUB-LEVELS: I would like to extract all the details, argument descriptions, and return values of the functions in a particular package...

Thank you very much for your help!


Answer 1:


I couldn't find a built-in function for this, but after looking at the source of the functions that do most of the work, here is a function that extracts the text from a help page.

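help_text <- function(...) {
  # help() returns the path of the help file inside the installed package
  file <- help(...)
  path <- dirname(file)         # the package's help directory
  dirpath <- dirname(path)      # the package's installation directory
  pkgname <- basename(dirpath)  # the package name
  # The help database sits in the help directory, named after the package
  RdDB <- file.path(path, pkgname)
  # fetchRdDB is unexported (note the :::), so it may change between R versions
  rd <- tools:::fetchRdDB(RdDB, basename(file))
  # Render the Rd object as plain text and capture the printed lines
  capture.output(tools::Rd2txt(rd, out="", options=list(underline_titles=FALSE)))
}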

You can use it with the package help pages and function help pages.

h1 <- help_text(utils)
h2 <- help_text(adist)

You'll get a character vector with one element per line of the help page. You can print it with

cat(h1, sep="\n")
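
For the sub-level goal in the question (one row per help topic, ready for text mining), a possible sketch is shown here. It relies on tools::Rd_db(), an exported function that returns the parsed Rd objects of an installed package, instead of the internal fetchRdDB. The helper names rd_section_text and package_help_df and the column names are hypothetical, and the section text is flattened crudely, so treat it as approximate.

```r
# Hypothetical helper: raw text of one top-level Rd section, e.g. "\\description".
# unlist() flattens nested markup, so the result is only an approximation.
rd_section_text <- function(rd, tag) {
  parts <- unclass(rd)
  tags <- vapply(parts, function(x) {
    t <- attr(x, "Rd_tag")
    if (is.null(t)) "" else t
  }, character(1))
  hit <- which(tags == tag)
  if (length(hit) == 0) return(NA_character_)
  trimws(paste(unlist(parts[hit]), collapse = ""))
}

# Hypothetical helper: one row per Rd file of an installed package.
package_help_df <- function(pkg) {
  db <- tools::Rd_db(pkg)  # named list of parsed Rd objects
  full_text <- vapply(db, function(rd) {
    lines <- capture.output(
      tools::Rd2txt(rd, out = "", options = list(underline_titles = FALSE))
    )
    paste(lines, collapse = "\n")
  }, character(1))
  data.frame(
    rd_file     = names(db),
    description = unname(vapply(db, rd_section_text, character(1),
                                tag = "\\description")),
    text        = unname(full_text),
    stringsAsFactors = FALSE
  )
}

utils_help <- package_help_df("utils")
str(utils_help)
```

The same rd_section_text call with "\\arguments", "\\value", or "\\details" should give the other sections, assuming the help page defines them; pages without that section return NA.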


Source: https://stackoverflow.com/questions/51330090/how-to-get-text-data-from-help-pages-in-r
