List and description of all packages in CRAN from within R

隐瞒了意图╮ 2020-12-18 00:05

I can get a list of all the available packages with the function:

ap <- available.packages()

But how can I also get a description of the

3 answers
  •  挽巷 (original poster)
    2020-12-18 01:03

    I wanted to try doing this with an HTML scraper (rvest) as an exercise, since the available.packages() call in the question doesn't return the package descriptions.

    library('rvest')
    url <- 'https://cloud.r-project.org/web/packages/available_packages_by_name.html'
    webpage <- read_html(url)
    data_html <- html_nodes(webpage,'tr td')
    length(data_html)
    
    P1 <- html_nodes(webpage,'td:nth-child(1)') %>% html_text(trim=TRUE)  # Column 1: the package name
    P2 <- html_nodes(webpage,'td:nth-child(2)') %>% html_text(trim=TRUE)  # Column 2: the short description (package title)
    P1 <- P1[nzchar(P1)]  # Remove empty ("") items
    length(P1); length(P2);
    
    mdf <- data.frame(P1, P2, row.names=NULL)
    colnames(mdf) <- c("PackageName", "Description")
    
    # This is the problem! It lists large sets column-by-column,
    # instead of row-by-row. Try with the full list to see what happens.
    print(mdf, right=FALSE, row.names=FALSE)
    
    # PackageName Description                                                             
    # A3          Accurate, Adaptable, and Accessible Error Metrics for Predictive\nModels
    # abbyyR      Access to Abbyy Optical Character Recognition (OCR) API                 
    # abc         Tools for Approximate Bayesian Computation (ABC)                        
    # abc.data    Data Only: Tools for Approximate Bayesian Computation (ABC)             
    # ABC.RAP     Array Based CpG Region Analysis Pipeline                                
    # ABCanalysis Computed ABC Analysis
    
    # For small sets we can use either:
    # mdf[1:6,] #or# head(mdf, 6)
    

    However, although this works quite well for a small subset of the data frame, I ran into a display problem with the full list: the data is shown either column-by-column or unaligned. It would have been great to have it paged and properly formatted in a new window somehow. I tried using page(), but I couldn't get it to work very well.
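
    One possible workaround for the display problem, sketched here under assumptions: build one fixed-width text line per package and print those lines instead of the data frame. The helper name format_pkg_lines() and the two-row demo data frame are hypothetical and only for illustration; the column names match the mdf built above.

    ```r
    # Hypothetical helper (not from the original answer): returns one aligned,
    # single-line string per row, so the output reads row-by-row at any size.
    format_pkg_lines <- function(df, width = 100) {
      pkg  <- as.character(df$PackageName)
      # Collapse embedded newlines and repeated whitespace in the descriptions
      desc <- gsub("\\s+", " ", trimws(as.character(df$Description)))
      # Left-justify the package names in a column as wide as the longest name
      name_w <- max(nchar(pkg))
      name   <- formatC(pkg, width = name_w, flag = "-")
      # Truncate descriptions so that name + two spaces + description fits `width`
      room <- max(width - name_w - 2, 10)
      long <- nchar(desc) > room
      desc[long] <- paste0(substr(desc[long], 1, room - 3), "...")
      paste(name, desc, sep = "  ")
    }

    # Small demo data reusing two rows from the listing above
    demo <- data.frame(
      PackageName = c("A3", "abc.data"),
      Description = c(
        "Accurate, Adaptable, and Accessible Error Metrics for Predictive\nModels",
        "Data Only: Tools for Approximate Bayesian Computation (ABC)"
      ),
      stringsAsFactors = FALSE
    )
    rows_txt <- format_pkg_lines(demo, width = 60)
    writeLines(rows_txt)
    ```

    For the full table, the same lines could be written to a temporary file with writeLines() and opened with file.show(), which displays the file in R's pager.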


    EDIT: The recommended method is not the above, but rather using Dirk's suggestion (from the comments below):

    db <- tools::CRAN_package_db()
    colnames(db)
    # Select the columns by name; their positions in db can change between R versions
    mdf <- db[, c("Package", "Description")]
    print(mdf, right=FALSE, row.names=FALSE)
    

    However, this still suffers from the display problem mentioned...
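
    Because the Description field can be long, wrapping it may work better than printing the data frame directly. The sketch below uses base R's formatDL(), which formats name/description pairs. The demo_db data frame is a stand-in for the result of tools::CRAN_package_db() so that the example runs offline; its two rows and the width and indent values are assumptions chosen for illustration.

    ```r
    # Stand-in for tools::CRAN_package_db() (assumption, so the sketch runs
    # offline); the rows reuse package names and text from the listing above.
    demo_db <- data.frame(
      Package = c("A3", "abc"),
      Description = c(
        "Accurate, Adaptable, and Accessible Error Metrics for Predictive\nModels",
        "Tools for Approximate Bayesian Computation (ABC)"
      ),
      stringsAsFactors = FALSE
    )

    # formatDL() pads each name to `indent` columns and wraps the text to fit
    # `width`, indenting continuation lines so each package stays one entry.
    entries <- formatDL(demo_db$Package, demo_db$Description,
                        style = "table", width = 60, indent = 14)
    writeLines(entries)
    ```

    Note that formatDL() requires indent to be at most half of width, so very long package names are placed on a line of their own in the "table" style.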
