How to download a large binary file with RCurl *after* server authentication

Submitted by 北战南征 on 2019-11-29 06:44:00
Ast Derek
  1. Create a file named curl_writer.c containing the following code (adapted from the linked original) and save it to C:\<folder where you save your R files>

    #include <stdio.h>
    
    /**
     * Original code just sent some message to stderr
     */
    size_t writer(void *buffer, size_t size, size_t nmemb, void *stream) {
        fwrite(buffer,size,nmemb,(FILE *)stream);
        return size * nmemb;
    }
    
  2. Open a command window, go to the folder where you saved curl_writer.c, and compile it into a DLL with R CMD SHLIB

    c:> cd "C:\<folder where you save your R files>"
    c:> R CMD SHLIB -o curl_writer.dll curl_writer.c
    
  3. Open R and run your script

    C:> R
    
    your.email <- "email@address.com"
    your.password <- "password"
    extract.path <- "https://usa.ipums.org/usa-action/downloads/extract_files/some_file.csv.gz"
    
    library(RCurl)
    
    curl = getCurlHandle()
    
    curlSetOpt(
        cookiejar = 'cookies.txt', 
        followlocation = TRUE, 
        autoreferer = TRUE, 
        ssl.verifypeer = FALSE,  # skips SSL certificate verification
        curl = curl
    )
    
    params <- 
        list(
            "login[email]" = your.email , 
            "login[password]" = your.password , 
            "login[is_for_login]" = 1
        )
    
    html <- postForm("https://usa.ipums.org/usa-action/users/validate_login", .params = params, curl = curl)
    dl <- getURL( "https://usa.ipums.org/usa-action/extract_requests/download" , curl = curl)
    
    # Load the DLL you created
    # "writer" is the name of the function
    # "curl_writer" is the name of the dll
    dyn.load("curl_writer.dll")
    writer <- getNativeSymbolInfo("writer", PACKAGE="curl_writer")$address
    
    # Note that the "URL" parameter is upper case; in your code it is lower case.
    # I'm not sure whether that makes a difference.
    # "writer" is the symbol looked up above.
    # The target is extract.path ("url" was never defined in this script).
    f <- CFILE(filename <- tempfile(), "wb")
    curlPerform(URL = extract.path, writedata = f@ref, writefunction = writer, curl = curl)
    close(f)
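
A side note on the callback from step 1: libcurl treats the value returned by a write callback as the number of bytes that were actually handled, and it aborts the transfer with a write error when that number differs from what it passed in. The writer() above always reports size * nmemb, so a failed fwrite() (a full disk, for example) would go unnoticed. Below is a minimal sketch of a stricter variant. The name checked_writer is hypothetical and not part of the original answer.

```c
#include <stdio.h>

/*
 * Hypothetical variant of writer() from step 1 (the name checked_writer is
 * an assumption for illustration). It reports the number of bytes that
 * fwrite() actually stored instead of always returning size * nmemb.
 */
size_t checked_writer(void *buffer, size_t size, size_t nmemb, void *stream) {
    /* fwrite() returns the count of complete items written, not bytes. */
    size_t items = fwrite(buffer, size, nmemb, (FILE *)stream);

    /* Convert items back to bytes, which is the unit libcurl compares. */
    return items * size;
}
```

If you try it, compile it the same way as in step 2 and look up "checked_writer" instead of "writer" in getNativeSymbolInfo(). The rest of the R script should not need to change.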
    

This is now possible with the httr package. Thanks, Hadley!

https://github.com/hadley/httr/issues/44
