Read an Excel file directly from an R script

误落风尘 2020-11-22 13:52

How can I read an Excel file directly into R? Or should I first export the data to a text- or CSV file and import that file into R?

12 answers
  •  刺人心 (original poster)
    2020-11-22 14:32

    EDIT (October 2015): As others have commented here, the openxlsx and readxl packages are far faster than the xlsx package and actually manage to open larger Excel files (>1500 rows and >120 columns). @MichaelChirico demonstrates that readxl is the better choice when speed matters, while openxlsx replaces the functionality provided by the xlsx package. If you are looking for a package to read, write, and modify Excel files in 2015, pick openxlsx instead of xlsx.
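    As a rough illustration of the two packages named above, here is a minimal sketch. It assumes readxl and openxlsx are both installed; the temporary file and the small example data frame are made up for the demonstration and are not part of the original answer.

```r
library(readxl)
library(openxlsx)

# Hypothetical example data, written to a temporary .xlsx file so the
# sketch runs without depending on an existing workbook.
path <- tempfile(fileext = ".xlsx")
example <- data.frame(id = 1:3,
                      label = c("a", "b", "c"),
                      value = c(1.5, 2.5, 3.5))
openxlsx::write.xlsx(example, path)

# readxl: read-only, returns a tibble
dat_readxl <- readxl::read_excel(path, sheet = 1)

# openxlsx: reads and writes .xlsx files, returns a data.frame
dat_openxlsx <- openxlsx::read.xlsx(path, sheet = 1)

str(dat_openxlsx)
```

    The explicit openxlsx:: prefix is a deliberate choice here: xlsx and openxlsx both export functions called read.xlsx and write.xlsx, so the prefix avoids ambiguity if both packages happen to be loaded.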

    Pre-2015: I have used the xlsx package. It changed my workflow with Excel and R. No more annoying pop-ups asking whether I am sure that I want to save my Excel sheet in .txt format. The package also writes Excel files.

    However, I find the read.xlsx function slow when opening large Excel files. The read.xlsx2 function is considerably faster, but it does not guess the vector class of the data.frame columns. If you use read.xlsx2, you have to specify the desired column classes with the colClasses argument. Here is a practical example:

    read.xlsx("filename.xlsx", 1) reads your file and makes the data.frame column classes nearly useful, but is very slow for large data sets. It also works for .xls files.

    read.xlsx2("filename.xlsx", 1) is faster, but you will have to define column classes manually. A shortcut is to run the command twice (see the example below). The "character" specification converts your columns to factors. Use the "Date" and "POSIXct" options for time.

    library(xlsx)  # provides read.xlsx and read.xlsx2

    # Helper that shows each column name together with its column number
    coln <- function(x) {
      y <- rbind(seq_len(ncol(x)))
      colnames(y) <- colnames(x)
      rownames(y) <- "col.number"
      y
    }

    data <- read.xlsx2("filename.xlsx", 1)  # First pass: open the file

    coln(data)  # Check the column numbers you want to have as factors

    x <- 3  # Say you want columns 1-3 as factors, the rest numeric

    # Second pass: x "character" columns, then ncol(data) - x "numeric" ones
    data <- read.xlsx2("filename.xlsx", 1,
                       colClasses = c(rep("character", x),
                                      rep("numeric", ncol(data) - x)))
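    The colClasses vector in the second call is meant to hold one entry per column. A minimal base R sketch of that step follows; make_col_classes is a hypothetical helper introduced only for illustration, not a function from the xlsx package.

```r
# Hypothetical helper: build a colClasses vector in which the first
# n_char columns are "character" and the remaining ones are "numeric".
make_col_classes <- function(n_cols, n_char) {
  stopifnot(n_char >= 0, n_char <= n_cols)
  c(rep("character", n_char), rep("numeric", n_cols - n_char))
}

# Example: 10 columns in total, the first 3 read as character
col_classes <- make_col_classes(n_cols = 10, n_char = 3)
length(col_classes)  # 10, one entry per column
```

    The result could then be passed on as read.xlsx2("filename.xlsx", 1, colClasses = col_classes), with n_cols taken from ncol(data) after the first pass.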
    
