Question
Given several .xls files with a varying number of sheets, I am reading them into R using read.xls from the gdata package. I have two related issues (solving the second issue should solve the first):
- It is unknown ahead of time how many sheets each .xls file will have, and in fact this value will vary from one file to the next.
- I need to capture the name of each sheet, which is relevant data.
Right now, to resolve the first issue, I am using try() and iterating over sheet numbers until I hit an error.
How can I grab a list of the names of the sheet so that I can iterate over them?
Answer 1:
See the sheetCount and sheetNames functions (documented on the same help page) in gdata. If xls <- "a.xls", say, then reading all sheets of a spreadsheet into a list, one sheet per component, is just this:
sapply(sheetNames(xls), read.xls, xls = xls, simplify = FALSE)
Note that the components will be named using the names of the sheets. Depending on the content it might make sense to remove simplify = FALSE.
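Because the question treats the sheet name as data, one possible follow-up is to copy each component's name into a column and stack the sheets into a single data frame. The following is a minimal base R sketch, not part of the original answer: the `sheets` list is a made-up stand-in for the result of the sapply() call above, and it assumes every sheet has the same columns.

```r
# Stand-in for the named list returned by the sapply() call above;
# the sheet names and columns here are invented for illustration.
sheets <- list(
  Sales  = data.frame(id = 1:2, value = c(10, 20)),
  Budget = data.frame(id = 3:5, value = c(30, 40, 50))
)

# Prepend each sheet's name as a column of its data frame.
tagged <- Map(function(df, nm) cbind(sheet = nm, df), sheets, names(sheets))

# Stack the tagged sheets (assumes identical columns across sheets).
# unname() keeps the list names from being treated as rbind() arguments.
combined <- do.call(rbind, unname(tagged))

print(combined)
```

If the sheets have different columns, keeping them as a named list and looping over names(sheets) is probably the safer choice.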
Answer 2:
For such tasks I use the XLConnect package. Its functions return the names of all sheets as a vector, and the length of that vector gives the number of sheets.
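# Load the XLConnect package
library(XLConnect)
# Read your workbook
wb <- loadWorkbook("Your_workbook.xls")
# Save the sheet names as a character vector
lp <- getSheets(wb)
# Now read each sheet as a separate list element
dat <- lapply(seq_along(lp), function(i) readWorksheet(wb, sheet = lp[i]))
# Name each list element after its sheet so the names are kept
names(dat) <- lp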
UPDATE
As suggested by @Martin Studer, XLConnect functions are already vectorized, so there is no need to use lapply(). Instead, pass a vector of sheet names to readWorksheet(), for example by calling getSheets() inside it.
dat <- readWorksheet(wb, sheet = getSheets(wb))
Source: https://stackoverflow.com/questions/15680782/read-xls-read-in-variable-length-list-of-sheets-with-their-names