Specifying Column Types when Importing xlsx Data to R with Package readxl

梦如初夏 2020-12-13 07:40

I'm importing xlsx 2007 tables into R 3.2.1patched using package readxl 0.1.0 under Windows 7 64. The tables' size is

6 answers
  •  天涯浪人
    2020-12-13 07:55

    New solution since readxl version 1.x:

    The solution in the currently preferred answer no longer works with readxl versions newer than 0.1.0, because the package-internal function readxl:::xlsx_col_types it relies on no longer exists.

    The new solution is to use the newly introduced parameter guess_max to increase the number of rows used to "guess" the appropriate data type of the columns:

    read_excel("My_Excel_file.xlsx", sheet = 1, guess_max = 1048576)
    

    The value 1,048,576 is the maximum number of rows Excel currently supports in a worksheet, see the Excel specs: https://support.office.com/en-us/article/Excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
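
    A minimal sketch of the call in context, assuming readxl 1.x is installed. The example workbook returned by readxl_example("datasets.xlsx") is only a stand-in for My_Excel_file.xlsx, and the explicit col_types call is an alternative that this answer does not use; it is shown for comparison in case the column types are already known:

    ```r
    library(readxl)

    # Stand-in for "My_Excel_file.xlsx": a workbook bundled with readxl
    # (assumption: its first sheet has four numeric columns and one text column)
    path <- readxl_example("datasets.xlsx")

    # Let readxl look at up to the Excel row limit when guessing column types
    guessed <- read_excel(path, sheet = 1, guess_max = 1048576)

    # Alternative (not part of the answer above): state the types explicitly,
    # so no guessing is needed for these columns
    explicit <- read_excel(
      path,
      sheet = 1,
      col_types = c("numeric", "numeric", "numeric", "numeric", "text")
    )

    # Compare the column classes produced by both approaches
    print(sapply(guessed, class))
    print(sapply(explicit, class))
    ```

    If the types are known in advance, col_types avoids guessing altogether; guess_max is the more convenient choice when the sheet layout is not known beforehand.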

    PS: If you are worried about the performance cost of using all rows to guess the data types: read_excel seems to read the file only once and then do the guessing in memory, so the penalty is very small compared to the work it saves.
