Specifying Column Types when Importing xlsx Data to R with Package readxl

梦如初夏 2020-12-13 07:40

I'm importing xlsx 2007 tables into R 3.2.1 patched using package readxl 0.1.0 under Windows 7 64-bit. The tables' size is

6 answers

天涯浪人 (original poster)

2020-12-13 08:08
It depends on how sparse your data is, and on whether the gaps fall in different places in different columns. I found that scanning more rows for the guess didn't improve the parsing: most of the cells were still blank, so the columns were interpreted as text even when they held dates and other types further down.

One workaround is to make the first data row of your Excel table contain representative data for every column, and use that row to guess the column types. I don't like this because I want to leave the original data intact.

Another workaround, if you have a complete row somewhere in the spreadsheet, is to use nskip instead of n. nskip sets the starting point for the column guessing by skipping that many rows first. Say data row 117 has a full set of data:
```
readxl:::xlsx_col_types(path = "a.xlsx", nskip = 116, n = 1)
```
Note that the ::: operator lets you call this unexported function directly, without having to edit the function in the namespace.

You can then use the vector of spreadsheet types to call read_excel:
```
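# Guess the types from data row 117 only (skip 116 rows, inspect 1)
col_types <- readxl:::xlsx_col_types(path = "a.xlsx", nskip = 116, n = 1)
# Read the whole sheet using those types instead of the default guess
dat <- readxl::read_excel(path = "a.xlsx", col_types = col_types)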
```
You can then manually correct the type of any column that it still gets wrong.
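
As a rough sketch of that manual correction: the guessed types are just a character vector, so individual entries can be overwritten before the vector is passed to read_excel. The helper name override_col_types, the column positions, and the stand-in guessed vector below are all hypothetical and only for illustration. The type names assume the values accepted by readxl 0.1.0 ("blank", "numeric", "date", "text").

```r
# Hypothetical helper (not part of readxl): overwrite guessed column types
# at chosen positions. `overrides` is a named character vector whose names
# are column positions and whose values are the desired types.
override_col_types <- function(guessed, overrides) {
  idx <- as.integer(names(overrides))
  # Refuse positions that are not valid column indices
  stopifnot(!anyNA(idx), all(idx >= 1), all(idx <= length(guessed)))
  guessed[idx] <- unname(overrides)
  guessed
}

# Stand-in for the vector returned by readxl:::xlsx_col_types();
# with a real file you would use that result instead.
guessed <- c("text", "numeric", "text", "text")

# Assumption for illustration: column 3 holds dates, column 4 holds numbers
fixed <- override_col_types(guessed, c("3" = "date", "4" = "numeric"))
print(fixed)

# With a real workbook, the corrected vector would then be used as:
# dat <- readxl::read_excel(path = "a.xlsx", col_types = fixed)
```

Keeping the overrides in one named vector makes it easy to see which columns were corrected by hand and which were left to the guess.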