How to read when delimiter is space and missing values are blank?

Posted by 夙愿已清 on 2021-01-24 06:54:56

Question


I have a space-delimited file in which some columns are blank, so missing values show up as consecutive spaces. fread fails with an error on this input, but read.table works fine. See the example:

library(data.table)
# R version 3.4.2 (2017-09-28)
# data.table_1.10.4-3

fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
# Error in fread("A B C D\n1 2  3\n4 5 6 7") : 
#   Expected sep (' ') but new line, EOF (or other non printing character) ends field 2 when detecting types from point 0: 1 2  3

read.table(text ="A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
#   A B  C D
# 1 1 2 NA 3
# 2 4 5  6 7

How can this be read with fread? I tried setting sep = " " and na.strings = "", but that didn't help.


Answer 1:


By default, fread sets strip.white = TRUE, which removes leading and trailing spaces. That is useful for reading fixed-width files or files that use an irregular number of spaces as the separator, but here it means the blank field is not read as a missing value.

In read.table, by contrast, strip.white defaults to FALSE. Setting strip.white = FALSE in fread gives the same result:

fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE, strip.white = FALSE)
#    A B  C D
# 1: 1 2 NA 3
# 2: 4 5  6 7

Note: I am posting this self-answer because I couldn't find a relevant post, and this has tripped me up more than once.


Edit: This no longer works with data.table_1.12.2; see the related GitHub issue.
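
If the strip.white = FALSE approach fails on your data.table version, one possible fallback is to let read.table do the parsing, since the question shows it handles the blank field, and then convert the result to a data.table. This is only a minimal sketch and not an fread-based fix. The names txt and dt are illustrative, and on large files read.table will be slower than fread.

```r
library(data.table)

# Same sample as in the question: row 1 has a blank third field,
# which shows up as two consecutive spaces after the "2".
txt <- "A B C D
1 2  3
4 5 6 7"

# read.table treats every single space as a separator, so the empty
# field between the two spaces is read as NA.
# as.data.table() then converts the data.frame to a data.table.
dt <- as.data.table(read.table(text = txt, sep = " ", header = TRUE))

print(dt)
```

For a file on disk, the same call should work with a file path in place of text = txt.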



Source: https://stackoverflow.com/questions/48215177/how-to-read-when-delimiter-is-space-and-missing-values-are-blank
