Reading a text file with multiple spaces as delimiter in R


I have a big data set which consists of around 94 columns and 3 million rows. This file has single as well as multiple spaces as delimiters between columns. I need to read som

3 Answers
  • 2020-12-08 00:30

    If you want to use the tidyverse (or, more specifically, readr) package, you can use read_table instead.

    read_table(file, col_names = TRUE, col_types = NULL,
      locale = default_locale(), na = "NA", skip = 0, n_max = Inf,
      guess_max = min(n_max, 1000), progress = show_progress(), comment = "")
    

    And see here in the description:

    read_table() and read_table2() are designed to read the type of textual data where
    each column is separated by one (or more) columns of space.
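
    A minimal sketch of how a call might look, assuming a whitespace-delimited file with a header row; the file name "mydata.txt" is hypothetical:

     # read a file whose columns are separated by one or more spaces
     library(readr)
     df <- read_table("mydata.txt", col_names = TRUE, na = "NA")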
    
  • 2020-12-08 00:45

    If your fields have a fixed width, you should consider using read.fwf(), which might handle missing values better.
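
    A minimal sketch, assuming three fixed-width columns of widths 10, 5 and 8 characters in a hypothetical file "mydata.txt"; adjust the widths to your actual layout:

     # read fixed-width columns; widths gives the character width of each field
     data <- read.fwf("mydata.txt", widths = c(10, 5, 8), header = FALSE,
                      na.strings = "", stringsAsFactors = FALSE)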

  • 2020-12-08 00:51

    You need to change your delimiter. " " refers to a single space character; "" treats any run of whitespace as the delimiter.

     data <- read.table(file, sep = "", header = FALSE, nrows = 100,
                        na.strings = "", stringsAsFactors = FALSE)
    

    From the manual:

    If sep = "" (the default for read.table) the separator is ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.

    Also, with a large datafile you may want to consider data.table:::fread to quickly read data straight into a data.table. I was myself using this function this morning. It is still experimental, but I find it works very well indeed.
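
    For example, a minimal sketch assuming the same hypothetical whitespace-delimited file "mydata.txt"; fread auto-detects the separator and returns a data.table:

     # fast read of a large whitespace-delimited file
     library(data.table)
     dt <- fread("mydata.txt", header = FALSE, na.strings = "")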
