Reading a text file with multiple spaces as delimiter in R


I have a big data set which consists of around 94 columns and 3 million rows. This file has single as well as multiple spaces as delimiters between columns. I need to read som

3 Answers
  • 2020-12-08 00:30

    If you want to use the tidyverse (or, more specifically, readr) package, you can use read_table instead.

    read_table(file, col_names = TRUE, col_types = NULL,
      locale = default_locale(), na = "NA", skip = 0, n_max = Inf,
      guess_max = min(n_max, 1000), progress = show_progress(), comment = "")
    

    And see here in the description:

    read_table() and read_table2() are designed to read the type of textual data where
    each column is separated by one (or more) columns of space.
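
    A minimal sketch of how a call might look, assuming a whitespace-delimited file with a header row; the file name "mydata.txt" is hypothetical:

     # read a file whose columns are separated by one or more spaces
     library(readr)
     df <- read_table("mydata.txt", col_names = TRUE, na = "NA")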
    
  • 2020-12-08 00:45

    If your fields have a fixed width, you should consider using read.fwf(), which might handle missing values better.
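
    A minimal sketch, assuming three fixed-width columns of widths 10, 5 and 8 characters in a hypothetical file "mydata.txt"; adjust the widths to your actual layout:

     # read fixed-width columns; widths gives the character width of each field
     data <- read.fwf("mydata.txt", widths = c(10, 5, 8), header = FALSE,
                      na.strings = "", stringsAsFactors = FALSE)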

  • 2020-12-08 00:51

    You need to change your delimiter. " " refers to a single space character; "" treats any run of whitespace as the delimiter.

     data <- read.table(file, sep = "", header = FALSE, nrows = 100,
                        na.strings = "", stringsAsFactors = FALSE)
    

    From the manual:

    If sep = "" (the default for read.table) the separator is ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.

    Also, with a large datafile you may want to consider data.table:::fread to quickly read data straight into a data.table. I was myself using this function this morning. It is still experimental, but I find it works very well indeed.
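
    For example, a minimal sketch assuming the same hypothetical whitespace-delimited file "mydata.txt"; fread auto-detects the separator and returns a data.table:

     # fast read of a large whitespace-delimited file
     library(data.table)
     dt <- fread("mydata.txt", header = FALSE, na.strings = "")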
