How to import data with line breaks from text file into R?

懵懂的女人 submitted on 2019-12-04 17:24:10

Although the data is ill-formed, it can still be parsed given the following assumptions:

  • The header defines how many variables there are (columns in the resultant table)
  • The data itself is complete, i.e. there are no missing values
  • The data is of a uniform type (e.g. numeric())

The following is code that parses the provided sample data as if it were read in from a text file called data.txt:

# read in the header and split on ","
header = strsplit(readLines('data.txt', n=1), ',')[[1]]

# the length of the header determines how many variables there are

# read in the data which appears to have the pattern
#   <numbers><whitespace><numbers>...
# skipping the first line since it was already parsed as the header
data = scan('data.txt', skip=1, what=numeric())

# reform the data (which is read in as a 1D numeric vector) into a 2D matrix
# with the same number of columns as there are headers (filling by rows).
# header names are assigned via the `dimnames=` argument
data = matrix(data, ncol=length(header), byrow=TRUE, dimnames=list(NULL, header))

producing the following output:

       x1  x2     x3     x4   x5   x6   x7   x8   x9  x10  x11
[1,] 1953 7.4 159565 16.668 8883 47.2 26.7 16.8 37.7 29.7 19.4
[2,] 1954 7.8 162391 17.029 8685 46.5 22.7 18.0 36.8 29.7 20.0
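
The assumptions listed above can also be checked in code before reshaping. The following is a minimal sketch, not part of the original answer: the temporary file, the positions of the line breaks inside each record, and the names `path`, `values`, `m` and `df` are assumptions made for illustration. It checks that the number of values is a whole multiple of the number of header fields, then converts the matrix to a data frame.

```r
# Hypothetical sample file: a comma-separated header followed by
# whitespace-separated values whose records wrap across lines.
# Where the line breaks fall is an assumption for illustration.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11",
  "1953 7.4 159565 16.668 8883 47.2",
  "26.7 16.8 37.7 29.7 19.4",
  "1954 7.8 162391 17.029 8685 46.5",
  "22.7 18.0 36.8 29.7 20.0"
), path)

# Same parsing steps as in the answer
header <- strsplit(readLines(path, n = 1), ",")[[1]]
values <- scan(path, skip = 1, what = numeric(), quiet = TRUE)

# Check the assumptions before reshaping: the value count must be a
# whole multiple of the number of columns, and nothing may be missing.
stopifnot(length(values) %% length(header) == 0, !anyNA(values))

m <- matrix(values, ncol = length(header), byrow = TRUE,
            dimnames = list(NULL, header))

# A data frame is usually more convenient than a matrix for later analysis
df <- as.data.frame(m)
```

Stopping early here is a design choice: if a value were missing, every later value would otherwise shift into the wrong column without an obvious error.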

Maybe you could manually edit the first line (change , to " " and insert a line break) and then try again?

Use read.csv instead of read.table and then add skip=1, header=FALSE to the arguments to read.csv.
