read.table

Reading in multiple CSVs with different numbers of lines to skip at start of file

Posted by 雨燕双飞 on 2020-01-12 04:45:14
Question: I have to read in about 300 individual CSVs. I have managed to automate the process using a loop and structured CSV names. However, each CSV has 14-17 lines of rubbish at the start, and the count varies randomly, so hard-coding a 'skip' parameter in the read.table call won't work. The column names and number of columns are the same for each CSV. Here is an example of what I am up against: QUICK STATISTICS: Directory: Data,,,, File: Final_Comp_Zn_1 Selection: SEL{Ox*1000+Doma=1201} Weight: None,,, , …
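A plausible way to handle the varying junk at the top (a sketch, not taken from the question): read the first few lines of each file with readLines(), locate the real header row by matching a column name known to appear in it, and pass that offset to read.table's skip argument. The file-name pattern and the "^Sample" header pattern below are assumptions to be replaced with the real ones.

    # Sketch: find the real header row in each file, then skip everything above it.
    read_one <- function(path, header_pattern = "^Sample") {   # assumed header pattern
      first_lines <- readLines(path, n = 30)                   # rubbish is only 14-17 lines
      header_row  <- grep(header_pattern, first_lines)[1]      # first line that looks like the header
      read.table(path, sep = ",", header = TRUE, skip = header_row - 1,
                 stringsAsFactors = FALSE)
    }
    files    <- list.files(pattern = "^Final_Comp_.*\\.csv$")  # assumed naming scheme
    all_data <- lapply(files, read_one)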

read.table in Chunks - error message

Posted by ╄→гoц情女王★ on 2020-01-06 01:17:06
Question: I have a large file with 6 million rows and I'm trying to read the data in chunks for processing so I don't hit my RAM limit. Here is my code (note that temp.csv is just a dummy file with 41 records): infile <- file("data/temp.csv", open="r") headers <- as.character(read.table(infile, header = FALSE, nrows=1, sep=",", stringsAsFactors=FALSE)) while(length(temp <- read.table(infile, header = FALSE, nrows=10, sep=",", stringsAsFactors=FALSE)) > 0){ temp <- data.table(temp) setnames(temp, colnames(temp), …
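One common fix for chunked reads like this (a sketch, assuming the loop fails once the file is exhausted): read.table() on an open connection throws an error rather than returning zero rows when no lines are left, so the read can be wrapped in tryCatch() and the loop stopped at that point. process_chunk() is a placeholder for whatever is done with each chunk.

    library(data.table)

    infile  <- file("data/temp.csv", open = "r")
    headers <- as.character(read.table(infile, header = FALSE, nrows = 1,
                                       sep = ",", stringsAsFactors = FALSE))
    repeat {
      chunk <- tryCatch(
        read.table(infile, header = FALSE, nrows = 10, sep = ",",
                   stringsAsFactors = FALSE),
        error = function(e) NULL            # end of file reached
      )
      if (is.null(chunk) || nrow(chunk) == 0) break
      chunk <- data.table(chunk)
      setnames(chunk, names(chunk), headers)
      # process_chunk(chunk)                # placeholder for the real processing
    }
    close(infile)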

Add selection criteria to read.table

Posted by 落花浮王杯 on 2020-01-04 04:53:10
Question: Let's take the following simplified version of a dataset that I import using read.table: a<-as.data.frame(c("M","M","F","F","F")) b<-as.data.frame(c(25,22,33,17,18)) df<-cbind(a,b) colnames(df)<-c("Sex","Age") In reality my dataset is extremely large and I'm only interested in a small proportion of the data, i.e. the data concerning females aged 18 or under. In the example above this would be just the last two observations. My question is: can I just import these observations immediately …
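One way to filter while reading rather than afterwards (a sketch; the file name "mydata.csv" is a placeholder, and this assumes the data live in a CSV with Sex and Age columns): sqldf::read.csv.sql() applies an SQL WHERE clause as the file is read, so only the matching rows ever reach R.

    library(sqldf)

    # Only females aged 18 or under are loaded into memory.
    females_u18 <- read.csv.sql(
      "mydata.csv",                                             # placeholder file name
      sql = "select * from file where Sex = 'F' and Age <= 18"  # the table is always referred to as 'file'
    )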

read.table with comma separated values and also commas inside each element

Posted by 自古美人都是妖i on 2020-01-01 09:45:15
Question: I'm trying to create a table from a comma-separated csv file. I'm aware that not all the rows have the same number of elements, so I would write some code to eliminate those rows. The problem is that some rows include numbers (in thousands) which contain a comma as well. I'm not able to split those rows properly; here's my code: pURL <- "http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t=EI&region=FRA&order=asc" res <- read.table(pURL, header=T, sep=' …
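A sketch of one possible approach (the URL is the question's own; the expected field count of 11 is an assumption): pull the raw lines down with readLines(), count the comma-separated fields per line while respecting quotes, keep only the lines with the expected count, and then parse those with read.csv(), which treats a quoted "1,234" as a single field.

    pURL <- "http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t=EI&region=FRA&order=asc"
    raw_lines <- readLines(pURL)
    n_fields  <- count.fields(textConnection(raw_lines), sep = ",", quote = "\"",
                              blank.lines.skip = FALSE)
    keep      <- raw_lines[!is.na(n_fields) & n_fields == 11]   # 11 is an assumed field count
    res       <- read.csv(text = keep, header = FALSE, stringsAsFactors = FALSE)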

How to index an element of a list object in R

Posted by 核能气质少年 on 2019-12-28 02:44:08
Question: I'm doing the following in order to import some txt tables and keep them as a list: # set working directory - the folder where all selection tables are stored hypo_selections<-list.files() # change object name according to each species hypo_list<-lapply(hypo_selections,read.table,sep="\t",header=T) # change object name according to each species I want to access one specific element, let's say hypo_list[1]. Since each element represents a table, how should I proceed to access particular cells …
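For the indexing part of this question the answer is standard R semantics: hypo_list[[1]] (double brackets) returns the first table itself, while hypo_list[1] returns a one-element list wrapping it. A small illustration (the column name is only an example):

    first_table <- hypo_list[[1]]        # the data.frame read from the first file
    first_table[2, 3]                    # row 2, column 3
    first_table[2, "Begin.Time..s."]     # row 2 of an example-named column
    hypo_list[[1]][2, 3]                 # same cell, without the intermediate object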

Row limit in read.table.ffdf?

Posted by 人盡茶涼 on 2019-12-25 06:29:58
Question: I'm trying to import a very large dataset (101 GB) from a text file using read.table.ffdf in the ff package. The dataset has more than 285 million records, but I am only able to read in the first 169,457,332 rows. The dataset is tab-separated with 44 variable-width columns. I've searched Stack Overflow and other message boards and have tried many fixes, but I am still consistently only able to import the same number of records. Here's my code: relFeb2016.test <- read.table.ffdf(x = NULL, file="D:/eBird/ebd …
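A first diagnostic worth trying, not a confirmed fix (the file path below is a placeholder for the full path in the question): reads from the read.table family that stop early are often caused by an unmatched quote character or a '#' in the data, so disable both and turn on VERBOSE to watch how many rows each chunk delivers.

    library(ff)

    relFeb2016.test <- read.table.ffdf(
      x = NULL,
      file = "D:/eBird/ebd_full.txt",   # placeholder path
      sep = "\t",
      header = TRUE,
      quote = "",                       # don't let stray quotes swallow rows
      comment.char = "",                # don't let '#' truncate lines
      VERBOSE = TRUE                    # report rows read per chunk
    )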

Replace "\" with "/" in R [duplicate]

Posted by  ̄綄美尐妖づ on 2019-12-24 16:45:57
Question: This question already has answers here: Efficiently convert backslash to forward slash in R (9 answers). Closed last year. I'm trying to replace "\" with "/" or "\\" in R. fp = "C:\users\jordan\Documents\Computer Science\R\miscData.txt" replace(fp, "\", "\\") Output: > fp = "C:\users\jordan\Documents\Computer Science\R\miscData.txt" Error: '\u' used without hex digits in character string starting ""C:\u" Obviously, "\" is an escape character and can't be used this way. Is there a way to …
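A minimal sketch of the usual answer: the path literal has to be typed with escaped backslashes in the first place (R source code cannot contain a bare "\" inside a string), after which gsub() with fixed = TRUE swaps them for forward slashes.

    fp <- "C:\\users\\jordan\\Documents\\Computer Science\\R\\miscData.txt"
    gsub("\\", "/", fp, fixed = TRUE)
    #> [1] "C:/users/jordan/Documents/Computer Science/R/miscData.txt"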

Read in certain numbers as NA in R with `data.table::fread`

Posted by 夙愿已清 on 2019-12-24 15:43:46
Question: I am reading in some files created by another program. The program populates entries that have missing values with the number -99.9. I am speeding up some code from base read.table() to fread() from the data.table package to read in these data files. I am able to use na.strings=c(-99.9) in read.table, but fread does not seem to accept numeric arguments for na.strings. The string counterpart, na.strings=c("-99.9"), does not work either and gives me the error: 'na.strings' is type ' … Can I make …
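A workaround sketch, assuming the goal is simply to end up with NA wherever the sentinel appears (the file name is a placeholder): read the file as-is with fread() and then convert -99.9 to NA column by column with data.table::set().

    library(data.table)

    DT <- fread("myfile.csv")                       # placeholder file name
    for (col in names(DT)) {
      # -99.9 parses to the same double in both places, so == is safe here;
      # NA is coerced to each column's own type by set().
      set(DT, i = which(DT[[col]] == -99.9), j = col, value = NA)
    }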