I have a csv file which I read using the following function:
csvData <- read.csv(file="pf.csv", colClasses=c(NA, NA,"NULL",NA,"NULL",NA,"NULL","
For me, the sqldf package's read.csv.sql looked great at first blush. But when I tried to use it, it failed to deal with "NULL" strings. (Others have found this out as well.) Unfortunately, it also doesn't support all of read.csv's features, so I had to write my own function, which reads the file in chunks and keeps only the header and the lines that match a pattern. I am surprised that there isn't a good package for this.
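# Read inputFile in chunks of n lines; return the header line plus every line matching `match`.
# Example arguments: inputFile = 'simple.csv', match = 'APPLE'
fetchLines <- function(inputFile, match, fixed = TRUE, n = 100, maxlines = 100000) {
  message("reading: ", inputFile)
  n <- min(n, maxlines)
  con <- base::file(inputFile, open = "r", encoding = "UTF-8-BOM")
  on.exit(close(con))  # close the connection on every exit path, including the early return
  data <- readLines(con, n = 1, warn = FALSE)  # always keep the header line
  while (length(chunk <- readLines(con, n = n, warn = FALSE)) > 0) {
    grab <- grep(match, chunk, value = TRUE, fixed = fixed)
    if (length(grab) > 0) {
      data <- c(data, grab)
      if (length(data) > maxlines) {
        warning("bailing out: more than maxlines lines collected")
        return(data)
      }
      cat(".")  # progress indicator: one dot per chunk that contained a match
    }
  }
  cat("\n")
  data
}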
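# Store the lines in a variable first: textConnection() names the connection by
# deparsing its argument, and a long inline expression can fail with
# "argument 'object' must deparse to a single character string".
matched <- fetchLines("datafile.csv", "\\bP58\\b", fixed = FALSE, maxlines = 100000)
fdata <- textConnection(matched)
df <- read.csv(fdata, header = TRUE, sep = ",", na.strings = c("NULL", ""), stringsAsFactors = FALSE)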
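To see the filter-then-parse idea end to end, here is a minimal sketch on a tiny throwaway file. The file contents and column names are made up for illustration, and it reads the whole file in one go instead of in chunks, so it only suits small inputs. It also passes the kept lines through read.csv's `text` argument, which is one way to skip the explicit textConnection step.

```r
# Build a small temporary CSV (hypothetical data) containing literal "NULL" strings.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,code,value",
             "1,P58,10",
             "2,P580,NULL",
             "3,P58,NULL",
             "4,Q12,7"), tmp)

# Keep the header plus the data lines where P58 appears as a whole word.
all_lines <- readLines(tmp, warn = FALSE)
kept <- c(all_lines[1], grep("\\bP58\\b", all_lines[-1], value = TRUE))

# Parse the kept lines; "NULL" and empty fields become NA.
df <- read.csv(text = kept, na.strings = c("NULL", ""), stringsAsFactors = FALSE)
print(df)

unlink(tmp)  # remove the temporary file
```

The `\\b` word boundaries are what keep the `P580` row out; with `fixed = TRUE` the pattern would be matched as a literal string instead of a regular expression.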
R textConnection: "argument 'object' must deparse to a single character string"