I have a file called genes.txt, which I'd like to turn into a data.frame. It has a lot of lines, and each line has three tab-delimited fields:
mi
With read.table one of the default quote characters is the single quote. I'm guessing you have some unmatched single quotes in your description field and all the data between single quotes is being pooled together into one entry.
With read.delim the default quote character is the double quote only, so this isn't a problem.
Specify your quote character and you should be all set.
> genes<-read.table("genes.txt",sep="\t",quote="\"",na.strings="-",fill=TRUE, col.names=c("GeneSymbol","synonyms","description"))
> nrow(genes)
[1] 42476
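If you want to see the effect in isolation, here is a minimal sketch. The three-line file and its gene names are made up for illustration (they are not from your genes.txt); the apostrophes in the first and third descriptions play the role of the unmatched single quotes.

```r
# Hypothetical sample data: rows 1 and 3 each contain a lone apostrophe.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "GENE1\tsynA\t5' nucleotidase",
  "GENE2\tsynB\tplain description",
  "GENE3\tsynC\t3' exonuclease"
), tmp)

# Default quote set is "\"'", so the text between the two apostrophes
# (including the line breaks) is treated as part of a single field.
default_quotes <- read.table(tmp, sep = "\t", fill = TRUE)

# Restricting the quote set to the double quote keeps one row per line.
double_only <- read.table(tmp, sep = "\t", quote = "\"", fill = TRUE,
                          col.names = c("GeneSymbol", "synonyms", "description"))

nrow(default_quotes)  # fewer rows than lines in the file
nrow(double_only)     # one row per line
```

If the file never uses quotes to protect fields at all, quote="" turns quoting off entirely and should work just as well.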