Question:
I have a large file (3.5G) that I'm trying to import using data.table::fread.
It was originally created from an rpt file that was opened as text and saved as a CSV.
This has worked fine with smaller files of the same type of data (same columns and all); this one just covers a longer timeframe and a wider reach.
When I try to run
mydata <- fread("mycsv.csv")
I get the error:
Error in fread("mycsv.csv") : embedded nul in string: 'y\0e\0a\0r\0'
What does this mean?
Answer 1:
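We can strip the nul bytes on the command line with something like the following (GNU sed). Write the result to a new file, because redirecting onto the input file would empty it before sed reads it:
sed 's/\x00//g' mycsv.csv > mycsv_clean.csv
Or, as suggested by @marbel, fread can run the sed command itself and read its output, for example:
fread(cmd = "sed 's/\\x00//g' mycsv.csv")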
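If sed is not available, the same idea can be sketched in base R. This is only an illustration under some assumptions: the temporary file and its contents are made up, the data is assumed to be plain ASCII (so dropping the nul bytes leaves valid text), and the whole file is read into memory, which makes it more suitable for small files than for a 3.5G one.

```r
# Build a small stand-in for the problem file: ASCII text with a nul byte
# after every character (illustrative data only).
tmp <- tempfile(fileext = ".csv")
sample_bytes <- iconv("year,value\n2001,10\n", from = "UTF-8",
                      to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(sample_bytes, tmp)

# Read the raw bytes and drop every 0x00 byte, which is what the sed call does.
raw_bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
clean <- raw_bytes[raw_bytes != as.raw(0)]

# Parse the cleaned text; fread(text = ...) could be used here instead.
df <- read.csv(text = rawToChar(clean))
print(df)
```

For a file that fits comfortably in memory, the cleaned bytes could also be written back out with writeBin and then read with fread as usual.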
Answer 2:
You can try this small function:
cleanFiles <- function(file, newfile) {
  # Drop embedded nuls while reading, then write the cleaned lines
  writeLines(iconv(readLines(file, skipNul = TRUE)), newfile)
}
It works for me.
Answer 3:
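The alternating \0 bytes in the error message ('y\0e\0a\0r\0') suggest the file is encoded as UTF-16LE, where every ASCII character is followed by a nul byte. In this case, you can use read.csv with fileEncoding set to UTF-16LE rather than fread.
read.csv("mycsv.csv",fileEncoding="UTF-16LE")
Given the size of your data, read.csv will probably take a couple of minutes, but I don't think that is a big deal.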
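To check the UTF-16LE idea on a small scale, here is a minimal sketch using base R only. The sample text and the temporary file are invented for illustration; the point is that the byte pattern matches the 'y\0e\0a\0r\0' from the error message, and that reading with the matching fileEncoding decodes it.

```r
# Encode the header text "year" as UTF-16LE and inspect the raw bytes.
bytes <- iconv("year", from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
print(bytes)  # each ASCII character is followed by a 00 byte

# Write a tiny UTF-16LE CSV to a temporary file (illustrative data only).
tmp <- tempfile(fileext = ".csv")
csv_text <- "year,value\n2001,10\n2002,20\n"
writeBin(iconv(csv_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], tmp)

# Reading with the matching fileEncoding decodes the file correctly.
df <- read.csv(tmp, fileEncoding = "UTF-16LE")
print(df)
```

If the real file turns out to use a different encoding, the fileEncoding value would need to change accordingly.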
Answer 4:
A non-technical way to solve this is to:
1. Open the problematic .csv
2. Press Ctrl+A (Select all) and copy
3. Open a new Excel sheet
4. Right click and choose 'Paste as values'
5. Save and use this file in place of the original one.
Worked for me, and doesn't take much time.