'Embedded nul in string' error when importing csv with fread

Anonymous (unverified), submitted 2019-12-03 02:11:02

Question:

I have a large file (3.5G) that I'm trying to import using data.table::fread.

It was originally created from an rpt file that was opened as text and saved as a CSV.

This has worked fine with smaller files containing the same type of data (same columns and all); this one just covers a longer timeframe and a wider reach.

When I try to run

mydata <- fread("mycsv.csv") 

I get the error:

Error in fread("mycsv.csv") : embedded nul in string: 'y\0e\0a\0r\0'

What does this mean?

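Answer 1:

We can remove the NUL bytes on the command line, writing the result to a new file (redirecting to the input file itself would truncate it before it is read). With GNU sed, something like:

sed 's/\x00//g' mycsv.csv > mycsv_clean.csv

Or, as suggested by @marbel, fread can run the sed call itself and read its output (in recent data.table versions, via the cmd argument). The backslash is doubled because it sits inside an R string:

fread(cmd = "sed 's/\\x00//g' mycsv.csv")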
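Where sed is not available (for example on a stock Windows machine), the same byte-stripping idea can be sketched in base R. This is only an illustration, not part of the original answer: the helper name strip_nul_bytes and the chunk size are assumptions, and it simply copies the file in binary chunks while dropping every 0x00 byte, so the 3.5G file never has to fit in memory at once.

```r
# Hypothetical helper (not from data.table): copy `infile` to `outfile`,
# dropping every NUL (0x00) byte along the way.
strip_nul_bytes <- function(infile, outfile, chunk_size = 1e7) {
  con_in <- file(infile, "rb")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(outfile, "wb")
  on.exit(close(con_out), add = TRUE)

  repeat {
    # Read up to chunk_size bytes; a zero-length result means end of file.
    bytes <- readBin(con_in, what = "raw", n = chunk_size)
    if (length(bytes) == 0L) break
    # Keep only the bytes that are not NUL.
    writeBin(bytes[bytes != as.raw(0)], con_out)
  }
  invisible(outfile)
}
```

After strip_nul_bytes("mycsv.csv", "mycsv_clean.csv"), the cleaned file can be passed to fread as usual. Like the sed approach, this only gives sensible text when the data is plain ASCII stored two bytes per character, and a leading byte-order mark, if present, is left in place.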


Answer 2:

You can try this small function:

cleanFiles <- function(file, newfile) {
  # skipNul = TRUE drops embedded NUL bytes while reading
  writeLines(iconv(readLines(file, skipNul = TRUE)), newfile)
}

It works for me.



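Answer 3:

The pattern 'y\0e\0a\0r\0' in the error is the word "year" stored as UTF-16LE, with a zero byte after each ASCII character. In this case, you can use read.csv with fileEncoding set to UTF-16LE rather than fread.

read.csv("mycsv.csv", fileEncoding = "UTF-16LE")

Given the size of your data, read.csv would take a couple of minutes, but I don't think that is a big deal.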
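If fread is still preferred for speed, one option is to re-encode the file to UTF-8 first and then read the converted copy. The following is a minimal sketch under stated assumptions, not a tested recipe for every platform: the helper names has_utf16le_bom and convert_utf16le_to_utf8 are hypothetical, the batch size is arbitrary, and it assumes the R session runs in a UTF-8 locale.

```r
# Hypothetical helper: TRUE if the file starts with the UTF-16LE
# byte-order mark (bytes FF FE).
has_utf16le_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 2L)
  identical(first, as.raw(c(0xff, 0xfe)))
}

# Hypothetical helper: re-encode a UTF-16LE text file to UTF-8 in batches
# of lines, so the whole file is never held in memory at once.
# Assumes a UTF-8 session locale.
convert_utf16le_to_utf8 <- function(infile, outfile, batch = 100000L) {
  con_in <- file(infile, open = "r", encoding = "UTF-16LE")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(outfile, open = "w", encoding = "UTF-8")
  on.exit(close(con_out), add = TRUE)

  repeat {
    lines <- readLines(con_in, n = batch, warn = FALSE)
    if (length(lines) == 0L) break
    writeLines(lines, con_out)
  }
  invisible(outfile)
}
```

A possible workflow: check has_utf16le_bom("mycsv.csv") to confirm the guess about the encoding, call convert_utf16le_to_utf8("mycsv.csv", "mycsv_utf8.csv"), and then run fread("mycsv_utf8.csv"). A file without a byte-order mark can still be UTF-16LE, so a FALSE result does not rule it out.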



Answer 4:

A non-technical way to solve this would be to:

  1. Open the problematic .csv

  2. Ctrl+A (Select all)

  3. Open new Excel sheet

  4. Right click and choose 'Paste as values'

  5. Save and use this file in place of original one.

Worked for me, and doesn't take much time.


