R data.table fread using named colClasses without header (e.g. no col.names?)

Submitted by 我怕爱的太早我们不能终老 on 2019-12-10 17:55:37

Question


update (June 2016)

col.names was added in data.table 1.9.6, so the issue is resolved and everyone is super happy :) I think I can now convert all my read.csv calls to fread calls without worrying about breaking anything.
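For readers who want to try the col.names route, here is a minimal sketch. It assumes a data.table version in which fread accepts the cmd= argument, and it uses a made-up temporary file in place of students.csv. The classes are given by position here, so the sketch does not rely on how a particular version matches named colClasses against col.names.

```r
library(data.table)

# Hypothetical sample data standing in for students.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,gender",
             "1,Alice,Female",
             "2,Bob,Male",
             "3,Carl,Male"), csv_path)

# grep drops the header line, so the names are supplied via col.names
# and the column classes are listed in column order
males.students <- fread(cmd = paste("grep Male", shQuote(csv_path)),
                        header = FALSE,
                        col.names = c("id", "name", "gender"),
                        colClasses = c("numeric", "character", "character"))

str(males.students)
```

Without the colClasses entry the id column would be read as integer; the first entry asks for numeric instead.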

original question

using data.table 1.9.4

I'm converting read.csv calls to fread because of the HUGE performance improvements we've noticed. I can handle most issues, but I've reached a point where I'm stuck and wonder if anyone has an elegant solution.

My problem is that I have named colClasses but the input has no header (it comes from a grep command). Here's a silly example to illustrate:

males.students <- read.csv(pipe("grep Male students.csv"), 
                           col.names=c("id", "name", "gender"), 
                           colClasses=c(id="numeric"))

Now in fread I still want the named colClasses, but I have no column names, so just using

males.students <- fread("grep Male students.csv", 
                        colClasses=c(id="numeric"))

fails with

Column name 'id' in colClasses[[1]] not found

How can I fix that? Are there plans to add col.names?


Answer 1:


Add the names in the command line:

fread('echo "id,name,gender"; grep Male students.csv', colClasses = c(id='numeric'))



Answer 2:


Answering the original question, if the problem is that grep removes the header, you could use awk instead, to print the first line and any lines containing "Male":

fread("awk 'NR==1 || /Male/' students.csv", colClasses=c(id="numeric"))

This might help people who still use an old version of data.table.
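A self-contained way to check the awk approach is sketched below. The temporary file and its contents are made up for illustration, and the cmd= argument assumes a data.table version that supports it (on older versions the command string would be passed as the first argument instead). Because awk keeps the first line, fread sees the real header and the named colClasses can be matched.

```r
library(data.table)

# Hypothetical sample data standing in for students.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,gender",
             "1,Alice,Female",
             "2,Bob,Male",
             "3,Carl,Male"), csv_path)

# NR==1 keeps the header line; /Male/ keeps the matching rows
awk_cmd <- paste("awk 'NR==1 || /Male/'", shQuote(csv_path))

# The header is present, so colClasses can refer to 'id' by name
males.students <- fread(cmd = awk_cmd, colClasses = c(id = "numeric"))

str(males.students)
```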



Source: https://stackoverflow.com/questions/28602337/r-data-table-fread-using-named-colclasses-without-header-e-g-no-col-names
