Question
I have a CSV file where column names include spaces and special characters. fread imports them with quotes, but how can I change this behaviour? One reason is that I have column names starting with a space and I don't know how to handle them.
Any pointers would be helpful.
Edit: An example.
> packageVersion("data.table")
[1] ‘1.8.8’
p2p <- fread("p2p.csv", header = TRUE, stringsAsFactors=FALSE)
> head(p2p[,list(Principal remaining)])
Error: unexpected symbol in "head(p2p[,list(Principal remaining"
> head(p2p[,list("Principal remaining")])
V1
1: Principal remaining
> head(p2p[,list(c("Principal remaining"))])
V1
1: Principal remaining
What I expected, of course, is what a column name without spaces yields:
> head(p2p[,list(Principal)])
Principal
1: 1000
2: 1000
3: 1000
4: 2000
5: 1000
6: 4130
Answer 1:
It should be rather difficult to get a leading space in a column name; it should not happen through "casual coding". On the other hand, I don't see much error checking in the fread code, so until this undesirable behavior is fixed (or the feature request is refused), you can do something like this:
setnames(DT, make.names(colnames(DT)))
If, on the other hand, you are bothered by the fact that colnames(DT) displays the column names with quotes, then just "get over it." That is how the interactive console displays any character value.
If a name looks like " ttt" in the original file, then it is going to have leading spaces when imported, and you need to process it with colnames(dfrm) <- sub("^\\s+", "", colnames(dfrm)) or one of the several trim functions in various packages (such as 'gdata').
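To make the difference between the two suggestions above concrete, here is a minimal base-R sketch. The header names are made up for illustration (they are not taken from the asker's file). It compares stripping leading whitespace with converting the names via make.names, and shows that a name that still contains a space can be referenced with backticks.

```r
# Hypothetical header names, similar in spirit to the ones in the question.
raw_names <- c(" Principal remaining", "Interest rate (%)", "Principal")

# Option 1: strip leading whitespace only; spaces inside the name survive.
trimmed <- sub("^\\s+", "", raw_names)

# Option 2: make the names syntactically valid. Invalid characters become ".",
# and "X" is prepended when the name does not start with a letter or a dot.
valid <- make.names(raw_names)

print(trimmed)
print(valid)

# A name that still contains a space can be referenced with backticks.
df <- data.frame(a = c(1000, 2000))
names(df) <- trimmed[1]
print(df$`Principal remaining`)
```

Trimming keeps the names readable but still requires backticks wherever a bare symbol is expected; make.names avoids the quoting at the cost of altered names.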
Answer 2:
A slightly modified version of BondedDust's answer, because setnames changes the names by reference and is not used with the <- sign:
setnames(DT, make.names(colnames(DT)))
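As a small illustration of the by-reference behaviour, the sketch below builds a toy table with made-up column names (not the asker's data) and renames it in place. It assumes the data.table package is installed.

```r
library(data.table)

# Toy table; the awkward names are hypothetical stand-ins for the CSV headers.
DT <- data.table(a = c(1000, 2000), b = c(5, 7))
setnames(DT, c("Principal remaining", "Interest rate (%)"))

# setnames() modifies DT in place, so no `DT <- ...` assignment is needed.
setnames(DT, make.names(names(DT)))

print(names(DT))
```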
Answer 3:
You can use the check.names argument of data.table's fread function:
p2p <- fread("p2p.csv", header = TRUE, stringsAsFactors = FALSE, check.names = TRUE)
It uses the make.names function in the background. The documentation for the argument says:
default is FALSE. If TRUE then the names of the variables in the data.table
are checked to ensure that they are syntactically valid variable names. If
necessary they are adjusted (by make.names) so that they are, and also to
ensure that there are no duplicates.
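A quick way to see the effect is to read the same text with and without the argument. This is only a sketch: the inline CSV and its headers are invented stand-ins for p2p.csv, and it assumes an installed data.table version recent enough for fread to accept check.names.

```r
library(data.table)

# Inline CSV with hypothetical headers standing in for p2p.csv.
csv_text <- "Principal remaining,Interest rate (%)\n1000,5\n2000,7\n"

# Without check.names the headers are kept exactly as they appear in the file.
as_is <- fread(csv_text, sep = ",", header = TRUE)

# With check.names = TRUE the headers are passed through make.names.
checked <- fread(csv_text, sep = ",", header = TRUE, check.names = TRUE)

print(names(as_is))
print(names(checked))

# The adjusted names can be used unquoted inside [ ].
print(checked[, list(Principal.remaining)])
```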
Source: https://stackoverflow.com/questions/16966957/fread-from-data-table-package-when-column-names-include-spaces-and-special-chara