I am using R to process Census data, which uses really long numeric GEOIDs to identify the geographies. The issue I am facing is that when writing out the processed data using write_csv from the readr package, the GEOIDs are not preserved as full numbers (they come out in scientific notation).
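A minimal example of what I am doing (the output comment shows roughly what I get; the exact formatting may differ by readr version):

library(readr)
X <- data.frame(GEOID = seq(from = 60150001022000, to = 60150001022005))
write_csv(X, "test.csv")
# test.csv holds scientific notation like 6.0150001022e13, not the 14-digit IDs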
It would probably be safer to use character values:
library(readr); library(dplyr)
# stringsAsFactors = FALSE keeps GEOID a plain character column (it would become a factor pre-R 4.0)
X <- tbl_df(data.frame(GEOID = as.character(seq(from = 60150001022000, to = 60150001022005)), stringsAsFactors = FALSE))
write_csv(X, "test.csv")
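When you read the file back in, you can pin the column type so read_csv does not guess it back to a double (a small sketch; X2 is just an illustrative name):

X2 <- read_csv("test.csv", col_types = cols(GEOID = col_character()))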
It's a bit ironic that write_csv does coerce some of its output to character values, just not numeric columns: only a column that passes the is.object test gets coerced. There does not appear to be a switch you can throw that will preserve maximal precision. write.table and its offspring write.csv have several switches that allow suppression of quotes and other tailoring of the output, but write_csv has very little of that.
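For comparison, a sketch of the base R route (X_num is just an illustrative numeric version of the example data). write.csv prints doubles with the internal equivalent of digits = 15, so these 14-digit IDs survive intact, and the quote and row.names switches are exactly the sort of thing write_csv lacks:

X_num <- data.frame(GEOID = seq(from = 60150001022000, to = 60150001022005))
write.csv(X_num, "test_base.csv", quote = FALSE, row.names = FALSE)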
You can trick write_csv into thinking that a numeric column is something more complex, and this does produce the as.character output, albeit with quotes:
class(X[[1]])<- c("num", "numeric")
vapply(X, is.object, logical(1))
#GEOID
# TRUE
write_csv(X, "")
#[1] #"\"GEOID\"\n\"60150001022000\"\n\"60150001022001\"\n\"60150001022002\"\n\"60150001022003\"\n\"60150001022004\"\n\"60150001022005\"\n"
As a matter of best practice, I do not agree with your insistence that ID variables remain numeric. There is too much violence that can be applied to that storage mode: a double carries only about 15 significant digits, so longer IDs get silently mangled, and leading zeros (common in Census GEOIDs) are lost outright. You do not need any of the arithmetic operations for an ID variable.
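A sketch of that hazard (the 17-digit value is hypothetical, one digit more than a double can hold exactly):

x <- 60150001022000123    # 17 digits: more than a double can carry
sprintf("%.0f", x)
# "60150001022000120"     # the stored value is already wrong
x == 60150001022000120
# TRUE                    # two distinct "IDs" now compare equal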