I use RMySQL and a MySQL database to store my datasets. Sometimes data gets revised or I store results back to the database as well. Long story short, there is quite some in
OK, I've got a working solution now. Here's a function that maps MySQL field types to R classes. This helps in particular with handling the MySQL field type date.
dbReadMap <- function(con, table) {
  # DESCRIBE returns one row per column; keep only Field and Type
  desc <- dbGetQuery(con, paste0("DESCRIBE ", table))[, 1:2]
  # drop row_names if it exists: dbReadTable turns it into an attribute,
  # so it is not a real column of the result
  desc <- desc[desc$Field != "row_names", ]
  # keep only the base type name, e.g. "int(11) unsigned" -> "int"
  desc$Type <- sub("[^a-z].*$", "", desc$Type)
  # building a dictionary: MySQL field type -> R conversion function
  # (DESCRIBE reports CHAR columns as "char", not "character")
  fieldtypes <- c("int", "tinyint", "bigint", "float", "double", "date", "char", "varchar", "text")
  rclasses <- c("as.numeric", "as.numeric", "as.numeric", "as.numeric", "as.numeric", "as.Date", "as.character", "as.character", "as.character")
  fieldtype_to_rclass <- data.frame(fieldtypes, rclasses, stringsAsFactors = FALSE)
  map <- merge(fieldtype_to_rclass, desc, by.x = "fieldtypes", by.y = "Type")
  map$Field <- as.character(map$Field)
  # get data
  res <- dbReadTable(con, table)
  # convert every mapped column; seq_along() is safe if nothing matched
  for (i in seq_along(map$rclasses)) {
    convert <- match.fun(map$rclasses[i])
    res[[map$Field[i]]] <- convert(res[[map$Field[i]]])
  }
  return(res)
}
Maybe this is not good programming practice; I just don't know any better. So use it at your own risk, or help me improve it. And of course it's only half of it: reading. Hopefully I'll find some time to write a writing function soon.
If you have suggestions for the mapping dictionary let me know :)
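The dictionary lookup can be tried without a database connection. The following is only a minimal sketch with made-up input: the `types` vector and the shortened `lookup` dictionary are illustrative assumptions, not output from a real table. It reduces each type string to its base name and looks up the name of the conversion function; types that are missing from the dictionary come back as NA, which is a quick way to spot gaps in the mapping.

```r
# Hypothetical type strings, as DESCRIBE might report them (assumption)
types <- c("int(11)", "int(10) unsigned", "varchar(255)", "date", "decimal(10,2)")

# Cut everything from the first non-lowercase character onwards
base <- sub("[^a-z].*$", "", types)

# Shortened, illustrative dictionary: MySQL base type -> R conversion function
lookup <- c(int = "as.numeric", varchar = "as.character", date = "as.Date")

# Named-vector lookup; types without an entry (here: decimal) yield NA
converters <- unname(lookup[base])
print(data.frame(types, base, converters))
```

Any row with NA in `converters` is a field type that would be left unconverted.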
Here is a more generic version of @Matt Bannert's function
that works with queries instead of tables:
# Extension to dbGetQuery that understands MySQL data types
dbGetQuery2 <- function(con, query) {
  # materialize the query as a temporary table so that DESCRIBE can report
  # its column types; note that the query is therefore executed twice
  dbExecute(con, paste0("CREATE TEMPORARY TABLE `temp` ", query))
  desc <- dbGetQuery(con, "DESCRIBE `temp`")[, 1:2]
  dbExecute(con, "DROP TEMPORARY TABLE `temp`")
  # skip row_names if it exists; it is not a data column to convert
  desc <- desc[desc$Field != "row_names", ]
  # keep only the base type name, e.g. "int(11) unsigned" -> "int"
  desc$Type <- sub("[^a-z].*$", "", desc$Type)
  # building a dictionary: MySQL field type -> R conversion function
  # (DESCRIBE reports CHAR columns as "char", not "character")
  fieldtypes <- c("int", "tinyint", "bigint", "float", "double", "date", "char", "varchar", "text")
  rclasses <- c("as.numeric", "as.numeric", "as.numeric", "as.numeric", "as.numeric", "as.Date", "as.character", "as.factor", "as.character")
  fieldtype_to_rclass <- data.frame(fieldtypes, rclasses, stringsAsFactors = FALSE)
  map <- merge(fieldtype_to_rclass, desc, by.x = "fieldtypes", by.y = "Type")
  map$Field <- as.character(map$Field)
  # get data
  res <- dbGetQuery(con, query)
  # convert every mapped column; seq_along() is safe if nothing matched
  for (i in seq_along(map$rclasses)) {
    convert <- match.fun(map$rclasses[i])
    res[[map$Field[i]]] <- convert(res[[map$Field[i]]])
  }
  return(res)
}
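To see the conversion step in isolation, here is a small sketch that runs without a database. The data frame `res` and the mapping `map` are made up for illustration (assumptions, not real query results); the loop looks up each conversion function by name with `match.fun()` and applies it to the matching column.

```r
# Made-up data with every column as character (assumption for illustration)
res <- data.frame(id = c("1", "2"),
                  day = c("2020-01-31", "2020-02-01"),
                  stringsAsFactors = FALSE)

# Hypothetical mapping: column name -> name of the conversion function
map <- data.frame(Field = c("id", "day"),
                  rclasses = c("as.numeric", "as.Date"),
                  stringsAsFactors = FALSE)

# Apply each conversion function to its column
for (i in seq_along(map$Field)) {
  convert <- match.fun(map$rclasses[i])
  res[[map$Field[i]]] <- convert(res[[map$Field[i]]])
}

# Show the resulting column classes
str(res)
```

Columns that do not appear in the mapping are simply left as they were returned.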