sqldf

Updating data.table by inserting new rows that are different from old rows

为君一笑 submitted on 2019-12-02 00:17:44
I have two data.tables (dt1 and dt2). dt1 holds past product data and dt2 holds present product data. I want to create a third data.table that inserts rows from dt2 into dt1 only when the product characteristics (Level or Color) differ or the Product itself is different.

library(data.table)
dt1 <- fread('
Product Level Color ReviewDate
A 0 Blue 9/7/2016
B 1 Red 9/7/2016
C 1 Purple 9/7/2016
D 2 Blue 9/7/2016
E 1 Green 9/7/2016
F 4 Yellow 9/7/2016
')
dt2 <- fread('
Product Level Color ReviewDate
A 1 Black 9/8/2016
B 1 Red 9/8/2016
C 5 White 9/8/2016
D 2 Blue 9/8/2016
E 1 Green 9/8/2016
F 4 Yellow 9/8
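One way to approach this is a data.table anti-join: keep only the dt2 rows whose (Product, Level, Color) combination does not appear in dt1, then append them to dt1. The sketch below is an assumption about what the asker wants, not the accepted answer. It uses only the first three products from the question, and the names key_cols, changed and dt3 are made up for illustration.

```r
library(data.table)

# Reduced sample based on the first three rows of the question's tables.
dt1 <- data.table(
  Product    = c("A", "B", "C"),
  Level      = c(0L, 1L, 1L),
  Color      = c("Blue", "Red", "Purple"),
  ReviewDate = "9/7/2016"
)
dt2 <- data.table(
  Product    = c("A", "B", "C"),
  Level      = c(1L, 1L, 5L),
  Color      = c("Black", "Red", "White"),
  ReviewDate = "9/8/2016"
)

# Columns that define "the same product with the same characteristics".
key_cols <- c("Product", "Level", "Color")

# Anti-join: rows of dt2 with no match in dt1 on key_cols.
changed <- dt2[!dt1, on = key_cols]

# Append the changed rows to the old data and sort for readability.
dt3 <- rbind(dt1, changed)
setorder(dt3, Product, ReviewDate)
print(dt3)
```

Here product B is unchanged, so only the new A and C rows are appended. A brand-new Product in dt2 would also pass the anti-join, because it has no match in dt1.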

RODBC: merge tables from different databases (channel)

做~自己de王妃 submitted on 2019-12-01 18:08:40
I'm using the RODBC package to connect to Oracle databases from R, but I have not managed to merge tables from different databases without first "downloading" them (which I want to avoid, as they are too big). I'd like to use something like:

DBa=odbcConnect(dsn="DatabaseA",uid="uid",pwd="pwd",readOnly="True")
DBb=odbcConnect(dsn="DatabaseB",uid="uid",pwd="pwd",readOnly="True")
sqldf("select a.year, sum(b.var) as sumVar from sqlFetch(DBa,'tableA') a sqlFetch(DBb,'tableB') b where a.ID=b.ID group by a.year")

If someone has an idea, it would be really helpful! Many thanks in advance. Lionel

sqldf: query data by range of dates

倾然丶 夕夏残阳落幕 submitted on 2019-12-01 16:53:14
I am reading from a huge text file that uses the '%d/%m/%Y' date format. I want to use read.csv.sql from sqldf to read the data and filter it by date at the same time, which saves memory and run time by skipping the many dates I am not interested in. I know how to do this with dplyr and lubridate, but I want to try sqldf for the reason above. Even though I am quite familiar with SQL syntax, it still trips me up most of the time, and sqldf is no exception. Running a command like the following returned a data.frame with 0 rows:

first_date <- "2001-11-1"
second_date <- "2003
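SQLite (the default sqldf backend) has no native date type, so a 'dd/mm/YYYY' string cannot be compared directly with a 'YYYY-mm-dd' string. One possible workaround, sketched here under assumptions, is to rebuild an ISO-ordered string with substr() inside the query so that ordinary string comparison matches date order. The file contents, the column names day and value, and the upper bound '2003-11-01' are all hypothetical (the original bounds are cut off), and the approach assumes zero-padded days and months on both sides of the comparison.

```r
library(sqldf)

# Hypothetical sample file with zero-padded dd/mm/YYYY dates.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "day,value",
  "15/10/2001,1",
  "01/11/2001,2",
  "20/06/2002,3",
  "05/01/2004,4"
), csv_path)

# Rearrange dd/mm/YYYY into YYYY-mm-dd inside SQLite; the table is
# referred to as "file" by read.csv.sql.
query <- "
  select *
  from file
  where substr(day, 7, 4) || '-' || substr(day, 4, 2) || '-' || substr(day, 1, 2)
        between '2001-11-01' and '2003-11-01'"

# Only the matching rows are returned to R.
filtered <- read.csv.sql(csv_path, sql = query, header = TRUE, sep = ",")
print(filtered)
```

Note that a bound such as "2001-11-1" is not zero-padded, so it would not sort correctly as a string; padding it to "2001-11-01" avoids that.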

Pass R variable to a sql statement

笑着哭i submitted on 2019-12-01 12:34:17
Is there any way to pass a variable defined in R to the SQL statement within the sqldf package? I have to run the code below, and I passed the variable v to the SQL select statement as '$v':

for (i in 1:50){
  v <- i + 450
  temp <- sqldf("select count(V1) from file_new where V1='$v' ")
}

Although it runs, it returns the wrong result (it should be 1000, but this code returns 0). Hence, I think the variable's value is not being passed.

If v is an integer then you don't want to enclose $v in single quotes, since that makes it a string value. Try it without the single quotes:

temp <- fn$sqldf("select count(V1
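A minimal sketch of both ways to get an R value into the query: the fn$ prefix (from gsubfn, which sqldf loads) interpolates $v into the string, while sprintf() builds the string explicitly. The data frame below is a small hypothetical stand-in for file_new, and the alias n is added only to make the result easier to read.

```r
library(sqldf)

# Hypothetical stand-in for file_new: 451 appears three times, 452 twice.
file_new <- data.frame(V1 = rep(c(451L, 452L), times = c(3, 2)))
v <- 451L

# Option 1: fn$ substitutes the value of v for $v before sqldf runs.
# No quotes around $v, so SQLite compares numbers, not strings.
res_fn <- fn$sqldf("select count(V1) as n from file_new where V1 = $v")

# Option 2: build the SQL text with sprintf and pass it to plain sqldf.
query <- sprintf("select count(V1) as n from file_new where V1 = %d", v)
res_sprintf <- sqldf(query)

print(res_fn)
print(res_sprintf)
```

Without the fn$ prefix, sqldf sends the literal text '$v' to the database, which is why the original loop counted zero matches.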

Error: No Such Column using SQLDF

我只是一个虾纸丫 submitted on 2019-11-30 23:19:40
Below are the scripts:

> library(sqldf)
> turnover = read.csv("turnover.csv")
> names(turnover)
 [1] "Report.Date" "PersID" "Status" "DOB"
 [5] "Age" "Tenure" "Current.Hire.Date" "Term.Date"
 [9] "Gender" "Function" "Grade" "Job.Category"
[13] "City" "State" "Retiree" "Race"
> turnover_hiredate = sqldf("select Status, Current.Hire.Date from turnover")

I get an error message: no such column: Current.Hire.Date. But this variable is listed as the 7th variable. What did I do wrong?

sqldf(...) does not like . (period) in column names, so you need to change it to something else. Try this:

library(sqldf)
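In SQL a period separates a table name from a column name, so Current.Hire.Date is not read as a single column. The answer's code is cut off above, so the following is only a sketch of one way to "change it to something else": replace the periods with underscores before querying. The two-row data frame is a hypothetical stand-in for turnover.csv.

```r
library(sqldf)

# Hypothetical stand-in for the turnover data.
turnover <- data.frame(
  Status            = c("Active", "Terminated"),
  Current.Hire.Date = c("2010-01-15", "2012-06-01"),
  stringsAsFactors  = FALSE
)

# Replace every literal period in the column names with an underscore.
names(turnover) <- gsub(".", "_", names(turnover), fixed = TRUE)

# The renamed column can now be referenced without ambiguity.
turnover_hiredate <- sqldf("select Status, Current_Hire_Date from turnover")
print(turnover_hiredate)
```

Depending on the sqldf version, quoting the dotted name inside the SQL may also work, but renaming avoids relying on version-specific behaviour.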

How to calculate number of occurrences per minute for a large dataset

萝らか妹 submitted on 2019-11-30 21:40:20
I have a dataset with 500k appointments lasting between 5 and 60 minutes. tdata <- structure(list(Start = structure(c(1325493000, 1325493600, 1325494200, 1325494800, 1325494800, 1325495400, 1325495400, 1325496000, 1325496000, 1325496600, 1325496600, 1325497500, 1325497500, 1325498100, 1325498100, 1325498400, 1325498700, 1325498700, 1325499000, 1325499300), class = c("POSIXct", "POSIXt"), tzone = "GMT"), End = structure(c(1325493600, 1325494200, 1325494500, 1325495400, 1325495400, 1325496000, 1325496000, 1325496600, 1325496600, 1325496900, 1325496900, 1325498100, 1325498100, 1325498400,
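For a dataset of this size, one option that avoids comparing every appointment with every minute is a difference array: add 1 at the minute an appointment starts, subtract 1 at the minute it ends, and take the cumulative sum. This is a base R sketch of that technique, not necessarily what the answers to this question used. It uses the first five Start/End values from the question and assumes appointments begin and end on whole minutes; the names t0, delta and per_minute are illustrative.

```r
# First five appointments from the question's data (epoch seconds, GMT).
tdata <- data.frame(
  Start = structure(c(1325493000, 1325493600, 1325494200, 1325494800, 1325494800),
                    class = c("POSIXct", "POSIXt"), tzone = "GMT"),
  End   = structure(c(1325493600, 1325494200, 1325494500, 1325495400, 1325495400),
                    class = c("POSIXct", "POSIXt"), tzone = "GMT")
)

t0 <- min(tdata$Start)

# Convert a timestamp to a 1-based minute index relative to t0.
minute_index <- function(x) {
  as.integer((as.numeric(x) - as.numeric(t0)) %/% 60) + 1L
}

n_min <- minute_index(max(tdata$End)) - 1L   # number of minutes covered
s <- minute_index(tdata$Start)               # minute each appointment starts
e <- minute_index(tdata$End)                 # first minute it is no longer active

# +1 where appointments start, -1 where they end, then accumulate.
delta  <- tabulate(s, nbins = n_min + 1L) - tabulate(e, nbins = n_min + 1L)
active <- cumsum(delta)[seq_len(n_min)]

per_minute <- data.frame(
  Minute = t0 + 60 * (seq_len(n_min) - 1L),
  Active = active
)
print(head(per_minute))
```

The work grows with the number of appointments plus the number of minutes, rather than their product, which matters at 500k rows.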

Error: No Such Column using SQLDF

南笙酒味 submitted on 2019-11-30 19:29:56
Question: Below are the scripts:

> library(sqldf)
> turnover = read.csv("turnover.csv")
> names(turnover)
 [1] "Report.Date" "PersID" "Status" "DOB"
 [5] "Age" "Tenure" "Current.Hire.Date" "Term.Date"
 [9] "Gender" "Function" "Grade" "Job.Category"
[13] "City" "State" "Retiree" "Race"
> turnover_hiredate = sqldf("select Status, Current.Hire.Date from turnover")

I get an error message: no such column: Current.Hire.Date. But this variable is listed as the 7th variable. What did I do wrong?

Answer 1: sqldf(...) does

Using sqldf and RPostgreSQL together

不打扰是莪最后的温柔 submitted on 2019-11-30 08:33:20
When using RPostgreSQL I find that I cannot use sqldf in the same way. For example, if I load the library and read data into a data frame using the following code:

library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host="localhost", user="postgres", password="xxx", dbname="yyy", port="5436")
rs <- dbSendQuery(con, "select * from table")
df <- fetch(rs, n = -1)
dbClearResult(rs)
dbDisconnect(con)

I now have the contents of this table in the data frame df. However, if I want to run a SQL command using sqldf, I would previously do something like this:

sqldf("SELECT * FROM
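The rest of the question is cut off, so the exact failure is unknown. A common cause of this symptom is that sqldf chooses its backend from the database packages that are loaded: once RPostgreSQL is attached, sqldf tries to use PostgreSQL instead of its default SQLite. Assuming that is what happened here, the sketch below forces the SQLite backend, either globally through the sqldf.driver option or per call through the drv argument. The data frame is a hypothetical stand-in for the fetched table.

```r
library(sqldf)

# Hypothetical stand-in for the data fetched from PostgreSQL.
df <- data.frame(id = 1:3, label = c("a", "b", "c"), stringsAsFactors = FALSE)

# Option 1: tell sqldf to use SQLite for every subsequent call.
options(sqldf.driver = "SQLite")
res_global <- sqldf("select * from df where id > 1")

# Option 2: choose the driver for a single call only.
res_single <- sqldf("select * from df where id > 1", drv = "SQLite")

print(res_global)
```

Setting the option once near the top of a script keeps later sqldf calls unchanged.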

How to calculate number of occurrences per minute for a large dataset

一世执手 submitted on 2019-11-30 05:42:34
Question: I have a dataset with 500k appointments lasting between 5 and 60 minutes. tdata <- structure(list(Start = structure(c(1325493000, 1325493600, 1325494200, 1325494800, 1325494800, 1325495400, 1325495400, 1325496000, 1325496000, 1325496600, 1325496600, 1325497500, 1325497500, 1325498100, 1325498100, 1325498400, 1325498700, 1325498700, 1325499000, 1325499300), class = c("POSIXct", "POSIXt"), tzone = "GMT"), End = structure(c(1325493600, 1325494200, 1325494500, 1325495400, 1325495400, 1325496000,

R: Date function in sqldf giving unusual answer (wrong date format?)

吃可爱长大的小学妹 submitted on 2019-11-29 17:11:36
I am trying to add to a date using sqldf. I know it should be simple, but I can't figure out what is wrong with my date format. Using:

sqldf("select date(model_date, '+1 day') from lapse_test")

gives answers like '-4666-01-23'. The model_date values are in the Date format and look like 2015-01-01. I previously created them from a character string ('12/1/2015') using

lapse_test$model_date <- as.Date(lapse_test$date1,format = "%m/%d/%Y")

or

lapse_test$model_date <- as.POSIXct(lapse_test$date1,format = "%m/%d/%Y")

I'm guessing this is the problem? Any ideas?

Passing a character variable to the date()
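The odd year has a concrete explanation: an R Date is stored as the number of days since 1970-01-01 (about 16770 for late 2015), and SQLite's date() treats a bare number as a Julian day number, which lands around the year -4666. The answer above is cut off, so the following is only a sketch of one way to pass a character value instead: format the date as a 'YYYY-mm-dd' string before the query. The column name model_date_chr and the alias next_day are illustrative.

```r
library(sqldf)

# Sample row using the date string from the question.
lapse_test <- data.frame(date1 = "12/1/2015", stringsAsFactors = FALSE)
lapse_test$model_date <- as.Date(lapse_test$date1, format = "%m/%d/%Y")

# Store an ISO-formatted character copy; SQLite's date() understands it.
lapse_test$model_date_chr <- format(lapse_test$model_date, "%Y-%m-%d")

res <- sqldf("select date(model_date_chr, '+1 day') as next_day from lapse_test")
print(res)

# Convert back to an R Date if further date arithmetic is needed.
res$next_day <- as.Date(res$next_day)
```

For simple offsets like this one, doing the arithmetic in R (model_date + 1) sidesteps the conversion entirely.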