dbplyr

How to use EXTRACT through dbplyr when connecting to an Oracle DB

扶醉桌前, submitted on 2019-12-23 20:37:51
Question: Take this query (a simplified example from the official Oracle documentation):

    SELECT EXTRACT(month FROM order_date) "Month" FROM orders

How would you integrate such an EXTRACT operation into a dbplyr chain? I'm open to any other workaround (even an ugly or costly one) to extract the month on the server side.

Answer 1: More elegant:

    tbl(con, "orders") %>% mutate(Month = extract(NULL %month from% order_date))

This results in the following SQL (ANSI SQL):

    EXTRACT( MONTH FROM "order_date")

This trick works because the
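A different workaround from the one quoted in the answer is to hand the raw SQL fragment to dbplyr with `sql()`, which is passed through untranslated. The sketch below is only an illustration: it assumes a stand-in `orders` table built with `lazy_frame()` and `simulate_oracle()`, so no live Oracle connection is needed, and it only renders the SQL rather than running it.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the remote `orders` table. simulate_oracle()
# lets dbplyr generate Oracle-flavoured SQL without a real connection.
orders <- lazy_frame(order_date = Sys.Date(), con = simulate_oracle())

# sql() marks the string as literal SQL, so dbplyr inserts it unchanged.
query <- orders %>%
  mutate(Month = sql("EXTRACT(MONTH FROM order_date)")) %>%
  sql_render()

print(query)
```

With a real connection the same `mutate()` call would sit after `tbl(con, "orders")`. Because the fragment is not translated, any identifier quoting the database requires has to be written by hand inside the string.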

How to pass data.frame into SQL “IN” condition using R?

霸气de小男生, submitted on 2019-12-13 03:49:26
Question: I am reading a list of values from a CSV file in R and trying to pass the values into the IN condition of a SQL query (dbGetQuery). Can someone help me out with this?

    library(rJava)
    library(RJDBC)
    library(dbplyr)
    library(tibble)
    library(DBI)
    library(RODBC)
    library(data.table)
    jdbcDriver <- JDBC("oracle.jdbc.OracleDriver", classPath = "C://Users/********/Oracle_JDBC/ojdbc6.jar")
    jdbcConnection <- dbConnect(jdbcDriver, "jdbc:oracle:thin:Rahul@//Host/DB", "User_name", "Password")
    ## Setting working directory for
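One way this could be approached is to quote each value with `DBI::dbQuoteLiteral()` and paste the results into the IN list. This is a minimal sketch, not the asker's setup: it assumes an in-memory SQLite database instead of the Oracle JDBC connection, and the table `orders`, the column `id`, and the data frame `ids` (standing in for the values read from the CSV file) are all hypothetical.

```r
library(DBI)

# Assumption: in-memory SQLite replaces the Oracle connection from the question.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "orders", data.frame(id = 1:5, item = letters[1:5]))

# Hypothetical stand-in for the values read from the CSV file.
ids <- data.frame(id = c(2L, 4L))

# Quote every value for this connection, then collapse into "2, 4".
in_list <- paste(dbQuoteLiteral(con, ids$id), collapse = ", ")
query <- paste0("SELECT * FROM orders WHERE id IN (", in_list, ")")

res <- dbGetQuery(con, query)
dbDisconnect(con)
```

Quoting through the connection rather than pasting raw strings keeps character values safely escaped. Very long value lists may hit database limits on the size of an IN clause.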

Apply a ranking window function in dbplyr backend

不羁的心, submitted on 2019-12-11 17:34:53
Question: I want to seamlessly identify new orders (acquisitions) and returns in my transactional database table. This sounds like the perfect job for a window function, and I would like to perform this operation in dbplyr. My current process is to:

1. Create a query object that I then pass to dbGetQuery(); this query contains a standard rank() window function as usually seen in PostgreSQL.
2. Ingest the result of this query into my R environment.
3. Then, using an ifelse() function inside the mutate() verb, I identify the first orders
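The three steps above can be sketched as a single dbplyr pipeline, since `min_rank()` inside a grouped `mutate()` is translated to a SQL `RANK()` window function. This is only an illustration under assumptions: the columns `customer_id` and `order_date` are hypothetical, the table is simulated with `lazy_frame()` and `simulate_postgres()`, and treating every order after the first as a "return" is a guess at the asker's intent.

```r
library(dplyr)
library(dbplyr)

# Hypothetical transactional table, simulated so no database is required.
orders <- lazy_frame(
  customer_id = 1L,
  order_date = Sys.Date(),
  con = simulate_postgres()
)

query <- orders %>%
  group_by(customer_id) %>%                      # becomes PARTITION BY
  mutate(order_rank = min_rank(order_date)) %>%  # becomes RANK() OVER (...)
  ungroup() %>%
  # Assumption: rank 1 is the acquisition, anything later is a return.
  mutate(order_type = ifelse(order_rank == 1L, "acquisition", "return")) %>%
  sql_render()

print(query)
```

On a real connection, replacing `sql_render()` with `collect()` would run the whole classification on the server and return only the result.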

Issue with dbplyr::spread() on tbl_sql

Deadly, submitted on 2019-12-08 07:20:40
Question: This issue is specific to the following dev version of dbplyr:

    devtools::install_github("tidyverse/dbplyr", ref = devtools::github_pull(72))

developed by @edgararuiz. It seems to me that the spread function doesn't work properly...

    df_sample <- tribble(~group1, ~group2, ~group3, ~identifier, ~value,
                         8, 24, 6, 'mt_0', 12,
                         18, 24, 6, 'mt_1', 4)
    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
    df_db <- copy_to(con, df_sample, 'df_sample')

I obtained an incorrect result with the following

Joining across databases with dbplyr

不羁岁月, submitted on 2019-12-08 04:25:58
Question: I am working with database tables with dbplyr. I have a local table and want to join it with a large (150m rows) table on the database. The database PRODUCTION is read-only.

    # Set up the connection and point to the table
    library(odbc); library(dbplyr)
    my_conn_string <- paste("Driver={Teradata};DBCName=teradata2690;DATABASE=PRODUCTION;UID=",
                            t2690_username, ";PWD=", t2690_password, sep = "")
    t2690 <- dbConnect(odbc::odbc(), .connection_string = my_conn_string)
    order_line <- tbl(t2690, "order_line")
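When nothing can be written to the database, one possible workaround is to push the local keys into the query as an IN filter, collect the reduced result, and finish the join locally. The sketch below is hedged accordingly: it uses in-memory SQLite instead of Teradata, and the key column `order_id`, the column `qty`, and the data frame `local_tbl` are hypothetical.

```r
library(dplyr)
library(dbplyr)

# Assumption: in-memory SQLite stands in for the read-only Teradata database.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(
  con, "order_line",
  data.frame(order_id = 1:6, qty = c(2, 1, 5, 3, 4, 1))
)
order_line <- tbl(con, "order_line")

# Hypothetical local table that should be joined to the remote one.
local_tbl <- data.frame(order_id = c(2L, 5L), note = c("a", "b"))

wanted_ids <- local_tbl$order_id
joined <- order_line %>%
  filter(order_id %in% !!wanted_ids) %>%  # translated to: order_id IN (2, 5)
  collect() %>%                           # only the matching rows are downloaded
  inner_join(local_tbl, by = "order_id")  # the join itself happens in R

DBI::dbDisconnect(con)
```

This only suits a modest number of keys. For a large local table, a temporary table in a schema with write permission would probably be needed instead.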

How to use a window function to determine when to perform different tasks?

蓝咒, submitted on 2019-12-06 09:44:57
Question: Note: I have asked a similar question for SQL: How to use a window function to determine when to perform different tasks in Hive or Postgres?

Data

I have some data showing the start day and end day for different pre-prioritised tasks per person:

    input_df <- data.frame(person = c(rep("Kate", 2), rep("Adam", 2), rep("Eve", 2), rep("Jason", 5)),
                           task_key = c(c("A","B"), c("A","B"), c("A","B"), c("A","B","C","D","E")),
                           start_day = c(c(1L,1L), c(1L,2L), c(2L,1L), c(1L,4L,3L,5L,4L)),
                           end_day = 5L)

How to spread tbl_dbi and tbl_sql data without downloading to local memory

北城余情, submitted on 2019-12-06 06:23:02
Question: I am working with large datasets, and tidyr's spread usually gives me error messages suggesting a failure to obtain memory to perform the operation. Therefore, I have been exploring dbplyr. However, as it says here, and as also shown below, dbplyr::spread() does not work. My question is whether there is another way to accomplish what tidyr::spread does while working with tbl_dbi and tbl_sql data without downloading to local memory. Using sample data from here, below I present what I get and
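One way to get a spread-like result on the server is conditional aggregation: group by the identifying columns and compute one aggregate per key value. The sketch below is not the accepted solution from the thread. It assumes the two-row sample data quoted in the "Issue with dbplyr::spread() on tbl_sql" entry above and an in-memory SQLite database, and it requires the key values (`mt_0`, `mt_1`) to be known in advance.

```r
library(dplyr)
library(dbplyr)

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Sample data as quoted in the related spread() question on this page.
df_sample <- tibble::tribble(
  ~group1, ~group2, ~group3, ~identifier, ~value,
  8, 24, 6, "mt_0", 12,
  18, 24, 6, "mt_1", 4
)
df_db <- copy_to(con, df_sample, "df_sample")

# One MAX(CASE WHEN ...) column per key value, evaluated in the database.
wide <- df_db %>%
  group_by(group1, group2, group3) %>%
  summarise(
    mt_0 = max(ifelse(identifier == "mt_0", value, NA), na.rm = TRUE),
    mt_1 = max(ifelse(identifier == "mt_1", value, NA), na.rm = TRUE)
  ) %>%
  ungroup()

# collect() is called here only to inspect the small result.
result <- wide %>% arrange(group1) %>% collect()
DBI::dbDisconnect(con)
```

Until `collect()` is called, `wide` stays a lazy query, so the reshaping itself does not pull the long table into local memory. Newer dbplyr releases also provide a `pivot_wider()` method for lazy tables, which may be a more direct option where available.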

Adding column to sqlite database

好久不见., submitted on 2019-12-06 04:39:09
Question: I am trying to add a vector that I generated in R to a SQLite table as a new column. For this I wanted to use dplyr (I installed the most recent dev version along with the dbplyr package, according to this post here). What I tried:

    library(dplyr)
    library(DBI)
    # creating initial database and table
    dbcon <- dbConnect(RSQLite::SQLite(), "cars.db")
    dbWriteTable(dbcon, name = "cars", value = cars)
    cars_tbl <- dplyr::tbl(dbcon, "cars")
    # new values which I want to add as a new column
    new_values <-
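A simple, if not the most efficient, way to do this with DBI alone is to read the table, attach the vector locally, and overwrite the table. This is a sketch under assumptions: it uses an in-memory database rather than `cars.db`, and both the contents of `new_values` and the column name `new_col` are hypothetical, since the original snippet is cut off.

```r
library(DBI)

# Assumption: in-memory SQLite instead of the "cars.db" file from the question.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, name = "cars", value = cars)

# Hypothetical new values, one per row of the table.
new_values <- seq_len(nrow(cars))

# Read the table, add the column in R, then replace the table in the database.
cars_local <- dbReadTable(con, "cars")
cars_local$new_col <- new_values
dbWriteTable(con, "cars", cars_local, overwrite = TRUE)

fields <- dbListFields(con, "cars")
dbDisconnect(con)
```

This round trip downloads the whole table, so it only suits tables that fit in memory and relies on the row order being preserved between the read and the write.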

How to use a window function to determine when to perform different tasks?

两盒软妹~`, submitted on 2019-12-04 18:02:22
Note: I have asked a similar question for SQL: How to use a window function to determine when to perform different tasks in Hive or Postgres?

Data

I have some data showing the start day and end day for different pre-prioritised tasks per person:

    input_df <- data.frame(person = c(rep("Kate", 2), rep("Adam", 2), rep("Eve", 2), rep("Jason", 5)),
                           task_key = c(c("A","B"), c("A","B"), c("A","B"), c("A","B","C","D","E")),
                           start_day = c(c(1L,1L), c(1L,2L), c(2L,1L), c(1L,4L,3L,5L,4L)),
                           end_day = 5L)

      person task_key start_day end_day
    1   Kate        A         1       5
    2   Kate        B         1       5
    3   Adam        A         1       5
    4   Adam        B         2       5
    5    Eve        A         2       5
    6

How to spread tbl_dbi and tbl_sql data without downloading to local memory

不想你离开。, submitted on 2019-12-04 12:59:36
I am working with large datasets, and tidyr's spread usually gives me error messages suggesting a failure to obtain memory to perform the operation. Therefore, I have been exploring dbplyr. However, as it says here, and as also shown below, dbplyr::spread() does not work. My question is whether there is another way to accomplish what tidyr::spread does while working with tbl_dbi and tbl_sql data without downloading to local memory. Using sample data from here, below I present what I get and what I would like to do and get.

    # sample tbl_dbi and tbl_sql data
    df_sample <- tribble(~group1,