R: Best Practices - dplyr and odbc multi table actions (retrieved from SQL)

独自空忆成欢 提交于 2020-05-13 06:59:45

问题


Say you have your tables stores in an SQL server DB, and you want to perform multi table actions, i.e. join several tables from that same database.

Following code can interact and receive data from SQL server:

library(dplyr)
library(odbc)
con <- dbConnect(odbc::odbc(),
                 .connection_string = "Driver={SQL Server};Server=.;Database=My_DB;")

Table1 <- tbl(con, "Table1")
Table1 # View glimpse of Table1

Table2 <- tbl(con, "Table2")
Table2 # View glimpse of Table2

Table3 <- tbl(con, "Table3")

However, with a few results retrieved with the same connection, eventually following error occurs:

Error: [Microsoft][ODBC SQL Server Driver]Connection is busy with results for another hstmt

My current googling skills have taking me to the answer that the backend does not support multiple active result sets (MARS) - I guess more than 2 active result sets is the maximum? (backend is DBI and odbc)

So, my question is: what is best practice if I want to collect data from several tables from an SQL DB?

  • Open a connection for each table?

  • Actively close the connection and open it again for the next table?

  • Does the backend support MARS to be parsed to the connection string?


回答1:


To make a connection that can hold multiple result sets, I've had luck with following connection code:

con <- DBI::dbConnect(odbc::odbc(),
                      Driver = "SQL Server Native Client 11.0",
                      Server = "my_host",
                      UID = rstudioapi::askForPassword("Database UID"),
                      PWD = rstudioapi::askForPassword("Database PWD"),
                      Port = 1433,
                      MultipleActiveResultSets = "True",
                      Database = my_db)

On top of that, I found that the new pool-package can do the job:

pool <- dbPool(odbc::odbc(),
                      Driver = "SQL Server Native Client 11.0",
                      Server = "my_host",
                      UID = rstudioapi::askForPassword("Database UID"),
                      PWD = rstudioapi::askForPassword("Database PWD"),
                      Port = 1433,
                      MultipleActiveResultSets = "True",
                      Database = my_db)

It is quicker and more stable than the DBI connection, however, one minor drawback is that the database doesn't pop up in the connection tab for easy reference.

For both methods, remember to close the connection/pool when done. For the DBI-method its:

dbDisconnect(con)

Whereas the pool-method is closed by calling:

poolClose(pool)


来源:https://stackoverflow.com/questions/48417212/r-best-practices-dplyr-and-odbc-multi-table-actions-retrieved-from-sql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!