sqldf

sqldf, csv, and fields containing commas

不问归期 submitted on 2019-12-30 10:33:27
Question: It took me a while to figure this out, so I am answering my own question. You have a .csv file, you want to load it fast, and you want to use the sqldf package. Your usual code is tripped up by a few annoying fields. Example:

    1001, Amy,9:43:00, 99.2
    1002,"Ben,Jr",9:43:00, 99.2
    1003,"Ben,Sr",9:44:00, 99.3

This code only works on *nix systems.

    library(sqldf)
    system("touch temp.csv")
    system("echo '1001, Amy,9:43:00, 99.2\n1002,\"Ben,Jr\",9:43:00, 99.2\n1003,\"Ben,Sr\",9:44:00, 99.3' > temp.csv")

If try

Using read.csv.sql to select multiple values from a single column

你离开我真会死。 submitted on 2019-12-29 07:17:14
Question: I am using read.csv.sql from the sqldf package to read in a subset of rows, where the subset selects on multiple values that are stored in another vector. I have hacked my way to a form that works, but I would like to see the correct way to pass the SQL statement. The code below gives a minimal example.

    library(sqldf)
    # some data
    write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)
    # values to select from variable 'carb'
    cc <- c(1, 2)
    # This only selects last value

Regarding sqldf package/regexp function [duplicate]

岁酱吖の submitted on 2019-12-29 02:07:12
Question: This question already has answers here: "How do I use regex in a SQLite query?" (16 answers). Closed 3 years ago.

I am using the sqldf package and SQL to analyze a table generated by a classification model. But when I use the code:

    table <- sqldf("
      SELECT a, b, c, d, e, f,
        CASE WHEN (REGEXP_LIKE(t, '\b(2nd time|3rd time|4th time)\b')) = TRUE THEN 1 ELSE 0 END AS UPSET_NOT_LIKE,
        regexp_extract(t, '\b(2nd time|3rd time|4th time)\b')) as Word
      FROM cls
    ")

It looks like the sqldf package doesn't have

Can I use two character vectors in a sqldf join statement?

家住魔仙堡 submitted on 2019-12-25 07:59:30
Question: I am conducting a sqldf join of 3 different data.tables. My current working code looks like this:

    AltSuitRaw <- data.table(sqldf('select RealAlt.*,
      SpdSpSuit * SpdSpT as SpdSpSuitT,
      SpdIncSuit * SpdIncT as SpdIncSuitT,
      SpdGrowSuit * SpdGrT as SpdGrowSuitT,
      RzbSpSuit * RzbSpT as RzbSpSuitT,
      RzbIncSuit * RzbIncT as RzbIncSuitT,
      RzbGrowSuit * RzbGrT as RzbGrowSuitT,
      FMSSpSuit * FmsSpT as FmsSpSuitT,
      FMSIncSuit * FmsIncT as FmsIncSuitT,
      FMSGrowSuit * FMSGrT as FmsGrowSuitT,
      BhsSpSuit * BhsSpT as

How do you explicitly delete a SQLite database created with the sqldf library in an R script

故事扮演 submitted on 2019-12-25 05:59:29
Question: I have created an R function to perform subsetting, summaries, densities, and plotting. I was initially assigning the subsets to my workspace in RStudio, but I started running into memory constraints. The latest revision attempts to store the summarized observation counts in a SQLite database instead of exporting the subsets as their own data frames. The theory was that this would use less memory. To do this, I created a new database in my function as in: sqldf(

R, issue with sqldf: cannot make condition on date

隐身守侯 submitted on 2019-12-25 04:48:08
Question: I have an R data frame with a field named date (of type Date). I want to query this data frame using the sqldf library, but the where condition doesn't seem to work on the date field. The query I'm using is:

    sqldf("select * from elog where date >= '1997-01-01' limit 6")

It returns an empty data frame even though 'elog' has rows with 1997-01-01 as the date.

Answer 1: You could try the same command after loading library(RH2):

    library(RH2)
    library(sqldf)
    sqldf("select * from elog where date >= '1997-01-01' limit 6")
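A possible alternative that stays on sqldf's default SQLite backend, offered as a minimal sketch rather than the answerer's method: SQLite has no date type, so sqldf passes an R Date column to it as a number (days since 1970-01-01), and comparing that number with the string '1997-01-01' does not behave like a date comparison. Comparing against the numeric value of the cutoff works around this. The `elog` data frame below is a hypothetical stand-in for the asker's data, and the `cust` column is invented for illustration.

```r
library(sqldf)

# Hypothetical stand-in for the asker's `elog` data frame (assumption)
elog <- data.frame(
  cust = c(1, 2, 3),
  date = as.Date(c("1996-12-31", "1997-01-01", "1997-01-15"))
)

# Convert the cutoff to the number of days since 1970-01-01,
# which is how the Date column is represented inside SQLite
cutoff <- as.integer(as.Date("1997-01-01"))

# Build the query with the numeric cutoff instead of a quoted date string
res <- sqldf(sprintf("select * from elog where date >= %d limit 6", cutoff))
print(res)
```

With RH2 loaded, as in the answer above, sqldf uses the H2 database instead, which has a real date type, so the quoted date string can be compared directly.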

Dealing with commas in a CSV file in sqldf

人走茶凉 submitted on 2019-12-25 00:21:42
Question: I am following up on my earlier question, "sqldf returns zero observations", with a reproducible example. I found that the problem probably comes from the comma in one of the cells ("1,500+"), and I think I have to use a filter as suggested in "sqldf, csv, and fields containing commas", but I am not sure how to define my filter. Below is the code:

    library(sqldf)
    df <- data.frame("a" = c("8600000US01770" , "8600000US01937"), "b"= c("1,500+" , "-"), "c"= c("***" , "**"), "d"= c("(x)" , "(x)"), "e"=

Using the sqldf library from R to write a SELECT statement

心已入冬 submitted on 2019-12-24 18:41:43
Question: I have the data:

    library(earth)
    data(etitanic)

I also need to use the library:

    library(sqldf)

My goal is to write a SELECT statement that returns the survival rates by gender. My statement must include the etitanic data frame (treated like a database table). I do not know SQL very well, but from my understanding I have to write something like

    SELECT survival, gender FROM etitanic

I am not sure how to achieve this in R; any suggestions would be helpful. I tried the following: df = sqldf('select

How to optimise filtering and counting for every row in a large R data frame

家住魔仙堡 submitted on 2019-12-24 13:23:04
Question: I have a data frame, such as the following:

      name day wages
    1  Ann   1   100
    2  Ann   1   150
    3  Ann   2   200
    4  Ann   3   150
    5  Bob   1   100
    6  Bob   1   200
    7  Bob   1   150
    8  Bob   2   100

For every unique name/day pair, I would like to calculate a range of totals, such as 'number of times wages was greater than 175 on the current or next day for this person'. There are many more columns than wages, and there are four time slices to be applied to each total for each row. I can currently accomplish this by unique'ing my data frame: df