r | 易学教程

Can you use Athena ODBC/JDBC to return the S3 location of results?

Question: I've been using the metis package to run Athena queries from R. While this works well for small queries, there still does not seem to be a viable solution for queries that return very large datasets (tens of thousands of rows, for example). However, when running the same queries in the AWS console, it is fast and straightforward to use the download link to obtain the CSV file of the query result. This got me thinking: is there a mechanism for sending the query via R but returning/obtaining the S3:

Coloring the points by category in R

Question: I am creating a scatter plot in R using the following code: plot(df_prob1$x1, df_prob1$x2, pch = df_prob1$y) The resulting plot shows two categories, one represented by squares and the other by circles. I want these two categories to have different colors as well. I tried the following code: plot(df_prob1$x1, df_prob1$x2, pch = df_prob1$y, col = c("red", "blue")) However, this colors the points seemingly at random and does not take
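The behaviour described in the question is what recycling would produce: a two-element col vector is reused along the points in row order, so the colors alternate instead of following the category. A minimal sketch of one common fix is to index the palette by the category, giving one color per point. The data frame below is a hypothetical stand-in for df_prob1 (the original data are not shown), and it assumes y takes the values 0 and 1.

```r
# Hypothetical stand-in for df_prob1; y is assumed to hold the category (0 or 1).
df_prob1 <- data.frame(
  x1 = c(1.0, 2.0, 3.0, 4.0),
  x2 = c(2.5, 1.5, 3.5, 0.5),
  y  = c(0, 1, 1, 0)
)

# One color per category level, then one color per point:
# factor() maps the category values to the integer codes 1, 2, ...
palette_by_category <- c("red", "blue")
point_cols <- palette_by_category[as.integer(factor(df_prob1$y))]

# Draw only in an interactive session so the sketch also runs under Rscript.
if (interactive()) {
  plot(df_prob1$x1, df_prob1$x2, pch = df_prob1$y, col = point_cols)
}
```

Because point_cols has the same length as the data, each point gets the color of its own category regardless of row order.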

How to load a git branch from another R package

Question: In R, how do I load one package's git branch from another package? There are two packages, call them producer and consumer1. I am refactoring my code by moving a bunch of function definitions and tests from producer to consumer1. I'm creating git branches, rfctrProd and rfctrCons1, for producer and consumer1. In rfctrCons1, I need a statement doing something like #' @import producer, gitBranch = rfctrProd Also, I'll need to do similarly with other packages which import producer, to make sure I

R, inconsistent date format

Question: I have a date variable that originally comes from an Excel file, but its values are heterogeneous. Although every entry looks like yyyy/mm/dd in Excel, after reading it into R the variable looks like: person_1 39257 person_2 2015/2/20 person_3 NA How can I clean up the date variable so that every entry shows the yyyy/mm/dd format? Answer 1: One option uses anydate and excel_numeric_to_date: library(janitor) library(anytime) library(dplyr) coalesce( excel_numeric_to_date(as.numeric(dat$V2)), anydate(dat$V2)) #[1]
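For comparison, the same idea can be sketched in base R without extra packages. This is not the answer's method but a swapped-in alternative, and it rests on two assumptions: all-digit entries are Excel serial numbers from the Windows 1900 date system (origin 1899-12-30), and every other non-missing entry is text in year/month/day form. The function name clean_dates is hypothetical.

```r
# Hypothetical helper: convert a mixed column of Excel serials and
# "yyyy/mm/dd" strings to Date. Assumes the Windows Excel date system.
clean_dates <- function(x) {
  x <- as.character(x)
  out <- rep(as.Date(NA), length(x))

  # Entries made only of digits are treated as Excel serial numbers.
  is_serial <- !is.na(x) & grepl("^[0-9]+$", x)
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")

  # Everything else that is not missing is parsed as year/month/day text.
  is_text <- !is.na(x) & !is_serial
  out[is_text] <- as.Date(x[is_text], format = "%Y/%m/%d")

  out
}

# The three values shown in the question.
dates <- clean_dates(c("39257", "2015/2/20", NA))
formatted <- format(dates, "%Y/%m/%d")
```

The result stays a Date vector, so missing entries remain NA; format() is applied only at the end to get the yyyy/mm/dd display the question asks for.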

How do I create a function that defines a moving threshold along local maxima in R?

Question: The goal is to quantify a certain growth. The definition is as follows: every value in the sequence is compared to the preceding reference value; if it is greater, it is taken into account (returned), and if not, it is dropped. The greater value then becomes the new reference for the following ones: a threshold that moves with the ascending values. I've tried this: growthdata<-c(21679, 21722, 21788, 21863, 21833, 21818, 21809, 21834,
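The definition above amounts to keeping each value that exceeds the running maximum of everything before it. A minimal vectorised sketch with cummax follows. The function name keep_new_maxima is hypothetical, the sketch assumes the first value is always kept as the initial reference, and it uses only the part of growthdata visible in the excerpt.

```r
# Hypothetical helper: keep values that exceed every earlier value.
keep_new_maxima <- function(x) {
  # Highest value seen before each position; -Inf before the first one,
  # so the first value is always kept (an assumption of this sketch).
  previous_max <- c(-Inf, head(cummax(x), -1))
  x[x > previous_max]
}

# Only the values visible in the question (the original vector is longer).
growthdata <- c(21679, 21722, 21788, 21863, 21833, 21818, 21809, 21834)
new_maxima <- keep_new_maxima(growthdata)
```

The comparison is strict, so a value equal to the current threshold is dropped; changing > to >= would keep ties.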

How to make a plot in R with multiple lines using ggplot

Question: I am trying to draw a graph in R with three lines using ggplot, but the third line does not appear in the graph. I used the following code: us_idlpnts <- subset(unvoting, CountryName == "United States of America") rus_idlpnts <- subset(unvoting, CountryName == "Russia") mdn_idl_pnt <- summarize(unvoting, PctAgreeUS = median(PctAgreeUS, na.rm=T), PctAgreeRUSSIA = median(PctAgreeRUSSIA, na.rm=T), idealpoint = median(idealpoint, na.rm=T), Year = median(Year, na.rm= T)) ggplot(NULL, aes(Year,

Regular Expression R: Select the above or below lines of a regexp selection while meeting another regexp criteria

Question: I am working with a text document similar to the examples below. File <- c("Location Name Code and Label Frequency Percentage", " During the past 30 days, on how many days did you carry a weapon", "44-44 Q13 such as a gun, knife, or club on school property?", " 1 0 days 1,610 94.5", " 2 1 day 71 4.3", " 3 2 or 3 days 6 0.4", " 4 4 or 5 days 3 0.2", " 5 6 or more days 12 0.7", " Missing 48", "45-45 Q14 During the past 12 months, on how many days did you carry a gun?", " 1 0 days 1,602 91.3", "

How to use a loop to delete all rows with negative values in R

Question: I am new to loops. I have an unwieldy data frame that I want to cut down so that only observations (rows) without negative numbers remain. Here is where I'm stuck: this produces NULL every time instead of a trimmed-down data frame. mydata=for (i in names(df)) { subset(df, df[[ paste(i)]]>=0) } Answer 1: How about a purely vectorised solution: DF[!rowSums(DF < 0), ] # ID Items Sequence #1 1 D 1 #2 1 A 2 #5 2 B 2 Data: DF=structure(list(ID = c(1, 1, 1, -1, 2), Items = c("D", "A", "A", "A", "B"
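As for why the loop in the question yields NULL: in R a for loop itself evaluates to NULL, and each subset() result inside it is discarded, so assigning the loop to mydata can only store NULL. If a loop is still wanted, one possible sketch is to accumulate a logical row mask across columns and subset once at the end. The data frame below is hypothetical (the answer's DF is cut off in the excerpt), and the sketch assumes the numeric columns contain no NA values.

```r
# Hypothetical data; the original data frame is not fully shown.
df <- data.frame(
  a     = c(1, -2, 3, 4),
  b     = c(5, 6, -7, 8),
  label = c("w", "x", "y", "z")
)

# Start by keeping every row, then rule rows out column by column.
keep <- rep(TRUE, nrow(df))
for (col in names(df)) {
  # Only numeric columns can hold negative numbers (assumes no NA values).
  if (is.numeric(df[[col]])) {
    keep <- keep & df[[col]] >= 0
  }
}

# Subset once, after the loop, and assign the result explicitly.
trimmed <- df[keep, , drop = FALSE]
```

Skipping non-numeric columns also avoids comparing text against 0, which matters when the data frame mixes numbers with labels.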