dplyr

mapping (ordered) factors to colors in ggplot

冷暖自知 submitted on 2021-02-20 04:04:12
Question: Consider this example:

    data_frame(mylabel = c('month 18', 'month 19', 'month 20', 'month 21', 'month 22'),
               value = c(5, 10, -2, 2, 0),
               time = c(1, 2, 3, 4, 5)) %>%
      ggplot(aes(x = time, y = value, color = mylabel)) +
      geom_point(size = 7)

Here you can see that the variable mylabel has a natural ordering: month 18 comes before month 19, etc. However, this natural ordering is not preserved by the colors chosen by ggplot. In my real dataset, I have about 50 different months and I would like to use a color
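The excerpt is cut off, so the following is only a sketch under assumptions. It assumes the goal is for the colors to follow the month order, converts mylabel to an ordered factor (ordered by time), and maps it to a sequential discrete palette. It uses tibble() in place of the deprecated data_frame(). The choice of the viridis scale is an assumption, not something stated in the question.

```r
library(dplyr)
library(ggplot2)

df <- tibble(
  mylabel = c("month 18", "month 19", "month 20", "month 21", "month 22"),
  value   = c(5, 10, -2, 2, 0),
  time    = c(1, 2, 3, 4, 5)
) %>%
  # Order the factor levels by time rather than alphabetically,
  # so that e.g. "month 9" would still sort before "month 10".
  mutate(mylabel = factor(mylabel,
                          levels  = unique(mylabel[order(time)]),
                          ordered = TRUE))

# A sequential discrete scale makes neighbouring months get neighbouring colors.
p <- ggplot(df, aes(x = time, y = value, color = mylabel)) +
  geom_point(size = 7) +
  scale_color_viridis_d()

print(p)
```

With around 50 levels, adjacent months become hard to tell apart in a discrete legend; mapping a numeric month index to a continuous scale may be worth considering instead.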

Split a dataframe into a list of nested data frames and matrices

ε祈祈猫儿з submitted on 2021-02-20 03:46:53
Question: I'd like to split the diamonds data frame into a list of 5 data frames, grouped by cut. This instruction got me started: https://dplyr.tidyverse.org/reference/group_split.html

    diamonds_g <- diamonds %>%
      group_split(cut) %>%
      setNames(unique(diamonds$cut))

My desired output is a list of 5 nested lists. Each nested list contains one data frame and one matrix, such that:

    View(diamonds_g[[1]])
    factors <- diamonds_g[[1]][2:4]
    mat <- diamonds_g[[1]][6:10]

So each of the nested lists (or each cut)
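A minimal sketch of one way to build such a nested list, assuming the column positions 2:4 and 6:10 from the question are the ones wanted. It uses base split() with lapply() rather than group_split(). This is a swapped-in approach, chosen because split() names the elements by the levels of cut. As far as I can tell, unique(diamonds$cut) returns values in order of appearance while group_split() orders groups by factor level, so the setNames() call in the question may attach the wrong names.

```r
library(ggplot2)  # provides the diamonds dataset

# split() returns one data frame per level of cut, named by that level.
diamonds_g <- lapply(
  split(diamonds, diamonds$cut),
  function(d) {
    list(
      factors = d[2:4],             # cut, color, clarity
      mat     = as.matrix(d[6:10])  # table, price, x, y, z
    )
  }
)

names(diamonds_g)
str(diamonds_g[["Fair"]], max.level = 1)
```

Each element can then be accessed by name, for example diamonds_g$Ideal$mat.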

R Error: First argument, `data`, must be a data frame or shared data

社会主义新天地 submitted on 2021-02-20 02:53:52
Question: I am using the R programming language. I am following this tutorial: https://plotly.com/r/dropdowns/ I tried to create my own data and run the same procedure:

    library(plotly)
    library(MASS)
    library(dplyr)

    # create data
    x <- sample(LETTERS[1:4], 731, replace = TRUE, prob = c(0.25, 0.25, 0.25, 0.25))
    y <- rnorm(731, 10, 10)
    z <- rnorm(731, 5, 5)
    date <- seq(as.Date("2014/1/1"), as.Date("2016/1/1"), by = "day")
    df <- data.frame(x, y, z, date)
    df$x <- as.factor(df$x)

    # create plot
    fig <- plot_ly(df, x

Using dplyr to create new dataframe depending on thresholds

两盒软妹~` submitted on 2021-02-19 08:57:20
Question:

       Groups Names COL1  COL2  COL3 COL4
    1  G1     SP1   1     0.400 0.500 Sequence1
    2  G1     SP1   1     0.004 0.005 Sequence2
    3  G1     SP1   0     0.004 0.005 Sequence3
    4  G1     SP2   0     0.400 0.005 Sequence123
    5  G1     SP2   0     0.004 0.500 Sequence14
    6  G1     SP3   0     0.005 0.006 Sequence15
    7  G1     SP5   1     0.400 0.006 Sequence16
    8  G1     SP6   1     0.008 0.002 Sequence20
    10 G2     Sp1   0     0.004 0.005 Sequence17
    11 G2     SP1   0     0.050 0.600 Sequence18
    12 G2     SP1   0     0.400 0.600 Sequence3
    13 G2     SP2   0     0.004 0.005 Sequence22
    14 G2     SP2   0     0.004 0.005 Sequence23
    15 G2     SP5   0     0.004 0

filter one dataframe via conditions in another

生来就可爱ヽ(ⅴ<●) submitted on 2021-02-19 08:55:24
Question: I want to recursively filter a dataframe, d, by an arbitrary number of conditions (represented as rows in another dataframe, z). I begin with a dataframe d:

    d <- data.frame(x = 1:10, y = letters[1:10])

The second dataframe, z, has columns x1 and x2, which are lower and upper limits to filter d$x. This dataframe z may grow to be an arbitrary number of rows long.

    z <- data.frame(x1 = c(1,3,8), x2 = c(1,4,10))

I want to return all rows of d for which d$x <= z$x1[i] and d$x >= z$x2[i] for all i
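The cut-off sentence leaves the exact condition unclear, so the following is only a sketch under a stated assumption. It assumes the intent is to keep the rows of d whose x lies inside at least one interval [x1, x2] given by a row of z. If the intended condition is the complement (rows outside every interval), negating keep would give that instead.

```r
d <- data.frame(x = 1:10, y = letters[1:10])
z <- data.frame(x1 = c(1, 3, 8), x2 = c(1, 4, 10))

# For each value of d$x, check whether it falls inside any interval in z.
# This works for any number of rows in z, with no recursion needed.
keep <- vapply(
  d$x,
  function(v) any(v >= z$x1 & v <= z$x2),
  logical(1)
)

result <- d[keep, ]
result
```

Because the check is vectorised over the rows of z, adding more intervals only means adding rows to z.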

Adding zero valued entries so that all groups have entries for the same items

早过忘川 submitted on 2021-02-19 06:47:07
Question: I'm trying to use rCharts to create a stacked bar chart across a number of recorded regions (stacking separate group values on top of each other). The data is in a format similar to the one below.

    Region | Group | Value
    ----------------------
    USA    | A     | 5
    USA    | B     | 3
    USA    | C     | 1
    UK     | A     | 4
    UK     | B     | 6
    France | C     | 3

Using the code below produces a grouped bar chart which works fine. However, the stacked button does nothing to change the plot.

    nPlot(Value ~ Region, group = 'Group', data = example_data,
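As the title suggests, one way to give every region an entry for every group is to fill in the missing Region/Group combinations with zeros before plotting. The sketch below uses tidyr::complete() for this. The data frame is rebuilt by hand from the table in the question, and whether this resolves the stacking behaviour in rCharts is an assumption, not something verified here.

```r
library(tidyr)

# Rebuilt by hand from the table in the question.
example_data <- data.frame(
  Region = c("USA", "USA", "USA", "UK", "UK", "France"),
  Group  = c("A", "B", "C", "A", "B", "C"),
  Value  = c(5, 3, 1, 4, 6, 3)
)

# complete() expands to all Region x Group combinations;
# combinations absent from the data get Value = 0.
filled <- complete(example_data, Region, Group, fill = list(Value = 0))
filled
```

The result has one row per Region/Group pair (3 regions times 3 groups), and it can be passed as the data argument in place of example_data.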

using `rlang` quasiquotation with `dplyr::_join` functions

梦想与她 submitted on 2021-02-19 05:45:26
Question: I am trying to write a custom function where I use rlang's quasiquotation. This function also internally uses dplyr's join functions. I have provided below a minimal working example that illustrates my problem.

    # needed libraries
    library(tidyverse)

    # function definition
    df_combiner <- function(data, x, group.by) {
      # check how many variables were entered for this grouping variable
      group.by <- as.list(rlang::quo_squash(rlang::enquo(group.by)))
      # based on number of arguments, select `group.by`