tidyr | 易学教程

ggplot2 legend for plot combining geom_bar and geom_point

Question: I am trying to make a bar plot showing the returns of various securities in a portfolio, and then superimpose points over the bars indicating exposure to those securities. However, the legend I get completely ignores the points and only draws a legend for the bars. To produce a data frame with a similar structure: out <- data.frame(security=c("A", "B", "C", "D", "A", "B", "C", "D"), avg_weight=c(0.1,0.2,0.3,0.4, 0.1, 0.2, 0.3, 0.4), return_type=c(rep("systematic",4), rep("idiosyncratic",4

Reshape messy longitudinal survey data containing multiple different variables, wide to long

Question: I hope that I'm not reinventing the wheel, and I do not think that the following can be answered using reshape. I have messy longitudinal survey data that I want to convert from wide to long format. By messy I mean: (1) I have a mixture of variable types (numeric, factor, logical); (2) not all variables have been collected at every timepoint. For example: data <- read.table(header=T, text=' id inlove.1 inlove.2 income.2 income.3 mood.1 mood.3 random 1 TRUE FALSE 87717.76 82281.25 happy happy filler 2
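
A possible approach, sketched under assumptions rather than taken from the original thread: if the column names follow a variable.timepoint pattern and tidyr 1.0.0 or later is available, pivot_longer() with the ".value" sentinel keeps each variable in its own column (so its type is preserved) and fills timepoints that were not collected with NA. Row 1 below comes from the excerpt; row 2 is invented for illustration because the original data is cut off.

```r
library(tidyr)

# Row 1 is taken from the question; row 2 is made-up illustrative data.
data <- data.frame(
  id       = c(1, 2),
  inlove.1 = c(TRUE, FALSE),
  inlove.2 = c(FALSE, TRUE),
  income.2 = c(87717.76, 50000),
  income.3 = c(82281.25, 52000),
  mood.1   = c("happy", "sad"),
  mood.3   = c("happy", "happy"),
  random   = c("filler", "filler")
)

# Split each name at the dot: the part before it becomes an output column
# (".value"), the part after it becomes the "time" column.
long <- pivot_longer(
  data,
  cols      = -c(id, random),
  names_to  = c(".value", "time"),
  names_sep = "\\."
)

print(long)
```

Each id ends up with one row per timepoint that appears in any column name, and variables not measured at that timepoint are NA.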

Joining the result of two statistical tables in one table in R

Question: Continuing from the earlier question "comparison Mann-Whitney test between groups", I decided to create a new topic. Rui Barradas's solution helped me calculate the Mann-Whitney test for groups 1-2 and 1-3: lst <- split(mydat, mydat$group) lapply(lst[-1], function(DF) wilcox.test(DF$var, lst[[1]]$var, exact = FALSE)) Now I want to get the descriptive statistics. I use the psych library: describeBy(mydat$var,mydat$group) and I get the following output: group: 1 vars n mean sd median trimmed mad min max range skew

Tidy up and reshape messy dataset (reshape/gather/unite function)?

Question: Following my earlier question, "R: reshape/gather function to create dataset ready for multilevel analysis", I discovered it is a bit more complicated. My dataset is actually 'messier' than I hoped. So here's the full story: I have a big dataset of 240 cases. Each row is a case (a breast cancer patient). Somewhere at the end of the dataset (say, from column 417 onwards) I have data from the patients' partners, who also filled in a questionnaire. In the beginning, there are demographic variables for both

Pivot wider produces nested object

Question: This is regarding the latest tidyr release (1.0.0). I am trying the pivot_wider and pivot_longer functions from library(tidyr). I was trying to get the normal iris dataset back when I run the code below, but instead I get a nested tibble of dimension 3 x 5. I'm not sure what's happening (I read https://tidyr.tidyverse.org/articles/pivot.html) and still don't know how to avoid this: library(tidyr) iris %>% pivot_longer(-Species,values_to = "count") %>% pivot_wider(names_from = name, values_from = count) Expected Output:
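
A likely explanation, offered as a sketch rather than a definitive answer: after pivot_longer(), nothing records which original row a value came from, so each Species/name pair maps to 50 values and pivot_wider() packs them into list-columns. Adding a row identifier before reshaping makes every cell unique. The helper column name `row` is an assumption chosen for illustration, and dplyr is used here only for convenience.

```r
library(dplyr)
library(tidyr)

roundtrip <- iris %>%
  mutate(row = row_number()) %>%                          # remember the source row
  pivot_longer(-c(Species, row), values_to = "count") %>%
  pivot_wider(names_from = name, values_from = count) %>%
  select(-row)                                            # drop the helper column

dim(roundtrip)
```

With the identifier in place, the result has one row per original observation and ordinary numeric columns instead of list-columns.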

from wide format to long format with results in multiple columns [duplicate]

Question: This question already has answers here: Combine Multiple Columns Into Tidy Data [duplicate] (3 answers); Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed 2 years ago. I have data that looks like the following data frame, but every combo has about ten fields, starting with name1, adress1, city1, etc.: id name1 adress1 name2 adress2 name3 adress3 1 1 John street a Burt street d chris street 1 2 2 Jack street b Ben street e connor
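
One way this kind of layout might be reshaped with tidyr 1.0.0 or later, shown as a sketch under assumptions: the column names have no separator between the field and its number, so names_pattern with two capture groups is used instead of names_sep. The toy data uses only the first two name/adress pairs visible in the excerpt, and the output column name `set` is hypothetical.

```r
library(tidyr)

# Toy data built from the first two name/adress pairs shown in the excerpt.
df <- data.frame(
  id      = c(1, 2),
  name1   = c("John", "Jack"),
  adress1 = c("street a", "street b"),
  name2   = c("Burt", "Ben"),
  adress2 = c("street d", "street e")
)

# Group 1 (letters) becomes the output column via ".value";
# group 2 (digits) becomes the "set" column.
long <- pivot_longer(
  df,
  cols          = -id,
  names_to      = c(".value", "set"),
  names_pattern = "([A-Za-z]+)(\\d+)"
)

print(long)
```

Further fields such as city1, city2 would fall into their own column under the same pattern, provided their names also end in the set number.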

tidyr::gather multiple columns of varying types

Question: My question is similar to this question. I'm trying to tidyr::gather multiple columns. However, the solution provided in the link is less than ideal because the attributes are generally not identical across all columns, and so they are dropped. Note, I know how to do this with base R, but I'm trying to learn how to do the equivalent operation with tidyr and/or dplyr. Below I've simulated some data (poorly, but quickly) that illustrates the situation I often find myself in (although I generally

Splitting rows with uneven string length into columns in R using tidyr [duplicate]

Question: This question already has answers here: Split data frame string column into multiple columns (14 answers). Closed 3 years ago. Edit: This was marked as a duplicate. It is not. The question here is not only about splitting a single column into multiple ones, as my separate code would have worked for that. The main point of my question is splitting the column when the row strings yield a varying number of output columns. I'm trying to turn this: data <- c("Place1-Place2-Place2-Place4-Place2-Place3-Place5

Preserve order of input variables and factor levels in summary table, using dplyr tidyr

Question: I love how easy dplyr and tidyr have made it to create a single summary table with multiple predictor and outcome variables. One thing that had me stumped was the final step of preserving/defining the order of the predictor variables, and their factor levels, in the output table. I've come up with a solution of sorts (below), which involves using mutate to manually make a factor variable that combines both the predictor and predictor value (e.g. "gender_female") with levels in the desired

Separate a column into 2 columns at the last underscore in R

Question: I have a data frame like this:
id <- c("1","2","3")
col <- c("CHB_len_SCM_max","CHB_brf_SCM_min","CHB_PROC_S_SV_mean")
df <- data.frame(id,col)
I want to create 2 columns by separating "col" into the measurement and the stat, where stat is the text after the last underscore (max, min, mean, etc.). My desired output is:
id  Measurement    stat
1   CHB_len_SCM    max
2   CHB_brf_SCM    min
3   CHB_PROC_S_SV  mean
I tried it this way, but the stat column is empty. I am not sure if I am pointing to the last underscore
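
A minimal sketch of one way to split at the last underscore, not necessarily the approach from the original thread: tidyr::separate() accepts a regular expression for sep, and a lookahead can restrict the match to an underscore that is followed only by non-underscore characters up to the end of the string. The data is the example from the question; the result name `out` is chosen here for illustration.

```r
library(tidyr)

id  <- c("1", "2", "3")
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")
df  <- data.frame(id, col)

# "_(?=[^_]+$)" matches only the final underscore in each string,
# so everything before it goes to Measurement and the rest to stat.
out <- separate(df, col, into = c("Measurement", "stat"), sep = "_(?=[^_]+$)")

print(out)
```

Because the lookahead itself consumes no characters, only the final underscore is removed and the earlier underscores stay inside Measurement.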