tidyr | 易学教程

tidyr: using mutate inside a function

Question: I'd like to use the mutate function from the tidyverse to create a new column based on an old column, using only a data frame and strings that represent column headers as inputs. I can get this to work without the tidyverse (see function f below), but I'd like to get it to work with the tidyverse (see function f.tidy below). Can someone please post a solution for adding this column using mutate called from inside a function? df <- data.frame('test' = 1:3, 'tcy' = 4:6) # test tcy # 1 4 #
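
The excerpt is cut off before f and f.tidy are shown, so the following is only a minimal sketch of one common pattern, not the asker's functions or an accepted answer. It assumes dplyr is installed; the helper name add_doubled, the column name test_doubled, and the doubling itself are hypothetical placeholders. The .data pronoun looks up the input column by its string name, and !! with := sets the output column name from a string.

```r
library(dplyr)

df <- data.frame(test = 1:3, tcy = 4:6)

# Hypothetical helper (not from the original question): `old_col` and
# `new_col` are column names passed as strings. Doubling the old column
# is a placeholder for whatever transformation is actually wanted.
add_doubled <- function(data, old_col, new_col) {
  data %>%
    mutate(!!new_col := .data[[old_col]] * 2)
}

result <- add_doubled(df, "test", "test_doubled")
print(result)
```

Because both names arrive as plain strings, the same helper can be called in a loop over column names without any quoting tricks at the call site.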

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and also the percent of total on top of each bar. I've calculated the counts and percents of total, but I can't figure out how to add the percents on top of the bars. I'm trying to use geom_text, but I can't get it to work. A minimal example: iris %>% group_by(Species) %>% summarize(count = n()) %>% mutate(percent = count/sum(count)) %>% ggplot(aes(x=Species, y=count)) + geom_bar(stat="identity") + geom_text(aes
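
The asker's geom_text call is truncated above, so this is a sketch of one way the label layer might be completed rather than the original code. It assumes dplyr, ggplot2, and scales are installed; the vjust value and the intermediate names counts and p are illustrative choices.

```r
library(dplyr)
library(ggplot2)

# Summarise first so the counts and percents can be inspected on their own.
counts <- iris %>%
  group_by(Species) %>%
  summarize(count = n()) %>%
  mutate(percent = count / sum(count))

# Bars show counts; geom_text formats `percent` as a label and a negative
# vjust nudges the text above the top of each bar.
p <- ggplot(counts, aes(x = Species, y = count)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = scales::percent(percent)), vjust = -0.5)

print(p)
```

The label inherits x and y from the plot-level aes, which is why only the label aesthetic needs to be supplied to geom_text.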

wide to long multiple columns issue

Question: I have something like this:

id  role1  Approved by Role1  role2   Approved by Role2
1   Amy    1/1/2019           David   4/4/2019
2   Bob    2/2/2019           Sara    5/5/2019
3   Adam   3/3/2019           Rachel  6/6/2019

I want something like this:

id  Name    Role   Approved
1   Amy     role1  1/1/2019
2   Bob     role1  2/2/2019
3   Adam    role1  3/3/2019
1   David   role2  4/4/2019
2   Sara    role2  5/5/2019
3   Rachel  role2  6/6/2019

I thought something like this would work: melt(df,id.vars= id, measure.vars= list(c("role1", "role2"),c("Approved by Role1", "Approved by Role2")
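
The melt call above is truncated, and no answer is shown in this excerpt. As a sketch of a different technique, the reshape can be done with tidyr's pivot_longer instead of melt, assuming dplyr and tidyr (version 1.0 or later) are installed. The intermediate names Name_role1, Approved_role1, and so on are hypothetical; they exist only so that each column name can be split at "_" into a value part and a role part.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data; check.names = FALSE keeps the spaces in names.
df <- data.frame(
  id = 1:3,
  role1 = c("Amy", "Bob", "Adam"),
  `Approved by Role1` = c("1/1/2019", "2/2/2019", "3/3/2019"),
  role2 = c("David", "Sara", "Rachel"),
  `Approved by Role2` = c("4/4/2019", "5/5/2019", "6/6/2019"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

long <- df %>%
  # Rename to a "<value>_<role>" scheme (names chosen for illustration).
  rename(
    Name_role1 = role1,
    Approved_role1 = `Approved by Role1`,
    Name_role2 = role2,
    Approved_role2 = `Approved by Role2`
  ) %>%
  # ".value" turns the first name part into output columns (Name, Approved);
  # the second part becomes the values of the Role column.
  pivot_longer(
    cols = -id,
    names_to = c(".value", "Role"),
    names_sep = "_"
  ) %>%
  arrange(Role, id) %>%
  select(id, Name, Role, Approved)

print(long)
```

The rename step carries the pairing information: once the names follow a regular pattern, pivot_longer can move both the name and the approval date for a role into the same row.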

R Question - Trying to use separate to split data with a non-constant delimiter

Question: One of the variables is participant age groups. An example of one of the records is shown below: 0::Adult 18+||1:: Adult 18+||2::Adult 18+||3::Child 0-11 How do you best split this out so that it gives Adult 18+ with a count of 3 and Child 0-11 with a count of 1? I tried using separate, but because the delimiter is not constant, it was omitting a lot of the records. Any suggestions would be helpful, thank you! As this is my first post, let me know if I need to add more information. Answer 1: Here is one
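
The answer in this excerpt is cut off, so the following is an independent sketch in base R rather than the answerer's approach, and it does not use separate. It splits on the literal "||", strips the numeric "n::" prefix, and trims the stray space that makes the delimiter look non-constant. The variable names are illustrative.

```r
record <- "0::Adult 18+||1:: Adult 18+||2::Adult 18+||3::Child 0-11"

# Split on the literal "||" (fixed = TRUE avoids regex escaping).
parts <- strsplit(record, "||", fixed = TRUE)[[1]]

# Drop the leading index and "::", then trim any surrounding whitespace.
groups <- trimws(sub("^[0-9]+::", "", parts))

# Tally how often each age group occurs in this record.
counts <- table(groups)
print(counts)
```

For a whole column of such records, the same three steps could be applied per element, for example with lapply over the column.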

R: Separating out a mixed data column, date above multiple times

Question: I have a data.frame in which one vector lists a date above a sequence of times, and I'd like to convert it into some kind of POSIX date-time field. For example: "7/16/2014", "5:06:59 PM", "11:51:26 AM", "7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM", "7/12/2014", "11:57:41 AM", "7/11/2014", "10:01:48 AM", "7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM" Conceptually, it seems the thing to do is to replicate this MIXED vector into a DATEONLY vector and a TIMEONLY
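
The idea of carrying each date down over the times beneath it can be sketched in base R as follows. This is an illustration under stated assumptions, not the asker's or an answerer's code: dates are recognised by containing "/", the time zone is set to UTC, and parsing "%p" assumes a locale that uses AM/PM markers. All variable names are hypothetical.

```r
mixed <- c("7/16/2014", "5:06:59 PM", "11:51:26 AM",
           "7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM",
           "7/12/2014", "11:57:41 AM",
           "7/11/2014", "10:01:48 AM",
           "7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM")

# Assumption: an entry is a date if it contains a slash.
is_date <- grepl("/", mixed, fixed = TRUE)

# cumsum(is_date) counts how many dates have been seen so far, so indexing
# the dates with it repeats each date over the entries that follow it.
date_for_each <- mixed[is_date][cumsum(is_date)]

# Keep only the time rows, paired with the date that applies to them.
date_only <- date_for_each[!is_date]
time_only <- mixed[!is_date]

# Assumption: UTC, and an English-style locale for the AM/PM marker.
stamps <- as.POSIXct(paste(date_only, time_only),
                     format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
print(stamps)
```

The cumsum index is what removes the need for an explicit loop: it maps every position in the mixed vector to the most recent date above it.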

importing data from MATLAB to R: nested structures into dataframes

Question: I have a MATLAB nested structure in a .mat file that I can read using readMat from the R package R.matlab. The output of readMat is a list. My question is whether there is any standard general approach that can be applied to convert this type of list into expanded data.frames. Example: MATLAB code to create the nested structure with fields: s(1).field1(1).subfield1 = rand(3,1) s(1).field1(2).subfield1 = rand(3,1) s(1).field1(1).subfield2 = rand(3,1) s(1).field1(2).subfield2 = rand(3,1) s(2)

Create multiple columns from a single column and clean up results

Question: I have a data frame like this: foo=data.frame(Point.Type = c("Zero Start","Zero Start", "Zero Start", "3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww","3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww","3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww","Zero Stop","Zero Start"), Point.Value = c(NA,NA,NA,rnorm(3),NA,NA)) I want to add three columns by splitting the first column on the separator _ and retaining only the numeric values obtained after the split. For those rows where the first column doesn't contain any _ , the

Manipulating variables to produce a new dataset in R

Question: I'm a relatively new R user and would really appreciate any help with my dataset. I have a dataset with 24 million rows and 3 variables: patient name, pharmacy name, and count of medications picked up from the pharmacy at that visit. Some patients appear in the dataset more than once (i.e., they have picked up medications from different pharmacies at different time points). The data frame looks like this: df <- data.frame(name = c("Tom", "Rob", "Tom", "Tom", "Amy"),