tidyr

tidyr: using mutate inside a function

落花浮王杯 submitted on 2020-01-23 06:47:05
Question: I'd like to use the mutate function from the tidyverse to create a new column based on an existing column, using only a data frame and strings that represent column headers as inputs. I can get this to work without the tidyverse (see function f below), but I'd like to get it to work with the tidyverse (see function f.tidy below). Can someone please post a solution for adding this column using mutate called from inside a function?

    df <- data.frame('test' = 1:3, 'tcy' = 4:6)
    # test tcy
    # 1 4
    #
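The asker's functions f and f.tidy are cut off in this excerpt, so the following is only a minimal sketch of one common pattern, not the original code or its accepted answer. It assumes a dplyr version that supports the `.data` pronoun and `!!name :=` unquoting; the helper name `add_doubled`, the new column name `test_x2`, and the doubling transformation are all hypothetical.

```r
library(dplyr)

df <- data.frame(test = 1:3, tcy = 4:6)

# Hypothetical helper: `old` and `new` are column names passed as strings.
# .data[[old]] looks the existing column up by name, and `!!new :=` lets
# the name of the created column come from a string as well.
add_doubled <- function(data, old, new) {
  data %>% mutate(!!new := .data[[old]] * 2)
}

res <- add_doubled(df, "test", "test_x2")
```

Calling `add_doubled(df, "tcy", "tcy_x2")` would apply the same transformation to the other column without changing the function body.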

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

佐手、 submitted on 2020-01-21 10:09:12
Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and the percent of total on top of each bar. I've calculated the counts and percents of total, but can't figure out how to add the percents on top of the bars. I'm trying to use geom_text, but can't get it to work. A minimal example:

    iris %>%
      group_by(Species) %>%
      summarize(count = n()) %>%
      mutate(percent = count/sum(count)) %>%
      ggplot(aes(x=Species, y=count)) +
      geom_bar(stat="identity") +
      geom_text(aes
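The geom_text call in the question is truncated, so what follows is a hedged sketch of one way to label the bars rather than a reconstruction of the asker's code. It assumes the label should be the precomputed `percent` column formatted with base `sprintf`, and it uses `geom_col()`, which is equivalent to `geom_bar(stat = "identity")`. The object names `counts` and `p` are introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# Same summary as in the question: one row per species with count and share.
counts <- iris %>%
  group_by(Species) %>%
  summarize(count = n()) %>%
  mutate(percent = count / sum(count))

# Bar height is the count; the text label is the share of the total.
# vjust = -0.5 nudges each label just above the top of its bar.
p <- ggplot(counts, aes(x = Species, y = count)) +
  geom_col() +
  geom_text(aes(label = sprintf("%.1f%%", 100 * percent)), vjust = -0.5)
```

Because the label aesthetic is mapped separately from `y`, the y axis keeps showing counts while the text shows percentages.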

wide to long multiple columns issue

佐手、 submitted on 2020-01-16 09:48:16
Question: I have something like this:

    id  role1  Approved by Role1  role2   Approved by Role2
    1   Amy    1/1/2019           David   4/4/2019
    2   Bob    2/2/2019           Sara    5/5/2019
    3   Adam   3/3/2019           Rachel  6/6/2019

I want something like this:

    id  Name    Role   Approved
    1   Amy     role1  1/1/2019
    2   Bob     role1  2/2/2019
    3   Adam    role1  3/3/2019
    1   David   role2  4/4/2019
    2   Sara    role2  5/5/2019
    3   Rachel  role2  6/6/2019

I thought something like this would work:

    melt(df,id.vars= id, measure.vars= list(c("role1", "role2"),c("Approved by Role1", "Approved by Role2")
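As a hedged alternative to the melt attempt, the reshape can be sketched with tidyr's `pivot_longer()` and its `.value` sentinel. This assumes tidyr 1.0.0 or later and is not taken from any answer to the question. The intermediate column names (`Name_role1`, `Approved_role1`, and so on) are invented here so that each name splits cleanly on `_`.

```r
library(dplyr)
library(tidyr)

# Example data from the question; check.names = FALSE keeps the spaces.
df <- data.frame(
  id = 1:3,
  role1 = c("Amy", "Bob", "Adam"),
  `Approved by Role1` = c("1/1/2019", "2/2/2019", "3/3/2019"),
  role2 = c("David", "Sara", "Rachel"),
  `Approved by Role2` = c("4/4/2019", "5/5/2019", "6/6/2019"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

long <- df %>%
  # Give the columns a regular <value>_<role> pattern (assumed names).
  rename(
    Name_role1 = role1,
    Name_role2 = role2,
    Approved_role1 = `Approved by Role1`,
    Approved_role2 = `Approved by Role2`
  ) %>%
  # ".value" turns the part before "_" into output columns (Name, Approved);
  # the part after "_" becomes the Role column.
  pivot_longer(-id, names_to = c(".value", "Role"), names_sep = "_") %>%
  arrange(Role, id) %>%
  select(id, Name, Role, Approved)
```

The final `arrange()` and `select()` only reorder rows and columns to match the layout requested in the question.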

R Question - Trying to use separate to split data with a non-constant delimiter

流过昼夜 submitted on 2020-01-16 09:02:28
Question: One of the variables is participant age groups. An example of one of the records is shown below:

    0::Adult 18+||1:: Adult 18+||2::Adult 18+||3::Child 0-11

How do you best split this out so that it gives Adult 18+ with a count of 3 and Child 0-11 with a count of 1? I tried using separate, but because the delimiter is not constant, it was omitting a lot of the records. Any suggestions would be helpful, thank you! As this is my first post, let me know if I need to add more information.

Answer 1: Here is one
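The answer text is cut off in this excerpt, so the sketch below is an independent illustration and not the posted answer. It assumes each entry is separated by `||` and prefixed with an index followed by `::`, as in the sample record. The column names `record` and `age_groups` are hypothetical.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  record = 1,
  age_groups = "0::Adult 18+||1:: Adult 18+||2::Adult 18+||3::Child 0-11",
  stringsAsFactors = FALSE
)

counts <- df %>%
  # One row per "index::label" entry; "||" must be escaped in the regex.
  separate_rows(age_groups, sep = "\\|\\|") %>%
  # Drop the leading "index::" and any stray whitespace around the label.
  mutate(age_group = trimws(sub("^[0-9]+::", "", age_groups))) %>%
  count(record, age_group)
```

Trimming whitespace matters here because the second entry in the sample (`1:: Adult 18+`) has a space after the `::` that would otherwise create a separate group.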

R: Separating out a mixed data column, date above multiple times

牧云@^-^@ submitted on 2020-01-16 08:54:32
Question: I have a data.frame in which one vector lists a date followed by a sequence of times for that date, and I'd like to convert it into some kind of POSIX date-time field. For example:

    "7/16/2014", "5:06:59 PM", "11:51:26 AM", "7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM", "7/12/2014", "11:57:41 AM", "7/11/2014", "10:01:48 AM", "7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM"

Conceptually, it seems what to do is to replicate this MIXED vector into a DATEONLY vector and a TIMEONLY
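The idea of splitting the mixed vector into a date part and a time part can be sketched as follows. This is a hedged illustration that assumes date entries are exactly those containing `/`, uses only the first five values from the question, and picks UTC as an arbitrary time zone. Parsing `%p` (AM/PM) depends on the session locale, so an English or C locale is assumed.

```r
library(tidyr)

df <- data.frame(
  mixed = c("7/16/2014", "5:06:59 PM", "11:51:26 AM", "7/13/2014", "3:53:16 PM"),
  stringsAsFactors = FALSE
)

# Assumption: an entry is a date if and only if it contains "/".
is_date <- grepl("/", df$mixed, fixed = TRUE)

# Keep the dates in their own column, then carry each date down
# over the time rows that follow it.
df$date <- ifelse(is_date, df$mixed, NA_character_)
df <- fill(df, date)

# Drop the date-only rows and combine date and time into POSIXct.
df <- df[!is_date, ]
df$datetime <- as.POSIXct(
  paste(df$date, df$mixed),
  format = "%m/%d/%Y %I:%M:%S %p",
  tz = "UTC"
)
```

`tidyr::fill()` fills downward by default, which matches the "date above its times" layout described in the question.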

importing data from MATLAB to R: nested structures into dataframes

坚强是说给别人听的谎言 submitted on 2020-01-15 07:49:08
Question: I have a MATLAB nested structure in a .mat file that I can read using readMat from the R package R.matlab. The output of readMat is a list. My question is whether there is a standard, general approach for converting this type of list into an expanded data.frame. Example: MATLAB code to create the nested structure with fields:

    s(1).field1(1).subfield1 = rand(3,1)
    s(1).field1(2).subfield1 = rand(3,1)
    s(1).field1(1).subfield2 = rand(3,1)
    s(1).field1(2).subfield2 = rand(3,1)
    s(2)

Create multiple columns from a single column and clean up results

老子叫甜甜 submitted on 2020-01-15 04:59:21
Question: I have a data frame like this:

    foo=data.frame(Point.Type = c("Zero Start","Zero Start", "Zero Start",
                                  "3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww",
                                  "3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww",
                                  "3000rpm_10%_13barG_Sdsdsa_1.0_ss_Pww",
                                  "Zero Stop","Zero Start"),
                   Point.Value = c(NA,NA,NA,rnorm(3),NA,NA))

I want to add three columns by splitting the first column on the separator _ and retaining only the numeric values obtained after the split. For those rows where the first column doesn't contain any _ , the

Manipulating variables to produce a new dataset in R

╄→尐↘猪︶ㄣ submitted on 2020-01-15 03:44:24
Question: I'm a relatively new R user and would really appreciate any help with my dataset. It has 24 million rows and 3 variables: patient name, pharmacy name, and count of medications picked up from the pharmacy at that visit. Some patients appear in the dataset more than once (i.e. they have picked up medications from different pharmacies at different time points). The data frame looks like this:

    df <- data.frame(name = c("Tom", "Rob", "Tom", "Tom", "Amy"),