plyr

How to create conditional dummies “before the event” with dplyr in R?

删除回忆录丶 submitted on 2019-12-10 10:55:33
Question: I'm trying to create a conditional dummy (X) with the rule: set X = 1 if Y = 1 in the last two years before an NA (only count once!). To give an example, this is a sample from my data:

    year  country  Y
    1990  Bahamas  1
    1991  Bahamas  NA
    1992  Bahamas  NA
    1993  Bahamas  0
    1994  Bahamas  1
    1995  Bahamas  1
    1996  Bahamas  NA
    1997  Bahamas  1
    1998  Bahamas  NA
    1999  Bahamas  1
    2000  Bahamas  NA
    2001  Bahamas  1
    2002  Bahamas  1
    2003  Bahamas  0
    2004  Bahamas  NA
    2005  Bahamas  0
    2006  Bahamas  0
    2007  Bahamas  1
    2008  Bahamas  NA
    2009  Bahamas  1

make a list of lm objects, retain their class

亡梦爱人 submitted on 2019-12-10 10:15:52
Question: Apologies for such a rudimentary question; I must be missing something obvious. I want to build a list of lm objects, which I'm then going to use in an llply call to perform mediation analysis on this list. But that is beside the point. First I just want to make a list of length m (where m is the number of model sets), and each element of that list will itself contain n lm objects. So in this simple example:

    d1 <- data.frame(x1 = runif(100, 0, 1), x2 = runif(100, 0, 1), x3 = runif(100, 0, 1), y1 = runif(100, 0,
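
The nested structure described in this question can be sketched in base R as follows. This is only an illustration under assumptions: the excerpt is cut off, so the y2 column, the outcomes and predictors vectors, and the one-predictor-per-model layout are all hypothetical. The point it shows is that building the list with lapply() (or list()) keeps each element's "lm" class, whereas c() on lm objects would flatten them into their components.

```r
set.seed(1)

# Hypothetical data: the question's data frame is truncated, so these
# columns (in particular y2) are assumptions made for illustration.
d1 <- data.frame(x1 = runif(100), x2 = runif(100), x3 = runif(100),
                 y1 = runif(100), y2 = runif(100))

outcomes   <- c("y1", "y2")        # m = 2 model sets (assumed)
predictors <- c("x1", "x2", "x3")  # n = 3 models per set (assumed)

# Outer lapply builds the m sets; inner lapply fits the n models.
# lapply returns a list, so every fitted model keeps its "lm" class.
models <- lapply(outcomes, function(y) {
  fits <- lapply(predictors, function(x) {
    lm(reformulate(x, response = y), data = d1)
  })
  setNames(fits, predictors)
})
names(models) <- outcomes

class(models$y1$x1)
```

Each inner element can then be passed to functions that expect an lm object, for example summary(models$y1$x1).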

Concatenate values by group in descending order [duplicate]

爱⌒轻易说出口 submitted on 2019-12-10 09:56:48
Question: This question already has answers here: Collapse / concatenate / aggregate a column to a single comma separated string within each group (3 answers). Closed 2 years ago.

I want a data. My data A looks like:

    author_id  paper_id  prob
    731        24943     1
    731        24943     1
    731        688974    1
    731        964345    .8
    731        1201905   .9
    731        1267992   1
    736        249       .2
    736        6889      1
    736        94345     .7
    736        1201905   .9
    736        126992    .8

The output I am desiring is:

    author_id  paper_id
    731        24943,24943,688974,1201905,964345
    736        6889,1201945,126992,94345,249
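
One way to read the request is "sort each author's rows by prob, highest first, then paste the paper_id values together". A minimal base R sketch under that assumption follows; it is not taken from the original answers. Note that the desired output quoted in the question does not list every input row, so this sketch keeps all rows rather than trying to reproduce that output exactly.

```r
# Data transcribed from the question; IDs stored as integers so that
# paste() never switches to scientific notation.
A <- data.frame(
  author_id = c(rep(731L, 6), rep(736L, 5)),
  paper_id  = c(24943L, 24943L, 688974L, 964345L, 1201905L, 1267992L,
                249L, 6889L, 94345L, 1201905L, 126992L),
  prob      = c(1, 1, 1, 0.8, 0.9, 1, 0.2, 1, 0.7, 0.9, 0.8)
)

# Sort by author, then by descending prob (ties keep their input order).
A_sorted <- A[order(A$author_id, -A$prob), ]

# Collapse each author's paper_id values into one comma-separated string.
out <- aggregate(paper_id ~ author_id, data = A_sorted,
                 FUN = function(ids) paste(ids, collapse = ","))
out
```

With plyr loaded, the same collapse step could be written inside ddply with summarise; the base R version is used here only so the sketch runs without extra packages.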

R ttest inside ddply gives error “grouping factor must have exactly 2 levels”

孤街浪徒 submitted on 2019-12-10 09:40:59
Question: I have a dataframe with several factors and two phenotypes:

    freq  sampleID  status   score  snpsincluded
    0.5   0001      case     100    all
    0.2   0001      case     30     all
    0.5   0002      control  110    all
    0.5   0003      case     100    del
    etc

I would like to do a t.test comparing cases and controls for each set of relevant factors. I have tried the following:

    o2 <- ddply(df, c("freq","snpsincluded"), summarise, pval=t.test(score, status)$p.value)

but it complains that "grouping factor must have exactly 2 levels". I have no missing values,
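
A possible cause, offered here as an assumption because the excerpt is truncated, is that some freq/snpsincluded combinations contain only cases or only controls (the "del" row in the sample has no matching control). The sketch below uses made-up data and a hypothetical helper, safe_pval, to show the general guard: compare groups with the formula interface score ~ status, and return NA for any subset that lacks two usable groups.

```r
set.seed(42)

# Made-up data shaped like the question's table (values are illustrative).
df <- data.frame(
  freq         = rep(c(0.5, 0.2), each = 20),
  snpsincluded = "all",
  status       = rep(c("case", "control"), times = 20),
  score        = rnorm(40, mean = 100, sd = 10)
)
# Add one subset that contains cases only, to trigger the guard.
df <- rbind(df, data.frame(freq = 0.5, snpsincluded = "del",
                           status = "case", score = c(100, 105, 98)))

# Hypothetical helper: p-value of a case/control t-test, or NA when the
# subset does not contain exactly two groups with at least 2 rows each.
safe_pval <- function(d) {
  counts <- table(as.character(d$status))
  if (length(counts) != 2 || any(counts < 2)) return(NA_real_)
  t.test(score ~ status, data = d)$p.value
}

# Split by the two key variables, dropping combinations with no rows.
groups <- split(df, list(df$freq, df$snpsincluded), drop = TRUE)
pvals  <- sapply(groups, safe_pval)
pvals
```

The same helper should also work inside plyr, for instance ddply(df, c("freq", "snpsincluded"), function(d) data.frame(pval = safe_pval(d))), though whether this matches the asker's actual data is not known from the excerpt.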

How do I make doSMP play nicely with plyr?

不羁岁月 submitted on 2019-12-10 09:31:54
Question: This code works:

    library(plyr)
    x <- data.frame(V = c("X", "Y", "X", "Y", "Z"), Z = 1:5)
    ddply(x, .(V), function(df) sum(df$Z), .parallel = FALSE)

While this code fails:

    library(doSMP)
    workers <- startWorkers(2)
    registerDoSMP(workers)
    x <- data.frame(V = c("X", "Y", "X", "Y", "Z"), Z = 1:5)
    ddply(x, .(V), function(df) sum(df$Z), .parallel = TRUE)
    stopWorkers(workers)

    > Error in do.ply(i) : task 3 failed - "subscript out of bounds"
    In addition: Warning messages:
    1: <anonymous>: ... may be used in an

lm called from inside dlply throws “0 (non-NA) cases” error [r]

为君一笑 submitted on 2019-12-10 03:11:26
Question: I'm using dlply() with a custom function that averages the slopes of lm() fits on data that contain some NA values, and I get this error:

    Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases

The error only occurs when I call dlply with two key variables; splitting by one variable works fine. Annoyingly, I can't reproduce the error with a simple dataset, so I've posted the problem dataset in my Dropbox. Here's the code, as minimized as possible while still

Error: withCallingHandlers crashing R

て烟熏妆下的殇ゞ submitted on 2019-12-10 00:55:21
Question: I've been using the plyr-based function summarySE and ddply for several months without any problem. Today, when I ran my extremely basic routine in R, an error message appeared and R crashed. Here is some example code and the error I get before R crashes:

    install.packages("plyr")
    library(plyr)
    results <- data.frame(Depth = rbind("Surface", "Bottom"), DO = (runif(10, 4, 6)))
    ddply(results, .(Depth), summarise, mean = round(mean(DO), 2), sd = round(sd(DO), 2), min = min(DO), max = max(DO))

    Error in

How can I use ddply with varying .variables?

喜夏-厌秋 submitted on 2019-12-09 15:39:58
Question: I use ddply to summarize a data.frame by various categories, like this:

    # with both group and size being factors / categorical
    split.df <- ddply(mydata, .(group, size), summarize, sumGroupSize = sum(someValue))

This works smoothly, but I often want to calculate ratios, which means I need to divide by the group's total. How can I calculate such a total within the same ddply call? Let's say I'd like to have the share of observations in group A that are in size class 1. Obviously I have to
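
The arithmetic behind the requested share can be sketched in two steps: sum by group and size, then divide each sum by its group's total. The version below uses base R (aggregate and ave) rather than a single ddply call, so it illustrates the calculation and is not the answer from the original thread. The contents of mydata are invented for the example; only the column names come from the question.

```r
# Invented example data; only the column names come from the question.
mydata <- data.frame(
  group     = c("A", "A", "A", "B", "B"),
  size      = c("1", "2", "1", "1", "2"),
  someValue = c(10, 30, 10, 5, 15)
)

# Step 1: sum someValue for every group/size combination.
split.df <- aggregate(someValue ~ group + size, data = mydata, FUN = sum)
names(split.df)[3] <- "sumGroupSize"

# Step 2: attach each group's total to its rows and compute the share.
split.df$groupTotal <- ave(split.df$sumGroupSize, split.df$group, FUN = sum)
split.df$share      <- split.df$sumGroupSize / split.df$groupTotal

split.df[order(split.df$group, split.df$size), ]
```

In plyr terms, step 2 corresponds to a second pass grouped by group alone, for example with transform as the function argument to ddply.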

dplyr: apply function table() to each column of a data.frame

主宰稳场 submitted on 2019-12-09 14:30:47
Question: I often apply the table function to each column of a data frame using plyr, like this:

    library(plyr)
    ldply(mtcars, function(x) data.frame(table(x), prop.table(table(x))))

Is it possible to do this in dplyr as well? My attempts fail:

    mtcars %>% do(table %>% data.frame())
    melt(mtcars) %>% do(table %>% data.frame())

Answer 1: You can try the following, which does not rely on the tidyr package.

    mtcars %>% lapply(table) %>%

Errors installing plyr / rcpp

China☆狼群 submitted on 2019-12-09 11:32:48
Question: I have two computers, and on one of them I can't manage to install the plyr package for R. This is the error I get:

    * installing *source* package ‘plyr’ ...
    ** package ‘plyr’ successfully unpacked and MD5 sums checked
    ** libs
    g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O2 -pipe -g -c RcppExports.cpp -o RcppExports.o
    RcppExports.cpp: En la función ‘SEXPREC* plyr_loop_apply(SEXP, SEXP)’:
    RcppExports.cpp:15:9: error: ‘input_parameter’ no es un miembro de