plyr

How to get frequency count of date based on condition in R?

丶灬走出姿态 submitted on 2019-12-08 05:42:27
Question: Below is my scenario. I have two data frames. The first contains data about system usage and the second contains data about system location. I would like to track instrument usage based on the date the system was used and the location where the instrument is located. For this I am performing an outer join on the data frames using the dplyr library. Next, I would like to get a frequency count of the systems based on date. For this I am using a group-by on System and Locations. If the system

Need to label each geom_vline with the factors using a nested ddply function and facet wrap

浪尽此生 submitted on 2019-12-08 05:06:44
Question: I cannot figure out how to label each vline with the year. Can someone help? Below is an example of my dataset and code. I would like to label the year on the mean-length vline and/or use the same colour code as the year in the figure legend.
    Sector2 Family          Year Length
    BUN     Acroporidae     2010 332.1300496
    BUN     Poritidae       2011 141.1467966
    BUN     Acroporidae     2012 127.479
    BUN     Acroporidae     2013 142.5940556
    MUR     Faviidae        2010 304.0405
    MUR     Faviidae        2011 423.152
    MUR     Pocilloporidae  2012 576.0295
    MUR     Poritidae       2013

How to apply functions in columns for data frames with different sizes in nested list?

无人久伴 submitted on 2019-12-08 04:55:42
Question: In R, to apply some function to a column, you can do: df$col <- someFunction(df$col) Now my question is, how do you do a similar task when you have data frames in a nested list? Say I have a list like the following, where the data frames sit at the second level from the root. +------+------+ type1 | id | name | +----------->|------|------| | | | | | | | | year1 | +------+------+ +------------------+ | | | | +------+------+-----+ | | type2 | meta1|meta2 | name| | +----------> |------|------|---
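A minimal sketch of one way to approach the nested-list case described above, assuming a two-level list (year, then type) whose leaves are data frames that all contain a `name` column. The list contents and the use of `toupper()` as the function to apply are hypothetical and only for illustration.

```r
# Hypothetical nested list: year -> type -> data frame (assumed layout).
nested <- list(
  year1 = list(
    type1 = data.frame(id = 1:2, name = c("a", "b"),
                       stringsAsFactors = FALSE),
    type2 = data.frame(meta1 = 1:3, meta2 = 4:6, name = c("c", "d", "e"),
                       stringsAsFactors = FALSE)
  )
)

# Walk both list levels and rewrite the `name` column of every data frame.
nested <- lapply(nested, function(types) {
  lapply(types, function(d) {
    d$name <- toupper(d$name)  # stand-in for someFunction()
    d
  })
})
```

Because `lapply()` keeps list names, the structure (`nested$year1$type2` and so on) is unchanged; only the column values differ.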

R: rollapplyr and lm factor error: Does rollapplyr change variable class?

狂风中的少年 submitted on 2019-12-08 03:40:07
Question: This question builds upon a previous one which was nicely answered for me here: "R: Grouped rolling window linear regression with rollapply and ddply". Wouldn't you know that the code doesn't quite work when extended to the real data rather than the example data? I have a somewhat large dataset with the following characteristics.
    str(T0_satData_reduced)
    'data.frame': 45537 obs. of 5 variables:
     $ date : POSIXct, format: "2014-11-17 08:47:35" "2014-11-17 08:47:36" "2014-11-17 08:47:37" ...
     $ trial

extracting p values from multiple linear regression (lm) inside of a ddply function using spatial data

房东的猫 submitted on 2019-12-08 03:08:24
Question: I have a set of spatial coordinate (x, y) data that has a response variable for each coordinate over the course of several years. The following code generates a similar data frame:
    df <- data.frame( id = rep(1:2, 2), x = rep(c(25, 30),10), y = rep(c(100, 200), 10), year = rep(1980:1989, 2), response = rnorm(20) )
The resulting data frame:
    head(df)
      id  x   y year   response
    1  1 25 100 1980  0.1707431
    2  2 30 200 1981  1.3562263
    3  1 25 100 1982 -0.4590506
    4  2 30 200 1983  1.3238410
    5  1 25 100 1984 1
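A rough sketch of pulling p-values out of per-group `lm` fits for data shaped like the frame above. The excerpt is cut off before the model is stated, so the formula `response ~ year`, the grouping by `id`, and the output column name `p_year` are all assumptions. Base R `split()`/`lapply()` is used here in place of `ddply` so the example has no package dependency.

```r
set.seed(1)  # seed added for reproducibility; not in the original excerpt
df <- data.frame(
  id = rep(1:2, 2), x = rep(c(25, 30), 10), y = rep(c(100, 200), 10),
  year = rep(1980:1989, 2), response = rnorm(20)
)

# Fit one regression per id and keep the p-value of the `year` coefficient.
pvals <- do.call(rbind, lapply(split(df, df$id), function(d) {
  fit <- lm(response ~ year, data = d)  # assumed model
  data.frame(
    id = d$id[1],
    p_year = summary(fit)$coefficients["year", "Pr(>|t|)"]
  )
}))
rownames(pvals) <- NULL
```

The coefficient matrix returned by `summary(fit)$coefficients` has one row per term, so the same indexing works for additional predictors.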

Produce a precision weighted average among rows with repeated observations

有些话、适合烂在心里 submitted on 2019-12-08 01:57:05
Question: I have a data frame similar to the one generated below. Some individuals have more than one observation for a particular variable, and each variable has an associated standard error (SE) for the estimate. I would like to create a new data frame that contains only a single row for each individual. For individuals with more than one observation, such as Kim or Bob, I need to calculate a precision-weighted average based on the standard errors of the estimates, along with a variance for the newly
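One common reading of "precision-weighted average" is inverse-variance weighting, where each estimate is weighted by 1/SE^2 and the pooled variance is the reciprocal of the summed weights. The sketch below assumes that reading; the data values and the column names `name`, `est`, and `se` are made up, since the original data frame is not shown in the excerpt.

```r
# Hypothetical data: Kim and Bob have repeated observations.
obs <- data.frame(
  name = c("Kim", "Kim", "Bob", "Bob", "Ann"),
  est  = c(10, 12, 5, 7, 3),
  se   = c(1, 2, 1, 1, 0.5),
  stringsAsFactors = FALSE
)

# One row per individual: inverse-variance weighted mean and its variance.
pooled <- do.call(rbind, lapply(split(obs, obs$name), function(d) {
  w <- 1 / d$se^2                      # precision of each estimate
  data.frame(
    name = d$name[1],
    est  = sum(w * d$est) / sum(w),    # precision-weighted average
    var  = 1 / sum(w)                  # variance of the pooled estimate
  )
}))
rownames(pooled) <- NULL
```

Individuals with a single observation pass through unchanged, with `var` equal to their original SE squared.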

Seasonal aggregate of monthly data

不问归期 submitted on 2019-12-08 00:25:55
Question: I have a data frame df with x, y, and monthly values (month.year columns) for each x,y point. I am trying to get the seasonal aggregate, i.e. seasonal means: for winter, the mean of (December, January, February); for spring, the mean of (March, April, May); for summer, the mean of (June, July, August); and for autumn, the mean of (September, October, November). The data looks similar to:
    set.seed(1)
    df <- data.frame(x=1:3,y=1:3, matrix(rnorm(72),nrow=3) )
    names(df)[3:26] <- paste(month.abb,rep(2009:2010,each=12),sep=".")
    x y
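A sketch of one way to compute the seasonal means for the example data above. It makes a simplifying assumption that the question does not confirm: each season is formed within a calendar year, so winter 2009 averages Dec.2009, Jan.2009, and Feb.2009 rather than pairing December with the following January and February. The names `season_of`, `grp`, and `result` are illustrative.

```r
set.seed(1)
df <- data.frame(x = 1:3, y = 1:3, matrix(rnorm(72), nrow = 3))
names(df)[3:26] <- paste(month.abb, rep(2009:2010, each = 12), sep = ".")

# Lookup from month abbreviation to season.
season_of <- c(
  Dec = "winter", Jan = "winter", Feb = "winter",
  Mar = "spring", Apr = "spring", May = "spring",
  Jun = "summer", Jul = "summer", Aug = "summer",
  Sep = "autumn", Oct = "autumn", Nov = "autumn"
)

cols <- names(df)[3:26]
mon  <- sub("\\..*$", "", cols)   # "Jan", "Feb", ...
yr   <- sub("^.*\\.", "", cols)   # "2009", "2010"
grp  <- paste(season_of[mon], yr, sep = ".")

# Row-wise mean over the three month columns in each season/year group.
vals <- as.matrix(df[, 3:26])
seasonal <- sapply(split(seq_along(grp), grp), function(idx) {
  rowMeans(vals[, idx, drop = FALSE])
})
result <- cbind(df[, c("x", "y")], seasonal)
```

To treat December as part of the following winter instead, the year label for December columns could be shifted by one before building `grp`.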

Compute one sample t-test for each column of a data frame and summarize results in a table

空扰寡人 submitted on 2019-12-07 22:39:37
Question: Here is some sample data for my problem:
    mydf <- data.frame(A = rnorm(20, 1, 5), B = rnorm(20, 2, 5), C = rnorm(20, 3, 5), D = rnorm(20, 4, 5), E = rnorm(20, 5, 5))
Now I'd like to run a one-sample t-test on each column of the data.frame to test whether it differs significantly from zero, like t.test(mydf$A), and then store the mean of each column, the t-value, and the p-value in a new data.frame. So the result should look something like this:
         A B C D E
    mean x x x x x
    t    x x x x x
    p    x x x x x
I
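The layout requested above (statistics as rows, columns as variables) can be sketched with `sapply()` over the columns. The seed and the result name `res` are additions for illustration; the default `mu = 0` of `t.test()` matches the "differs from zero" hypothesis.

```r
set.seed(42)  # seed added for reproducibility; not in the original excerpt
mydf <- data.frame(A = rnorm(20, 1, 5), B = rnorm(20, 2, 5),
                   C = rnorm(20, 3, 5), D = rnorm(20, 4, 5),
                   E = rnorm(20, 5, 5))

# Run a one-sample t-test per column and collect mean, t, and p.
res <- sapply(mydf, function(col) {
  tt <- t.test(col)  # tests H0: mean == 0
  c(mean = unname(tt$estimate),
    t    = unname(tt$statistic),
    p    = tt$p.value)
})
res <- as.data.frame(res)
```

`sapply()` returns a matrix with one column per variable, so `res` has rows `mean`, `t`, and `p` and columns `A` through `E`.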

Split overlapping intervals into non-overlapping intervals, within values of an identifier

一世执手 submitted on 2019-12-07 22:30:10
Question: I would like to take a set of intervals, possibly overlapping, within categories of an identifier and create new intervals that are either exactly overlapping (i.e. same start/end values) or completely non-overlapping. These new intervals should collectively span the range of the original intervals and not include any ranges not in the original intervals. This needs to be a relatively fast operation because I'm working with lots of data. Here is some example data: library(data.table) set.seed

Column in the j-expression of a data.table (with/without a by statement)

心不动则不痛 submitted on 2019-12-07 19:13:45
Question: Here are two artificial but, I hope, pedagogical examples of my problem. 1) When running this code:
    > dat0 <- data.frame(A=c("a","a","b"), B="")
    > data.table(dat0)[, lapply(.SD, function(x) length(A)) , by = "A"]
       A B
    1: a 1
    2: b 1
I expected the output
       A B
    1: a 2
    2: b 1
(similarly to plyr::ddply(dat0, .(A), nrow)). Update to question 1): let me give a less artificial example. Consider the following data frame:
    dat0 <- data.frame(A=c("a","a","b"), x=c(1,2,3), y=c(9,8,7))
    > dat0
      A x y
    1 a 1 9
    2 a