subset

R: subset a data frame based on conditions from another data frame

落爺英雄遲暮 submitted on 2019-12-01 09:36:39
Here is a problem I am trying to solve. Say I have two data frames like the following:

    observations <- data.frame(
      id = rep(rep(c(1,2,3,4), each=5), 5),
      time = c(rep(1:5,4), rep(6:10,4), rep(11:15,4), rep(16:20,4), rep(21:25,4)),
      measurement = rnorm(100,5,7))
    sampletimes <- data.frame(
      location = letters[1:20],
      id = rep(1:4,5),
      time1 = rep(c(2,7,12,17,22), each=4),
      time2 = rep(c(4,9,14,19,24), each=4))

They both contain a column named id, which links the data frames. I want the measurements from observations for which time is between time1 and time2 from the sampletimes data frame.
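As a rough illustration of the selection being asked for, here is a minimal Python sketch (not the R answer from the original thread). It rebuilds the two tables as plain lists of dicts and keeps each observation whose time falls inside a sampling window for the same id. Treating the bounds as inclusive is an assumption, since the question only says "between"; the names `windows` and `selected` are made up for this sketch.

```python
import random
from collections import defaultdict

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Mirror of the R `observations` frame: 5 blocks x 4 ids x 5 time points.
observations = [
    {"id": id_, "time": 5 * block + t, "measurement": rng.gauss(5, 7)}
    for block in range(5)
    for id_ in (1, 2, 3, 4)
    for t in range(1, 6)
]

# Mirror of the R `sampletimes` frame: one window per (block, id).
sampletimes = [
    {"location": "abcdefghijklmnopqrst"[4 * k + (id_ - 1)],
     "id": id_, "time1": t1, "time2": t2}
    for k, (t1, t2) in enumerate([(2, 4), (7, 9), (12, 14), (17, 19), (22, 24)])
    for id_ in (1, 2, 3, 4)
]

# Group the windows by id so each observation only checks its own windows.
windows = defaultdict(list)
for s in sampletimes:
    windows[s["id"]].append((s["time1"], s["time2"], s["location"]))

# Keep observations with time1 <= time <= time2 (inclusive bounds assumed).
selected = [
    {**obs, "location": loc}
    for obs in observations
    for (t1, t2, loc) in windows[obs["id"]]
    if t1 <= obs["time"] <= t2
]
```

Grouping the windows by id first avoids comparing every observation against every window, which matters once the tables grow beyond toy size.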

Java, multiple iterators on a set, removing proper subsets and ConcurrentModificationException

雨燕双飞 submitted on 2019-12-01 09:28:05
I have a set A = {(1,2), (1,2,3), (2,3,4), (3,4), (1)}. I want to turn it into A = {(1,2,3), (2,3,4)}, that is, remove the proper subsets from this set. I'm using a HashSet to implement the set, two iterators to run through the set and check all pairs for the proper-subset condition using containsAll(c), and the remove() method to remove proper subsets. The code looks something like this:

    HashSet<Integer> hs....
    Set<Integer> c = hs.values();
    Iterator<Integer> it = c.iterator();
    while (it.hasNext()) {
        p = it.next();
        Iterator<Integer> it2 = c.iterator();
        while (it2.hasNext()) {
            q = it2.next();
            if q is a subset of p
                it2.remove()
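One common way around this kind of problem is to avoid mutating the collection while it is being iterated: first collect everything that should go, then remove it in a separate step. The following is a minimal Python sketch of that idea, not the code from the original thread; `remove_proper_subsets` is a hypothetical helper name, and the pairwise check is quadratic in the number of sets.

```python
def remove_proper_subsets(sets):
    """Return the family of sets with every proper subset of another member dropped."""
    family = {frozenset(s) for s in sets}
    # Collect first, remove afterwards: the family is never modified
    # while it is being iterated. `q < p` tests for a proper subset.
    to_remove = {q for q in family for p in family if q < p}
    return family - to_remove


A = [(1, 2), (1, 2, 3), (2, 3, 4), (3, 4), (1,)]
result = remove_proper_subsets(A)
```

The same two-phase pattern carries over to Java: gather the elements to delete in a second collection and call removeAll once the loops have finished.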

Data.table: how to get the blazingly fast subsets it promises and apply to a second data.table

自作多情 submitted on 2019-12-01 09:15:13
Question: I'm trying to enrich one dataset (adherence) based on subsets from another (lsr). For each individual row in adherence, I want to calculate (as a third column) the medication available for implementing the prescribed regimen. I have a function that returns the relevant result, but it runs for days on just a subset of the total data I have to run it on. The datasets are:

    library(dplyr)
    library(tidyr)
    library(lubridate)
    library(data.table)
    adherence <- cbind.data.frame(c("1", "2", "3", "1", "2"

rolling computations in xts by month

时间秒杀一切 submitted on 2019-12-01 08:52:32
I am familiar with the zoo function rollapply, which lets you do rolling computations on zoo or xts objects and specify the rolling increment via the by parameter. I am specifically interested in applying a function every month, but using all of the past daily data in the computation. For example, say my data set looks like this:

    dte, val
    1/01/2001, 10
    1/02/2001, 11
    ...
    1/31/2001, 2
    2/01/2001, 54
    2/02/2001, 34
    ...
    2/30/2001, 29

I would like to select the end of each month and apply a function that uses all the daily data. This doesn't seem like it would work with rollapply since the

subset inside a function by the variables specified in ddply

我与影子孤独终老i submitted on 2019-12-01 08:44:51
Often I need to subset a data.frame inside a function by the same variables that I use to split another data.frame with ddply. To do that I explicitly write the variables again inside the function, and I wonder whether there is a more elegant way. Below is a trivial example that shows my current approach.

    d1 <- expand.grid(x=c('a','b'), y=c('c','d'), z=1:3)
    d2 <- expand.grid(x=c('a','b'), y=c('c','d'), z=4:6)
    results <- ddply(d1, .(x,y), function(d) {
      d2Sub <- subset(d2, x==unique(d$x) & y==unique(d$y))
      out <- d$z + d2Sub$z
      data.frame(out)
    })

The plyr package offers

subsetting list in R

会有一股神秘感。 submitted on 2019-12-01 08:35:38
I'm using the Mcomp package in R, which contains datasets for forecasting. The data is organized by yearly, quarterly and monthly frequency. I can easily subset this into a list, but I cannot subset further using an additional condition.

    ## Subset monthly data
    library("Mcomp")
    mon <- subset(M3, "monthly")

Each element in the mon list has the following structure; as an example, mon$N1500 looks like this:

    $ N1500:List of 9
     ..$ st         : chr "M99"
     ..$ type       : chr "MICRO"
     ..$ period     : chr "MONTHLY"
     ..$ description: chr "SHIPMENTS (Code TD-30USA)"
     ..$ sn         : chr "N1500"
     ..$ x          : Time-Series [1:51] from 1990 to

Access entries in pandas data frame using a list of indices

£可爱£侵袭症+ submitted on 2019-12-01 08:00:01
I am facing the issue that I need only a subset of my original dataframe, with the values spread over different rows and columns. E.g.:

    # My original dataframe
    import pandas as pd
    dfTest = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]])

Output:

       0  1  2
    0  1  2  3
    1  4  5  6
    2  7  8  9

I can provide a list with the row and column indices where my desired values are located:

    array_indices = [[0,2],[1,0],[2,1]]

My desired output is a series:

    3
    4
    8

Can anyone help?

Use pd.DataFrame.lookup:

    dfTest.lookup(*zip(*array_indices))

    array([3, 4, 8])

Which you can wrap in a pd.Series constructor:

    pd.Series(dfTest.lookup(*zip(*array
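Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the call above fails on current versions. A minimal sketch of an alternative, assuming the index pairs are positional (which matches the default 0..2 labels in this example), is to index the underlying NumPy array directly. The variable names `rows`, `cols` and `result` are made up for this sketch.

```python
import pandas as pd

dfTest = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
array_indices = [[0, 2], [1, 0], [2, 1]]

# Split the (row, column) pairs into two parallel lists of positions.
rows, cols = map(list, zip(*array_indices))

# NumPy integer-array indexing picks one element per (row, col) pair.
# This is positional, unlike the label-based lookup it replaces.
result = pd.Series(dfTest.to_numpy()[rows, cols])
```

If the frame had non-default labels, the labels would first need to be translated to positions, for example with `dfTest.index.get_indexer` and `dfTest.columns.get_indexer`.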

Java, multiple iterators on a set, removing proper subsets and ConcurrentModificationException

江枫思渺然 submitted on 2019-12-01 06:42:02
Question: I have a set A = {(1,2), (1,2,3), (2,3,4), (3,4), (1)}. I want to turn it into A = {(1,2,3), (2,3,4)}, that is, remove the proper subsets from this set. I'm using a HashSet to implement the set, two iterators to run through the set and check all pairs for the proper-subset condition using containsAll(c), and the remove() method to remove proper subsets. The code looks something like this:

    HashSet<Integer> hs....
    Set<Integer> c = hs.values();
    Iterator<Integer> it = c.iterator();
    while (it.hasNext()) {
        p = it.next();

R: subset a data frame based on conditions from another data frame

巧了我就是萌 submitted on 2019-12-01 06:21:47
Question: Here is a problem I am trying to solve. Say I have two data frames like the following:

    observations <- data.frame(
      id = rep(rep(c(1,2,3,4), each=5), 5),
      time = c(rep(1:5,4), rep(6:10,4), rep(11:15,4), rep(16:20,4), rep(21:25,4)),
      measurement = rnorm(100,5,7))
    sampletimes <- data.frame(
      location = letters[1:20],
      id = rep(1:4,5),
      time1 = rep(c(2,7,12,17,22), each=4),
      time2 = rep(c(4,9,14,19,24), each=4))

They both contain a column named id, which links the data frames. I want to have the

Subset check with slices in Go

萝らか妹 submitted on 2019-12-01 06:07:46
I am looking for an efficient way to check if a slice is a subset of another. I could simply iterate over them to check, but I feel there has to be a better way. E.g.

    {1, 2, 3} is a subset of {1, 2, 3, 4}
    {1, 2, 2} is NOT a subset of {1, 2, 3, 4}

What is the best way to do this efficiently? Thanks!

I think the most common way to solve a subset problem is via a map.

    package main

    import "fmt"

    // subset returns true if the first array is completely
    // contained in the second array. There must be at least
    // the same number of duplicate values in second as there
    // are in first.
    func subset(first,
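The Go answer is cut off here, but the idea described in its comment (count occurrences in a map and require that the second collection has at least as many of each value) can be sketched in Python. This is an illustration of that counting approach, not a reconstruction of the original Go code; `is_subset` is a hypothetical name.

```python
from collections import Counter


def is_subset(first, second):
    """True if every value in `first` occurs at least as often in `second`."""
    # Counter subtraction keeps only positive counts, so an empty
    # result means `second` covers every element of `first`,
    # duplicates included.
    return not (Counter(first) - Counter(second))


a = is_subset([1, 2, 3], [1, 2, 3, 4])
b = is_subset([1, 2, 2], [1, 2, 3, 4])
```

Building the two counters takes time linear in the combined length of the inputs, compared with the quadratic cost of checking each element of one slice against the other by nested iteration.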