subset

Pull nth Day of Month in XTS in R

走远了吗. Submitted on 2019-11-28 10:25:11
My question is closely related to the one asked here: "Pull Return from first business day of the month from XTS object using R". Instead of extracting the first day of each month, I want to extract, say, the 10th data point of each month. How can I do this?

Using the same example data from the question you've linked to, you can do some basic subsetting. Here's the sample data:

    library(xts)
    data(sample_matrix)
    x <- as.xts(sample_matrix)

Here's the subsetting:

    x[format(index(x), "%d") == "10"]
    #                Open     High      Low    Close
    # 2007-01-10 49.91228 50.13053 49.91228 49.97246
    # 2007-02-10 50.68923 50.72696

An array is subset of another array

随声附和 Submitted on 2019-11-28 10:03:59
Question: How can I efficiently check whether all the elements of an integer array are a subset of the elements of another array in Java? For example, [33 11 23] is a subset of [11 23 33 42]. Thanks in advance.

Answer 1: Make a HashSet out of the superset array. Check whether each element of the subset array is contained in the HashSet. This is a very fast operation.

Answer 2: If you're not bound to using arrays, any Java collection has the containsAll method: boolean isSubset = bigList.containsAll

dplyr - filter by group size

流过昼夜 Submitted on 2019-11-28 07:42:19
Question: What is the best way to filter a data.frame to only get groups of, say, size 5? My data looks as follows:

    require(dplyr)
    n <- 1e5
    x <- rnorm(n)
    # Category size ranging each from 1 to 5
    cat <- rep(seq_len(n/3), sample(1:5, n/3, replace = TRUE))[1:n]
    dat <- data.frame(x = x, cat = cat)

The dplyr way I could come up with was

    dat <- group_by(dat, cat)
    system.time({ out1 <- dat %>% filter(n() == 5L) })
    #  user system elapsed
    # 1.157  0.218   1.497

But this is very slow... Is there a better way in

R: fast (conditional) subsetting where feasible

北城余情 Submitted on 2019-11-28 07:31:53
Question: I would like to subset rows of my data

    library(data.table); set.seed(333); n <- 100
    dat <- data.table(id=1:n, x=runif(n,100,120), y=runif(n,200,220), z=runif(n,300,320))
    > head(dat)
       id        x        y        z
    1:  1 109.3400 208.6732 308.7595
    2:  2 101.6920 201.0989 310.1080
    3:  3 119.4697 217.8550 313.9384
    4:  4 111.4261 205.2945 317.3651
    5:  5 100.4024 212.2826 305.1375
    6:  6 114.4711 203.6988 319.4913

in several stages. I am aware that I could apply subset(.) sequentially to achieve this.

    > s <- subset(dat, x>119)

Replace all values of a recursive list with values of a vector

懵懂的女人 Submitted on 2019-11-28 07:03:44
Question: Say I have the following recursive list:

    rec_list <- list(list(rep(1,5), 10), list(rep(100, 4), 20:25))
    rec_list
    [[1]]
    [[1]][[1]]
    [1] 1 1 1 1 1

    [[1]][[2]]
    [1] 10

    [[2]]
    [[2]][[1]]
    [1] 100 100 100 100

    [[2]][[2]]
    [1] 20 21 22 23 24 25

Now I would like to replace all the values of the list with, say, the vector seq_along(unlist(rec_list)), and keep the structure of the list. I tried using empty-index subsetting like

    rec_list[] <- seq_along(unlist(rec_list))

But this doesn't work. How can I

subsetting in data.table

时光毁灭记忆、已成空白 Submitted on 2019-11-28 06:55:46
I am trying to subset a data.table (from the package data.table) in R (not a data.frame). I have a 4-digit year as a key. I would like to subset by taking a series of years. For example, I want to pull all the records that are from 1999, 2000, 2001. I have tried passing the following into my DT[J(year)] binary search syntax:

    1999,2000,2001
    c(1999,2000,2001)
    1999, 2000, 2001

but none of these seem to work. Does anyone know how to do a subset where the years you want to select are not just one but multiple years?

What works for data.frames works for data.tables.

    subset(DT, year %in% 1999:2001)

Arthur

How to remove rows of a matrix by row name, rather than numerical index?

六眼飞鱼酱① Submitted on 2019-11-28 06:15:47
I have matrix g:

    > g[1:5,1:5]
            rs7510853 rs10154488 rs12159982 rs2844887 rs2844888
    NA06985 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06991 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06993 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06994 "CC"      "CC"       "CC"       "CC"      "CC"
    NA07000 "CC"      "CC"       "CC"       "CC"      "CC"
    > rownames(g)[1:2]->remove
    > remove
    [1] "NA06985" "NA06991"
    > g[-remove,]
    Error in -remove : invalid argument to unary operator

Is there a simple way to do what I want to do here (remove the IDs referenced in the vector 'remove' from matrix g)? Note: this is just a model for what I actually want to do, please don't say just do g[-(1:2), ], I need

Test if set is a subset, considering the number (multiplicity) of each element in the set

牧云@^-^@ Submitted on 2019-11-28 04:43:47
Question: I know I can test if set1 is a subset of set2 with:

    {'a','b','c'} <= {'a','b','c','d','e'} # True

But the following is also True:

    {'a','a','b','c'} <= {'a','b','c','d','e'} # True

How do I have it consider the number of times an element in the set occurs so that:

    {'a','b','c'} <= {'a','b','c','d','e'} # True
    {'a','a','b','c'} <= {'a','b','c','d','e'} # False since 'a' is in set1 twice but set2 only once
    {'a','a','b','c'} <= {'a','a','b','c','d','e'} # True because both sets have two 'a'
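A Python set literal discards duplicates, so {'a','a','b','c'} is already {'a','b','c'} before the comparison runs. One possible approach, sketched here under the assumption that the inputs can be given as lists (or any iterable that keeps duplicates), is to compare element counts with collections.Counter. The helper name is_multisubset is hypothetical and not part of the original question.

```python
from collections import Counter


def is_multisubset(small, big):
    """Return True if each element of `small` occurs in `big` at least as often.

    Hypothetical helper: both arguments are iterables that preserve
    duplicates (lists, tuples, strings), not sets.
    """
    small_counts = Counter(small)
    big_counts = Counter(big)
    # A Counter returns 0 for missing keys, so absent elements fail the check.
    return all(big_counts[item] >= count for item, count in small_counts.items())


print(is_multisubset(['a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']))
print(is_multisubset(['a', 'a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']))
print(is_multisubset(['a', 'a', 'b', 'c'], ['a', 'a', 'b', 'c', 'd', 'e']))
```

The three calls mirror the three cases listed in the question, with lists standing in for the set literals.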

Python: Check if one dictionary is a subset of another larger dictionary

…衆ロ難τιáo~ Submitted on 2019-11-28 04:15:45
I'm trying to write a custom filter method that takes an arbitrary number of kwargs and returns a list containing the elements of a database-like list that contain those kwargs. For example, suppose d1 = {'a':'2', 'b':'3'} and d2 = the same thing. d1 == d2 results in True. But suppose d2 = the same thing plus a bunch of other things. My method needs to be able to tell if d1 in d2, but Python can't do that with dictionaries. Context: I have a Word class, and each object has properties like word, definition, part_of_speech, and so on. I want to be able to call a filter method on the main
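A minimal sketch of one way to test whether d1 is contained in d2: check that every key of d1 exists in d2 with an equal value. The names is_subdict and filter_records are hypothetical, and the records are plain dicts here rather than the Word objects mentioned in the question.

```python
def is_subdict(small, big):
    """Return True if every key/value pair of `small` also appears in `big`."""
    return all(key in big and big[key] == value for key, value in small.items())


def filter_records(records, **criteria):
    """Return the records (dicts) that contain all the given keyword criteria."""
    return [record for record in records if is_subdict(criteria, record)]


# Illustrative data, not taken from the original question.
words = [
    {'word': 'run', 'part_of_speech': 'verb'},
    {'word': 'cat', 'part_of_speech': 'noun'},
    {'word': 'dog', 'part_of_speech': 'noun'},
]
nouns = filter_records(words, part_of_speech='noun')
print([record['word'] for record in nouns])
```

When all values are hashable, `small.items() <= big.items()` expresses the same test more compactly; the explicit loop also works for unhashable values such as lists.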

Generate all “unique” subsets of a set (not a powerset)

天涯浪子 Submitted on 2019-11-28 03:53:19
Question: Let's say we have a set S which contains a few subsets:

- [a,b,c]
- [a,b]
- [c]
- [d,e,f]
- [d,f]
- [e]

Let's also say that S contains six unique elements: a, b, c, d, e and f. How can we find all possible subsets of S that contain each of the unique elements of S exactly once? The result of the function/method should be something like this:

    [[a,b,c], [d,e,f]];
    [[a,b,c], [d,f], [e]];
    [[a,b], [c], [d,e,f]];
    [[a,b], [c], [d,f], [e]].

Is there any best practice or any standard way to achieve
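What the question describes is usually called the exact cover problem. The following is a minimal backtracking sketch in Python, not a reference implementation: the function name exact_covers is hypothetical, the elements are assumed to be hashable and mutually comparable (so that min works), and the input subsets are assumed to be distinct.

```python
def exact_covers(subsets):
    """Yield each selection of `subsets` that covers every element exactly once."""
    subsets = [frozenset(s) for s in subsets]
    universe = frozenset().union(*subsets)

    def search(remaining, chosen):
        if not remaining:
            # Every element is covered; report a copy of the current selection.
            yield list(chosen)
            return
        # Always cover the smallest uncovered element next, so the same
        # selection is never produced twice in a different order.
        pivot = min(remaining)
        for candidate in subsets:
            # The candidate must contain the pivot and must not overlap
            # anything that has already been covered.
            if pivot in candidate and candidate <= remaining:
                chosen.append(candidate)
                yield from search(remaining - candidate, chosen)
                chosen.pop()

    yield from search(universe, [])


S = [['a', 'b', 'c'], ['a', 'b'], ['c'], ['d', 'e', 'f'], ['d', 'f'], ['e']]
for cover in exact_covers(S):
    print([sorted(part) for part in cover])
```

For larger inputs, Knuth's Algorithm X is the usual refinement of this idea; the plain recursion above is meant only to show the structure of the search.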