subset | 易学教程

Pull nth Day of Month in XTS in R

My question is closely related to the one asked here: Pull Return from first business day of the month from XTS object using R. Instead of extracting the first day of each month, I want to extract, say, the 10th data point of each month. How can I do this?

Using the same example data from the question you've linked to, you can do some basic subsetting. Here's the sample data:

    library(xts)
    data(sample_matrix)
    x <- as.xts(sample_matrix)

Here's the subsetting:

    x[format(index(x), "%d") == "10"]
    #                Open     High      Low    Close
    # 2007-01-10 49.91228 50.13053 49.91228 49.97246
    # 2007-02-10 50.68923 50.72696

An array is subset of another array

Question: How can I efficiently check whether all the elements of an integer array are a subset of the elements of another array in Java? For example, [33 11 23] is a subset of [11 23 33 42]. Thanks in advance.

Answer 1: Make a HashSet out of the superset array. Check whether each element of the subset array is contained in the HashSet. This is a very fast operation.

Answer 2: If you're not bound to using arrays, any Java collection has the containsAll method: boolean isSubset = bigList.containsAll

dplyr - filter by group size

Question: What is the best way to filter a data.frame to only get groups of, say, size 5? My data looks as follows:

    require(dplyr)
    n <- 1e5
    x <- rnorm(n)
    # Category size ranging each from 1 to 5
    cat <- rep(seq_len(n/3), sample(1:5, n/3, replace = TRUE))[1:n]
    dat <- data.frame(x = x, cat = cat)

The dplyr way I could come up with was

    dat <- group_by(dat, cat)
    system.time({ out1 <- dat %>% filter(n() == 5L) })
    #  user  system elapsed
    # 1.157   0.218   1.497

But this is very slow... Is there a better way in

R: fast (conditional) subsetting where feasible

Question: I would like to subset rows of my data

    library(data.table); set.seed(333); n <- 100
    dat <- data.table(id=1:n, x=runif(n,100,120), y=runif(n,200,220), z=runif(n,300,320))
    > head(dat)
       id        x        y        z
    1:  1 109.3400 208.6732 308.7595
    2:  2 101.6920 201.0989 310.1080
    3:  3 119.4697 217.8550 313.9384
    4:  4 111.4261 205.2945 317.3651
    5:  5 100.4024 212.2826 305.1375
    6:  6 114.4711 203.6988 319.4913

in several stages. I am aware that I could apply subset(.) sequentially to achieve this.

    > s <- subset(dat, x>119)

Replace all values of a recursive list with values of a vector

Question: Say I have the following recursive list:

    rec_list <- list(list(rep(1,5), 10), list(rep(100, 4), 20:25))
    rec_list
    [[1]]
    [[1]][[1]]
    [1] 1 1 1 1 1

    [[1]][[2]]
    [1] 10

    [[2]]
    [[2]][[1]]
    [1] 100 100 100 100

    [[2]][[2]]
    [1] 20 21 22 23 24 25

Now I would like to replace all the values of the list with, say, the vector seq_along(unlist(rec_list)), and keep the structure of the list. I tried using empty-index subsetting like

    rec_list[] <- seq_along(unlist(rec_list))

But this doesn't work. How can I

subsetting in data.table

I am trying to subset a data.table (from the package data.table) in R (not a data.frame). I have a 4-digit year as a key. I would like to subset by taking a series of years. For example, I want to pull all the records that are from 1999, 2000, 2001. I have tried passing the following in my DT[J(year)] binary search syntax: 1999,2000,2001 c(1999,2000,2001) 1999, 2000, 2001 but none of these seem to work. Does anyone know how to do a subset where the years you want to select are not just one but multiple years?

What works for data.frames works for data.tables.

    subset(DT, year %in% 1999:2001)

Arthur

How to remove rows of a matrix by row name, rather than numerical index?

I have matrix g:

    > g[1:5,1:5]
            rs7510853 rs10154488 rs12159982 rs2844887 rs2844888
    NA06985 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06991 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06993 "CC"      "CC"       "CC"       "CC"      "CC"
    NA06994 "CC"      "CC"       "CC"       "CC"      "CC"
    NA07000 "CC"      "CC"       "CC"       "CC"      "CC"
    > rownames(g)[1:2]->remove
    > remove
    [1] "NA06985" "NA06991"
    > g[-remove,]
    Error in -remove : invalid argument to unary operator

Is there a simple way to do what I want to do here (remove the IDs referenced in the vector 'remove' from matrix g)? Note: this is just a model for what I actually want to do, please don't say just do g[-(1:2), ], I need

Test if set is a subset, considering the number (multiplicity) of each element in the set

Question: I know I can test if set1 is a subset of set2 with:

    {'a','b','c'} <= {'a','b','c','d','e'} # True

But the following is also True:

    {'a','a','b','c'} <= {'a','b','c','d','e'} # True

How do I have it consider the number of times an element in the set occurs, so that:

    {'a','b','c'} <= {'a','b','c','d','e'} # True
    {'a','a','b','c'} <= {'a','b','c','d','e'} # False since 'a' is in set1 twice but set2 only once
    {'a','a','b','c'} <= {'a','a','b','c','d','e'} # True because both sets have two 'a'
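A Python set literal discards duplicates, so {'a','a','b','c'} is already {'a','b','c'} before the comparison runs. One possible approach, sketched here under the assumption that the inputs can be given as lists (or any iterables that preserve duplicates), is to compare element counts with collections.Counter. The helper name is_multisubset is hypothetical, not taken from the question.

```python
from collections import Counter


def is_multisubset(small, big):
    """Return True if every element of `small` occurs in `big` at least as often.

    Hypothetical helper: both arguments are iterables that keep duplicates
    (lists, tuples, strings), not sets.
    """
    small_counts = Counter(small)
    big_counts = Counter(big)
    # A Counter returns 0 for missing keys, so absent elements fail the check.
    return all(big_counts[item] >= count for item, count in small_counts.items())


print(is_multisubset(['a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']))            # True
print(is_multisubset(['a', 'a', 'b', 'c'], ['a', 'b', 'c', 'd', 'e']))       # False
print(is_multisubset(['a', 'a', 'b', 'c'], ['a', 'a', 'b', 'c', 'd', 'e']))  # True
```

An equivalent check is `not (Counter(small) - Counter(big))`, since Counter subtraction keeps only positive counts.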

Python: Check if one dictionary is a subset of another larger dictionary

I'm trying to write a custom filter method that takes an arbitrary number of kwargs and returns a list containing the elements of a database-like list that contain those kwargs. For example, suppose d1 = {'a':'2', 'b':'3'} and d2 = the same thing. d1 == d2 results in True. But suppose d2 = the same thing plus a bunch of other things. My method needs to be able to tell if d1 in d2, but Python can't do that with dictionaries.

Context: I have a Word class, and each object has properties like word, definition, part_of_speech, and so on. I want to be able to call a filter method on the main
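As a minimal sketch of the check described above, the following tests whether every key/value pair of one dictionary appears in another, and uses it in a kwargs-based filter. It assumes the records are plain dictionaries rather than instances of the Word class, and the names is_subdict, filter_records, and the sample data are all hypothetical.

```python
def is_subdict(small, big):
    """Return True if every key/value pair in `small` also appears in `big`."""
    return all(key in big and big[key] == value for key, value in small.items())


def filter_records(records, **criteria):
    """Return the records (dicts) that contain all of the given key/value pairs."""
    return [record for record in records if is_subdict(criteria, record)]


# Hypothetical sample data for illustration only.
d1 = {'a': '2', 'b': '3'}
d2 = {'a': '2', 'b': '3', 'c': '4'}
print(is_subdict(d1, d2))  # True
print(is_subdict(d2, d1))  # False

words = [
    {'word': 'run', 'part_of_speech': 'verb'},
    {'word': 'cat', 'part_of_speech': 'noun'},
]
print(filter_records(words, part_of_speech='noun'))
```

When all values are hashable, `small.items() <= big.items()` expresses the same test, because dictionary item views support set-style comparison.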

Generate all “unique” subsets of a set (not a powerset)

Question: Let's say we have a Set S which contains a few subsets:

- [a,b,c]
- [a,b]
- [c]
- [d,e,f]
- [d,f]
- [e]

Let's also say that S contains six unique elements: a, b, c, d, e and f. How can we find all possible subsets of S that contain each of the unique elements of S exactly once? The result of the function/method should be something like this:

[[a,b,c], [d,e,f]];
[[a,b,c], [d,f], [e]];
[[a,b], [c], [d,e,f]];
[[a,b], [c], [d,f], [e]].

Is there any best practice or any standard way to achieve
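The problem described above is commonly known as the exact cover problem. The following is a minimal backtracking sketch, not a definitive or optimized solution; it assumes the subsets are given as non-empty frozensets, and the function name exact_covers is hypothetical.

```python
def exact_covers(subsets):
    """Return every selection of `subsets` that covers each element exactly once.

    Assumes `subsets` is a list of non-empty frozensets. Simple backtracking:
    a subset may be chosen only if all of its elements are still uncovered.
    """
    universe = frozenset().union(*subsets)
    results = []

    def search(start, remaining, chosen):
        if not remaining:
            # Every element is covered exactly once: record this selection.
            results.append(list(chosen))
            return
        # Only look at subsets after `start` so each combination appears once.
        for i in range(start, len(subsets)):
            candidate = subsets[i]
            if candidate <= remaining:
                chosen.append(candidate)
                search(i + 1, remaining - candidate, chosen)
                chosen.pop()

    search(0, universe, [])
    return results


S = [frozenset('abc'), frozenset('ab'), frozenset('c'),
     frozenset('def'), frozenset('df'), frozenset('e')]

for cover in exact_covers(S):
    print([sorted(part) for part in cover])
```

For the six subsets in the question this yields the four combinations listed there. The search is exponential in the worst case; for large inputs, Knuth's Algorithm X is the usual starting point.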