subset | 易学教程

Find rows in a dataframe with a certain date using subset

Question: I have a data frame, Data, containing dates, times and values:

    Date        Time      Global_active_power
    16/12/2006  17:24:00  4.216
    16/12/2006  18:25:00  4.5
    17/12/2006  17:25:00  4.52
    18/12/2006  17:25:00  4.557

Now I want to find rows depending on the date, for example all rows with date > 16/12/2006. This is my code:

    Data$Date <- as.Date(Data$Date, "%dd%mm%yyyy")
    Data$Time <- strptime(Data$Time, "%h%m%s")
    print(class(Data$Date))
    print(class(Data$Time))
    Data1 <- subset(Data, (Date == "16/12/2006"))
    View(Data1)
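One possible approach, sketched here under the assumption that the Date column is stored as text in day/month/year form with slashes (as in the sample rows): parse it with the format string "%d/%m/%Y" and compare against a Date object instead of a character string. The names cutoff, after_cutoff and on_cutoff are illustrative and do not appear in the question.

```r
# Sample rows copied from the question.
Data <- data.frame(
  Date = c("16/12/2006", "16/12/2006", "17/12/2006", "18/12/2006"),
  Time = c("17:24:00", "18:25:00", "17:25:00", "17:25:00"),
  Global_active_power = c(4.216, 4.5, 4.52, 4.557),
  stringsAsFactors = FALSE
)

# "%d/%m/%Y" means day/month/four-digit year, separated by slashes.
Data$Date <- as.Date(Data$Date, format = "%d/%m/%Y")

# Build the comparison value as a Date too, so both sides have the same class.
cutoff <- as.Date("16/12/2006", format = "%d/%m/%Y")

after_cutoff <- subset(Data, Date > cutoff)   # rows dated after 16/12/2006
on_cutoff    <- subset(Data, Date == cutoff)  # rows dated exactly 16/12/2006
```

By the same logic, the time strings in the sample would correspond to the format "%H:%M:%S" rather than "%h%m%s".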

How to find or match one data frame as a (full) subset of another data frame in R?

Question: I have two data frames, df1 and df2, given below. df1 is:

    c1        c2    c3  c4
    B    2.34000  1.00   I
    A   14.43000  2.10   J
    D    3.45515  1.00   K
    B    2.50000  2.09
    A    2.44000  1.10   K
    K    5.00000  1.09   L

df2 is:

    c1     c2    c3
    B    2.34  1.00
    A   14.43  2.10
    D    3.43  1.00
    B    2.50  2.09
    E    5.00  1.09
    A    2.44  1.10

The requirement is as follows: the two data frames are to be matched (compared). If df2 is completely found in df1 (that is, the content of df2 matches some subset of df1, irrespective of order) (either exactly

Subset a data.table by matching columns of another data.table

Question: I have been searching for a solution for subsetting a data table using matching values for certain columns in another data table. Here is an example:

    set.seed(2)
    dt <- data.table(a = 1:10, b = rnorm(10), c = runif(10), d = letters[1:10])
    dt2 <- data.table(a = 5:20, b = rnorm(16), c = runif(16), d = letters[5:20])

This is the result I need:

    > dt2
    1: 5 -2.311069085 0.62512173 e
    2: 6  0.878604581 0.26030004 f
    3: 7  0.035806718 0.85907312 g
    4: 8  1.012828692 0.43748800 h
    5: 9  0.432265155 0.38814476

Conditional sum with output for all rows in r data.table

Question: I have a coding issue that I think should be very easy. I have created a simplified dataset:

    DT <- data.table(Bank = rep(c("a","b","c"), 4),
                     Type = rep(c("Ass","Liab"), 6),
                     Amount = c(100,200,300,400,200,300,400,500,200,100,300,100))
    #     Bank Type Amount SumLiab
    #  1:    a  Ass    100      NA
    #  2:    b Liab    200     700
    #  3:    c  Ass    300      NA
    #  4:    a Liab    400     500
    #  5:    b  Ass    200      NA
    #  6:    c Liab    300     400
    #  7:    a  Ass    400      NA
    #  8:    b Liab    500     700
    #  9:    c  Ass    200      NA
    # 10:    a Liab    100     500
    # 11:    b  Ass    300      NA
    # 12:    c Liab    100     400

I want to
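The excerpt breaks off before the goal is stated, so the following is only a sketch under an assumption drawn from the title: the per-bank sum of the "Liab" amounts should appear on every row, not just on the "Liab" rows. It uses the data.table package, as the question does; the column name SumLiabAll is made up here for illustration.

```r
library(data.table)

DT <- data.table(Bank = rep(c("a", "b", "c"), 4),
                 Type = rep(c("Ass", "Liab"), 6),
                 Amount = c(100, 200, 300, 400, 200, 300,
                            400, 500, 200, 100, 300, 100))

# Sum of "Liab" amounts per bank, assigned only to the "Liab" rows;
# all other rows keep NA, which reproduces the SumLiab column shown above.
DT[Type == "Liab", SumLiab := sum(Amount), by = Bank]

# Assumed goal: the same per-bank sum on every row of that bank.
# The filter moves inside j, so the grouping still covers all rows.
DT[, SumLiabAll := sum(Amount[Type == "Liab"]), by = Bank]
```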

Given an amount of sets with numbers, find a set of numbers not including any of the given

Question: Given a number of sets of numbers (e.g. from 0 to 20), we are asked to find the maximum set of numbers from 0-20 that doesn't include any of the given sets (it can include numbers from a set, but not the whole set). For example: setting the max number to 8 and given the sets {1,2}, {2,3}, {7}, {3,4}, {5,6,4}, one maximum solution is the set {1, 3, 5, 6, 8}. I was thinking of representing it as a graph and then reducing it to the Max Independent Set problem, but that seems to work only if the sets were
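For small ranges such as the example, the problem statement can be checked by exhaustive search. The sketch below is not the graph reduction the question mentions; it simply tries candidate sets from largest to smallest and returns the first one that contains none of the given sets in full. It assumes the example counts from 1 to 8 (the stated solution does not contain 0), and the function names is_valid and find_max_set are hypothetical. The running time is exponential in the size of the range.

```r
# Forbidden sets from the example in the question.
forbidden <- list(c(1, 2), c(2, 3), 7, c(3, 4), c(5, 6, 4))
universe <- 1:8  # assumption: numbers run from 1 to the max number 8

# A candidate is valid if no forbidden set is fully contained in it.
is_valid <- function(candidate) {
  !any(vapply(forbidden, function(s) all(s %in% candidate), logical(1)))
}

# Try every subset, largest size first; the first valid one is a maximum set.
find_max_set <- function(universe) {
  for (k in rev(seq_along(universe))) {
    for (candidate in combn(universe, k, simplify = FALSE)) {
      if (is_valid(candidate)) return(candidate)
    }
  }
  integer(0)  # only the empty set is left
}

best <- find_max_set(universe)
```

On the example data the search finds a set of five numbers, the same size as the solution quoted in the question.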

pandas: rapidly calculating sum of column with certain values

Question: I have a pandas dataframe and I need to calculate the sum of a column of values that fall within a certain window. So for instance, if I have a window of 500, and my initial value is 1000, I want to sum all values that are between 499 and 999, and also between 1001 and 1501. This is easier to explain with some data:

       chrom    pos    end     AFR     EUR        pi
    0      1  10177  10177  0.4909  0.4056  0.495988
    1      1  10352  10352  0.4788  0.4264  0.496369
    2      1  10617  10617  0.9894  0.9940  0.017083
    3      1  11008  11008  0.1346  0.0885  0

subsetting data using multiple variables in R

Question: I have a data set, DATA, with many variables. DATA has a list mode and its class is data.frame. The variables I'm concerned with are AGE.MONTHS and LOCATION. I need to subset DATA into another set called SUB, and I want SUB to contain only the observations where AGE.MONTHS <= 2 and LOCATION = "Area A". AGE.MONTHS has a numeric mode and class. LOCATION has a numeric mode and its class is factor. I have tried the following:

    SUB <- which((DATA$AGE.MONTHS <= 2) & (DATA$LOCATION == "Area A"))
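A point worth noting about the attempt above is that which() returns row positions, not rows, so SUB ends up as an integer vector. A minimal sketch of two ways to get a data frame instead follows; the small DATA frame is a hypothetical stand-in with only the two columns named in the question, and idx and SUB2 are illustrative names.

```r
# Hypothetical stand-in for DATA with the two columns from the question.
DATA <- data.frame(
  AGE.MONTHS = c(1, 2, 3, 0.5, 2),
  LOCATION = factor(c("Area A", "Area B", "Area A", "Area A", "Area A"))
)

# which() gives the positions of the matching rows ...
idx <- which(DATA$AGE.MONTHS <= 2 & DATA$LOCATION == "Area A")

# ... which can then be used to index the data frame itself.
SUB <- DATA[idx, ]

# subset() evaluates the condition inside DATA and returns the rows directly.
SUB2 <- subset(DATA, AGE.MONTHS <= 2 & LOCATION == "Area A")
```

Comparing a factor with == against a string compares the level labels, so the factor class of LOCATION does not need special handling here.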

Subsetting a data frame to include 20 rows before and after

Question: I know the title is kind of lame but I couldn't think of anything else to call this. I'm trying to subset a large data frame using the values that appear in the lon (longitude) column. The subsetting script I currently have works: it creates a subset any time a -180 (the n/a value) appears, and includes the first non -180 number before and after each run of one or more -180s. My problem is that I would like each subset to consist of the 20 longitudes before any -180s and the 20 after. Since

R Array subsetting: flexible use of drop

Question: As noted in "Subsetting R array: dimension lost when its length is 1", R drops every dimension whose length is 1 when subsetting. The drop argument helps avoid that. I need a more flexible way to subset:

    > arr = array(1, dim = c(1,2,3,4))
    > dim(arr[,,1,])
    [1] 2 4
    > dim(arr[,,1,,drop=F])
    [1] 1 2 1 4

I want a way to subset by dropping the 3rd dimension (actually the dimension where I put the subset 1) and keeping the 1st dimension (the dimensions where no subset is put). It should
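One way this could be done in base R, offered as a sketch rather than a general solution: subset with drop = FALSE so that no dimension is lost, then remove only the indexed dimension by reassigning dim. The variable name sub is illustrative. Reassigning dim discards any dimnames, which does not matter for the unnamed array in the question but would need handling otherwise.

```r
arr <- array(1, dim = c(1, 2, 3, 4))

# Keep every dimension while subsetting the 3rd one.
sub <- arr[, , 1, , drop = FALSE]  # dimensions are now 1 2 1 4

# Remove only the 3rd dimension, the one that was indexed with a single value.
dim(sub) <- dim(sub)[-3]

dim(sub)  # 1 2 4: the length-1 first dimension is kept
```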