subset

Pandas: Use iterrows on Dataframe subset

妖精的绣舞 submitted on 2020-01-01 03:16:07
Question: What is the best way to use iterrows on a subset of a DataFrame? Take the following simple example:

    import datetime as DT  # needed for the DT.datetime calls below
    import pandas as pd

    df = pd.DataFrame({
        'Product': list('AAAABBAA'),
        'Quantity': [5,2,5,10,1,5,2,3],
        'Start' : [
            DT.datetime(2013,1,1,9,0),
            DT.datetime(2013,1,1,8,5),
            DT.datetime(2013,2,5,14,0),
            DT.datetime(2013,2,5,16,0),
            DT.datetime(2013,2,8,20,0),
            DT.datetime(2013,2,8,16,50),
            DT.datetime(2013,2,8,7,0),
            DT.datetime(2013,7,4,8,0)]})
    df = df.set_index(['Start'])

Now I would like to
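As a minimal sketch of the idea in the title, one could filter the frame first and then call iterrows on the result. The excerpt is cut off before it says which subset is wanted, so the filter on Product == 'B' below is an assumption made purely for illustration, as are the names subset and rows.

```python
import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': [
        DT.datetime(2013, 1, 1, 9, 0),
        DT.datetime(2013, 1, 1, 8, 5),
        DT.datetime(2013, 2, 5, 14, 0),
        DT.datetime(2013, 2, 5, 16, 0),
        DT.datetime(2013, 2, 8, 20, 0),
        DT.datetime(2013, 2, 8, 16, 50),
        DT.datetime(2013, 2, 8, 7, 0),
        DT.datetime(2013, 7, 4, 8, 0),
    ],
}).set_index('Start')

# Hypothetical condition: the original question does not say which rows it needs.
subset = df[df['Product'] == 'B']

# iterrows yields (index label, row as a Series) pairs for the filtered rows only.
rows = []
for start, row in subset.iterrows():
    rows.append((start, row['Product'], row['Quantity']))
    print(start, row['Product'], row['Quantity'])
```

Because the boolean mask is applied before the loop, only the matching rows are visited. For simple column arithmetic a vectorized expression on subset would usually be faster than iterrows.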

How to subset data with advance string matching

与世无争的帅哥 submitted on 2019-12-31 10:43:37
Question: I have the following data frame, from which I would like to extract rows based on matching strings.

    > GEMA_EO5
      gene_symbol  fold_EO   p_value   RefSeq_ID                            BH_p_value
      KNG1         3.433049  8.56e-28  NM_000893,NM_001102416               1.234245e-24
      REXO4        3.245317  1.78e-27  NM_020385                            2.281367e-24
      VPS29        3.827665  2.22e-25  NM_057180,NM_016226                  2.560770e-22
      CYP51A1      3.363149  5.95e-25  NM_000786,NM_001146152               6.239386e-22
      TNPO2        4.707600  1.60e-23  NM_001136195,NM_001136196,NM_013433  1.538000e-20
      NSDHL        2.703922  6.74e-23  NM_001129765,NM

checking for equality

↘锁芯ラ submitted on 2019-12-31 07:49:19
Question: I want to check for equality within a data set. The data set looks like this:

    Equips <- c(1,1,1,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,4,5,5,5,5,5,5,5,6,7,8)
    Notifs <- c(10,10,20,55,63,67,71,73,73,73,81,81,83,32,32,32,32,
                47,48,45,45,45,51,51,55,56,69,65,88)
    Comps <- c("Motor","Ventil","Motor","Gehäuse","Ventil","Motor","Steuerung","Motor",
               "Ventil","Gehäuse","Gehäuse","Ventil","Motor","Schraube","Motor","Festplatte",
               "Heizgerät","Motor","Schraube","Schraube","Lichtmaschine","Bremse","Lichtmaschine",

How to extract a dataframe which is within a list in r, using a condition?

我怕爱的太早我们不能终老 submitted on 2019-12-31 05:31:12
Question: I have a list that contains data frames of various dimensions. I want to extract those data frames whose number of rows is greater than 30. I tried:

    DR<-sapply(list, function(x) subset(list,nrow(list$'x')=30))

But it throws an error. Please help!

Answer 1: Assuming your list is called list_df, we can use Filter

    Filter(function(x) nrow(x) == 30, list_df)

Or sapply

    list_df[sapply(list_df, nrow) == 30]

We can also use purrr::keep

    purrr::keep(list_df, ~nrow(.) == 30)

Source: https://stackoverflow.com/questions/58850863/how

how to pass an expression through a function for the subset function to evaluate in R

我与影子孤独终老i submitted on 2019-12-31 04:48:07
Question: I'm trying to write a subset method for a different object class that I'd like users to be able to call the same way they use the subset.data.frame function. I've read a few related articles like this and this, but I don't think they're the solution here. I believe I'm using the wrong environment, but I don't understand enough about environments or the substitute function to figure out why the first half of this code works but the second half doesn't. Could anyone explain what I'm

How do I extract specific elements from an array?

旧巷老猫 submitted on 2019-12-31 04:37:51
Question: Suppose I have an array

    a = [1,2,3,4,5,6,7,8,9,10]

and I want a subset of this array: the 1st, 5th and 7th elements. Is it possible to extract these from the array in a simple way? I was thinking of something like:

    a[0,4,6] = [1,5,7]

but that doesn't work. Also, is there a way to return all elements except those at the specified indices? For example, something like

    a[-0,-4,-6] = [2,3,4,6,8,9,10]

Answer 1: Here's one way:

    [0,4,6].map{|i| a[i]}

Answer 2: You can simply do:

    [1] pry(main)> [1,2,3,4,5,6,7,8,9,10].values_at(0, 4,
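For comparison only (the answers in this thread are Ruby), here is a rough Python sketch of both requests: picking elements at given positions, and returning everything except those positions. The names wanted, picked, excluded and rest are illustrative and not taken from the thread.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
wanted = [0, 4, 6]  # zero-based positions of the 1st, 5th and 7th elements

# Select the elements at the requested positions, in the order given.
picked = [a[i] for i in wanted]

# Complement: keep every element whose position is NOT in `wanted`.
excluded = set(wanted)
rest = [x for i, x in enumerate(a) if i not in excluded]

print(picked)  # [1, 5, 7]
print(rest)    # [2, 3, 4, 6, 8, 9, 10]
```

Turning the positions into a set keeps each membership check cheap for the complement, which matters only when the list of excluded positions is long.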

How can I select a row by row name in a subsetted data frame in R?

▼魔方 西西 submitted on 2019-12-31 03:41:08
Question: I want to select rows by name in a data frame that is a subset of a larger one. The subsetted data frame appears to have retained the row names of the original data frame, such that:

    > DFsubset[1:3,]
        x1 x2 x3
    271  3  5  2
    553  2  4  1
    563  2  5  3

while using the printed row name returns the following:

    > DFsubset[271,]
    Error in xj[i, , drop = FALSE] : subscript out of bounds

How can I select these rows based on the row names from the original DF, i.e. 271, 553, 563?

Answer 1: You need to reference the rownames

R Subset Dataset Using Regular Expression

对着背影说爱祢 submitted on 2019-12-31 01:52:09
Question: Is there a way to make the R code below run faster (i.e. vectorized, to avoid for loops)? My example contains two data frames. The first has dimension n1*p, and one of its p columns contains names. The second data frame is a column vector (n2*1) that also contains names. I want to keep every row of the first data frame in which some part of a name from the second data frame's column appears in the name column of the first. Sorry for the brutal explanation. Example (Data frame 1): x

geom_smooth on a subset of data

天涯浪子 submitted on 2019-12-30 17:23:51
Question: Here is some data and a plot:

    set.seed(18)
    data = data.frame(y=c(rep(0:1,3),rnorm(18,mean=0.5,sd=0.1)),colour=rep(1:2,12),x=rep(1:4,each=6))
    ggplot(data,aes(x=x,y=y,colour=factor(colour)))+geom_point()+
      geom_smooth(method='lm',formula=y~x,se=F)

As you can see, the linear regression is strongly influenced by the values where x=1. Can I get the linear regressions calculated for x >= 2 only, but still display the values for x=1 (where y equals either 0 or 1)? The resulting graph would be exactly the same except for the