subset

Using ifelse to remove unwanted rows from the dataset in R

◇◆丶佛笑我妖孽 posted on 2019-12-05 20:31:43
I have a dataset where I want to remove the occurrences of month 11 in the first observation year for a couple of my individuals. Is it possible to do this with ifelse? Something like: ifelse(ID=="1" & Month=="11" & Year=="2006", "remove these rows", ifelse(ID=="2" & Month=="11" & Year=="2007", "remove these rows", "nothing")) As always, all help appreciated! :)
You don't even need the ifelse() if all you want is an indicator of which rows to remove: ind <- (Month == "11") & ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007")) ind will contain a TRUE if Month is "11" and if either
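The indicator approach from the answer above can be sketched as follows. The data frame `dat` and its values are invented for illustration; only the column names ID, Month, and Year come from the question, and they are assumed to be stored as character strings to match the quoted comparisons.

```r
# Hypothetical example data; only the column names come from the question.
dat <- data.frame(
  ID    = c("1", "1", "1", "2", "2", "2"),
  Month = c("11", "12", "11", "11", "11", "12"),
  Year  = c("2006", "2006", "2007", "2007", "2008", "2007"),
  stringsAsFactors = FALSE
)

# TRUE for rows to drop: month 11 in the first observation year of each ID.
ind <- with(dat, (Month == "11") &
  ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007")))

# Keep every row that is not flagged.
dat_clean <- dat[!ind, ]
```

If any of the three columns may contain NA, `ind` will contain NA as well, so something like `dat[!(ind %in% TRUE), ]` is probably the safer way to drop only the rows that are definitely flagged.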

template function with corresponding parameters to subset of tuple types

大兔子大兔子 posted on 2019-12-05 20:16:30
Question: I would like to write a function like this find: multi_set<int, string, double, myType> m; //vector of tuples m.insert(/*some data*/); m.find<1,2>("something",2.123); Or m.find<0,3>(1,instanceOfMyType); m.find<1>("somethingelse"); where find can be parametrized by any subset of the tuple parameters. My code so far: template <typename ... T> class multi_set{ typedef tuple < T... > Tuple; vector<tuple<T...>> data = vector<tuple<T...>>(); public: void insert(T... t){ data.push_back(tuple

R - Subset based on column name

我只是一个虾纸丫 posted on 2019-12-05 19:25:39
My data frame has over 120 columns (variables) and I would like to create subsets based on column names. For example, I would like to create a subset where the column name includes the string "mood". Is this possible?
I generally use SubData <- myData[,grep("whatIWant", colnames(myData))] I know very well that the "," is not necessary and that colnames could be replaced by names, but then it would not work with matrices, and I hate to change the formalism when changing objects. Source: https://stackoverflow.com/questions/28815508/r-subset-based-on-column-name
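A minimal sketch of the grep-based selection described in the answer, applied to the "mood" example from the question. The data frame contents and column names are made up for illustration.

```r
# Hypothetical data frame; the column names are invented for illustration.
myData <- data.frame(
  id        = 1:3,
  mood_am   = c(3, 4, 2),
  mood_pm   = c(5, 1, 4),
  sleep_hrs = c(7, 6, 8)
)

# Columns whose name contains "mood". drop = FALSE keeps the result a
# data frame even when only one column matches.
moodData <- myData[, grep("mood", colnames(myData)), drop = FALSE]
```

Because it uses `colnames()` and the two-index form, the same line should also work when `myData` is a matrix, which is the reason the answer gives for preferring this form.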

Coq case analysis and rewrite with function returning subset types

只愿长相守 posted on 2019-12-05 19:25:04
I was working on this simple exercise about writing certified functions using subset types. The idea is to first write a predecessor function pred : forall (n : {n : nat | n > 0}), {m : nat | S m = n.1}. and then, using this definition, give a function pred2 : forall (n : {n : nat | n > 1}), {m : nat | S (S m) = n.1}. I have no problem with the first one. Here is my code: Program Definition pred (n : {n : nat | n > 0}) : {m : nat | S m = n.1} := match n with | O => _ | S n' => n' end. Next Obligation. elimtype False. compute in H. inversion H. Qed. But I cannot work out the second definition. I

Most efficient way of subsetting dataframes

不想你离开。 posted on 2019-12-05 17:50:15
Can anyone suggest a more efficient way of subsetting a data frame without using SQL/indexing/data.table options? I looked for similar questions, and this one suggests an indexing option. Here are ways to subset, with timings. #Dummy data dat <- data.frame(x = runif(1000000, 1, 1000), y=runif(1000000, 1, 1000)) #Subset and time system.time(x <- dat[dat$x > 500, ]) # user system elapsed # 0.092 0.000 0.090 system.time(x <- dat[which(dat$x > 500), ]) # user system elapsed # 0.040 0.032 0.070 system.time(x <- subset(dat, x > 500)) # user system elapsed # 0.108 0.004 0.109 EDIT: As Roland suggested I used

Subset/filter in dplyr chain with ggplot2

微笑、不失礼 posted on 2019-12-05 16:59:42
I'd like to make a slopegraph, along the lines (no pun intended) of this. Ideally, I'd like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example: # make tbl: df <- tibble( area = rep(c("Health", "Education"), 6), sub_area = rep(c("Staff", "Projects", "Activities"), 4), year = c(rep(2016, 6), rep(2017, 6)), value = rep(c(15000, 12000, 18000), 4) ) %>% arrange(area) # plot: df %>% filter(area == "Health") %>% ggplot() + geom_line(aes(x = as.factor(year), y = value, group = sub_area, color = sub_area), size = 2

Subsetting one matrix based on another matrix

生来就可爱ヽ(ⅴ<●) posted on 2019-12-05 16:53:18
I would like to select the values of R based on the strings in G, to obtain separate outputs with equal dimensions. These are my inputs: R <- 'pr_id sample1 sample2 sample3 AX-1 100 120 130 AX-2 150 180 160 AX-3 160 120 196' R <- read.table(text=R, header=T) G <- 'pr_id sample1 sample2 sample3 AX-1 AB AA AA AX-2 BB AB NA AX-3 BB AB AA' G <- read.table(text=G, header=T) These are my expected outputs: AA <- 'pr_id sample1 sample2 sample3 AX-1 NA 120 130 AX-2 NA NA NA AX-3 NA NA 196' AA <- read.table(text=AA, header=T) AB <- 'pr_id sample1 sample2 sample3 AX-1 100 NA NA AX-2 NA 180 NA AX-3 NA 120 NA' AB <- read
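One possible way to get the outputs described above is sketched below. The inputs are the R and G tables from the question, written with their line breaks restored. The helper `mask_by_genotype` is a hypothetical name, and the sketch assumes R and G share the same column names and row order.

```r
R <- read.table(text = "pr_id sample1 sample2 sample3
AX-1 100 120 130
AX-2 150 180 160
AX-3 160 120 196", header = TRUE)

G <- read.table(text = "pr_id sample1 sample2 sample3
AX-1 AB AA AA
AX-2 BB AB NA
AX-3 BB AB AA", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: keep the values of R where G equals `genotype`
# and set everything else (including positions where G is NA) to NA.
mask_by_genotype <- function(R, G, genotype) {
  out <- R
  for (j in names(R)[-1]) {          # skip the pr_id column
    out[[j]][!(G[[j]] %in% genotype)] <- NA
  }
  out
}

AA <- mask_by_genotype(R, G, "AA")
AB <- mask_by_genotype(R, G, "AB")
```

`%in%` is used instead of `==` so that missing genotypes in G count as "no match" rather than propagating NA into the logical index.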

Subset columns based on list of column names and bring the column before it

余生颓废 posted on 2019-12-05 15:08:13
Question: I have a larger dataset following the same order: a unique date column, data, unique date column, data, etc. I am trying to subset not just the data column by name but the unique date column as well. The code below selects columns based on a list of names, which is part of what I want, but any ideas on how I can also grab the column immediately before each subsetted column? I am looking to end up with a DF containing the Date1, Fire, Date3, Earth columns (using just the NameList). Here is my reproducible
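The question's own example is cut off here, so the following is only a sketch of one way to pair each named column with the column before it. The columns Date2 and Water, all values, and the name `DF_sub` are invented; Date1, Fire, Date3, Earth, and NameList come from the question.

```r
# Hypothetical data in the pattern date column, data column, ...
DF <- data.frame(
  Date1 = as.Date("2019-01-01") + 0:1, Fire  = c(1, 2),
  Date2 = as.Date("2019-02-01") + 0:1, Water = c(3, 4),
  Date3 = as.Date("2019-03-01") + 0:1, Earth = c(5, 6)
)
NameList <- c("Fire", "Earth")

# Positions of the requested columns.
idx <- match(NameList, names(DF))

# Interleave each position with the one before it: rbind() stacks the two
# index vectors and as.vector() reads them back column by column.
DF_sub <- DF[, as.vector(rbind(idx - 1, idx))]
```

This assumes every name in NameList exists in DF and none of them is the first column; otherwise `match()` returns NA or `idx - 1` becomes 0 and the selection needs an explicit check.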

Determine which column name is causing 'undefined columns selected' error when using subset()

亡梦爱人 posted on 2019-12-05 12:31:45
I'm trying to subset a large data frame from a very large data frame, using data.new <- subset(data, select = vector) where vector is a character vector containing the column names I'm trying to isolate. When I do this I get Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected Is there a way to identify which specific column name in the vector is undefined? Through trial and error I've narrowed it down to about 400, but that still doesn't help.
Find the elements of your vector that are not %in% the names() of your data frame. Working example: dd <- data.frame(a=1,b=2)
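The answer's working example is truncated, so here is a small sketch of the `%in%` check it describes. The vector `wanted` and the names `missing_cols` and `present` are made up for illustration; `dd` and `data.new` follow the text above.

```r
dd <- data.frame(a = 1, b = 2)

# Hypothetical vector of requested column names; "c" and "d" do not exist in dd.
wanted <- c("a", "b", "c", "d")

# Names that would trigger "undefined columns selected".
missing_cols <- wanted[!(wanted %in% names(dd))]

# Subset using only the names that are actually present.
present  <- intersect(wanted, names(dd))
data.new <- dd[, present, drop = FALSE]
```

`setdiff(wanted, names(dd))` gives the same set of missing names in one call, with duplicates removed.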

How to select some rows with specific rownames from a dataframe? [closed]

我怕爱的太早我们不能终老 posted on 2019-12-05 12:24:05
Question: I have a data frame with several rows. I want to select some rows with specific rownames (such as stu2, stu3, stu5, stu9) from this data frame. The input example data frame is as follows:
     attr1 attr2 attr3 attr4
stu1     0     0     1     0
stu2    -1     1    -1     1
stu3     1    -1     0    -1
stu4     1    -1     1    -1
stu5    -1     1     0     1
stu6     1    -1     1     0
stu7    -1    -1    -1     1
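A sketch of selecting rows by row name, using only the seven rows that survive in the excerpt above. The names `df`, `wanted`, and `selected` are chosen for illustration.

```r
# The seven rows visible in the question; the original table is cut off after stu7.
df <- data.frame(
  attr1 = c(0, -1, 1, 1, -1, 1, -1),
  attr2 = c(0, 1, -1, -1, 1, -1, -1),
  attr3 = c(1, -1, 0, 1, 0, 1, -1),
  attr4 = c(0, 1, -1, -1, 1, 0, 1),
  row.names = paste0("stu", 1:7)
)

wanted <- c("stu2", "stu3", "stu5", "stu9")

# Keep only rows whose name is in `wanted`; names that are absent from df
# (here stu9) are ignored rather than producing an all-NA row.
selected <- df[rownames(df) %in% wanted, ]
```

Indexing directly with `df[wanted, ]` also works when every name exists, and returns the rows in the order given in `wanted`; for a name that is missing it returns a row of NA values instead of dropping it.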