subset | 易学教程

Using ifelse to remove unwanted rows from the dataset in R

I have a dataset where I want to remove the occurrences of month 11 in the first observation year for a couple of my individuals. Is it possible to do this with ifelse? Something like:

    ifelse(ID=="1" & Month=="11" & Year=="2006", "remove these rows",
           ifelse(ID=="2" & Month=="11" & Year=="2007", "remove these rows", "nothing"))

As always, all help appreciated! :)

Answer: You don't even need ifelse() if all you want is an indicator of which rows to remove.

    ind <- (Month == "11") & ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007"))

ind will contain a TRUE if Month is "11" and if either
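As a rough illustration of the indicator approach, here is a minimal sketch that builds the logical vector and then drops the flagged rows with negated indexing. The data frame `dat` and its contents are made up for this example; only the column names ID, Month and Year and the condition itself come from the question.

```r
# Hypothetical data; only the column names follow the question.
dat <- data.frame(
  ID    = c("1", "1", "1", "2", "2", "2"),
  Month = c("11", "12", "11", "11", "11", "12"),
  Year  = c("2006", "2006", "2007", "2007", "2008", "2007"),
  stringsAsFactors = FALSE
)

# TRUE marks a row to drop: month 11 in each individual's first observation year.
ind <- with(dat, (Month == "11") &
  ((ID == "1" & Year == "2006") | (ID == "2" & Year == "2007")))

# Keep every row that is not flagged.
dat_clean <- dat[!ind, ]
```

Indexing with `!ind` assumes none of the three columns contains NA; if they might, wrapping the condition in `which()` and using a negative index is one way to avoid NA rows in the result.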

template function with corresponding parameters to subset of tuple types

Question: I would like to write a find function like this:

    multi_set<int, string, double, myType> m; // vector of tuples
    m.insert(/*some data*/);
    m.find<1,2>("something", 2.123);

Or:

    m.find<0,3>(1, instanceOfMyType);
    m.find<1>("somethingelse");

where find can be parametrized by any subset of the tuple's elements. My code so far:

    template <typename ... T>
    class multi_set {
        typedef tuple<T...> Tuple;
        vector<tuple<T...>> data = vector<tuple<T...>>();
    public:
        void insert(T... t) {
            data.push_back(tuple

R - Subset based on column name

My data frame has over 120 columns (variables) and I would like to create subsets based on column names. For example, I would like to create a subset where the column name includes the string "mood". Is this possible?

Answer: I generally use

    SubData <- myData[,grep("whatIWant", colnames(myData))]

I know very well that the "," is not necessary and that colnames could be replaced by names, but that would not work with matrices, and I hate to change the formalism when changing objects.

Source: https://stackoverflow.com/questions/28815508/r-subset-based-on-column-name
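For the "mood" case specifically, a small sketch of the grep idiom might look like the following. The data frame contents and the column names other than the "mood" pattern are invented for illustration; `drop = FALSE` is an addition here so that a single match still returns a data frame (or matrix) rather than a bare vector.

```r
# Hypothetical data frame; only the "mood" pattern comes from the question.
myData <- data.frame(
  id      = 1:3,
  mood_am = c(2, 4, 3),
  sleep   = c(7, 6, 8),
  mood_pm = c(3, 3, 5)
)

# Positions of the columns whose name contains the literal string "mood".
keep <- grep("mood", colnames(myData), fixed = TRUE)

# Same call shape works for matrices; drop = FALSE preserves the 2-d structure.
SubData <- myData[, keep, drop = FALSE]

# The identical expression applied to a matrix.
m <- as.matrix(myData)
SubMat <- m[, grep("mood", colnames(m), fixed = TRUE), drop = FALSE]
```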

Coq case analysis and rewrite with function returning subset types

I was working on this simple exercise about writing certified functions using subset types. The idea is to first write a predecessor function

    pred : forall (n : {n : nat | n > 0}), {m : nat | S m = n.1}.

and then, using this definition, give a function

    pred2 : forall (n : {n : nat | n > 1}), {m : nat | S (S m) = n.1}.

I have no problem with the first one. Here is my code:

    Program Definition pred (n : {n : nat | n > 0}) : {m : nat | S m = n.1} :=
      match n with
      | O => _
      | S n' => n'
      end.
    Next Obligation.
      elimtype False. compute in H. inversion H.
    Qed.

But I cannot work out the second definition. I

Most efficient way of subsetting dataframes

Can anyone suggest a more efficient way of subsetting a data frame without using SQL/indexing/data.table options? I looked for similar questions, and this one suggests the indexing option. Here are ways to subset, with timings.

    # Dummy data
    dat <- data.frame(x = runif(1000000, 1, 1000), y = runif(1000000, 1, 1000))

    # Subset and time
    system.time(x <- dat[dat$x > 500, ])
    #    user  system elapsed
    #   0.092   0.000   0.090
    system.time(x <- dat[which(dat$x > 500), ])
    #    user  system elapsed
    #   0.040   0.032   0.070
    system.time(x <- subset(dat, x > 500))
    #    user  system elapsed
    #   0.108   0.004   0.109

EDIT: As Roland suggested I used

Subset/filter in dplyr chain with ggplot2

I'd like to make a slopegraph, along the lines (no pun intended) of this. Ideally, I'd like to do it all in a dplyr-style chain, but I hit a snag when I try to subset the data to add specific geom_text labels. Here's a toy example:

    # make tbl:
    df <- tibble(
      area = rep(c("Health", "Education"), 6),
      sub_area = rep(c("Staff", "Projects", "Activities"), 4),
      year = c(rep(2016, 6), rep(2017, 6)),
      value = rep(c(15000, 12000, 18000), 4)
    ) %>% arrange(area)

    # plot:
    df %>%
      filter(area == "Health") %>%
      ggplot() +
      geom_line(aes(x = as.factor(year), y = value, group = sub_area, color = sub_area), size = 2

Subsetting one matrix based in another matrix

I would like to select values from R based on the strings in G, to obtain separate outputs with equal dimensions. These are my inputs:

    R <- 'pr_id sample1 sample2 sample3
    AX-1 100 120 130
    AX-2 150 180 160
    AX-3 160 120 196'
    R <- read.table(text=R, header=T)

    G <- 'pr_id sample1 sample2 sample3
    AX-1 AB AA AA
    AX-2 BB AB NA
    AX-3 BB AB AA'
    G <- read.table(text=G, header=T)

These are my expected outputs:

    AA <- 'pr_id sample1 sample2 sample3
    AX-1 NA 120 130
    AX-2 NA NA NA
    AX-3 NA NA 196'
    AA <- read.table(text=AA, header=T)

    AB <- 'pr_id sample1 sample2 sample3
    AX-1 100 NA NA
    AX-2 NA 180 NA
    AX-3 NA 120 NA'
    AB <- read
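One possible way to get outputs of that shape, sketched in base R: compare G against a genotype code to build a logical mask, then blank out every cell of R that does not match. The helper name `mask_by_genotype` is hypothetical and not from the question; the inputs are the R and G tables given there.

```r
R <- read.table(text = "pr_id sample1 sample2 sample3
AX-1 100 120 130
AX-2 150 180 160
AX-3 160 120 196", header = TRUE, stringsAsFactors = FALSE)

G <- read.table(text = "pr_id sample1 sample2 sample3
AX-1 AB AA AA
AX-2 BB AB NA
AX-3 BB AB AA", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: keep values whose genotype equals `code`, set the rest to NA.
mask_by_genotype <- function(values, genotypes, code) {
  samples <- setdiff(names(values), "pr_id")
  # Logical matrix; a missing genotype counts as "no match".
  keep <- !is.na(genotypes[samples]) & genotypes[samples] == code
  out <- values
  out[samples][!keep] <- NA
  out
}

AA <- mask_by_genotype(R, G, "AA")
AB <- mask_by_genotype(R, G, "AB")
```

This assumes R and G have the same rows and sample columns in the same order, as they do in the question's example.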

Subset columns based on list of column names and bring the column before it

Question: I have a larger dataset following the same order: a unique date column, data, a unique date column, data, etc. I am trying to subset not just the data column by name but also its unique date column. The code below selects columns based on a list of names, which is part of what I want, but any ideas on how I can also grab the column immediately before each subsetted column? I am looking to end up with a DF containing the Date1, Fire, Date3, Earth columns (using just the NameList). Here is my reproducible
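A minimal sketch of one way to do this: look up the position of each requested name, then interleave each position with the one just before it. The data frame below is invented for illustration; only NameList and the column names Date1, Fire, Date3 and Earth come from the question, while Date2 and Water are assumed fillers.

```r
# Hypothetical data in the date/data/date/data layout described above.
d <- as.Date("2020-01-01") + 0:2
DF <- data.frame(
  Date1 = d,      Fire  = c(1, 2, 3),
  Date2 = d + 10, Water = c(4, 5, 6),
  Date3 = d + 20, Earth = c(7, 8, 9)
)
NameList <- c("Fire", "Earth")

# Positions of the requested columns, ignoring names that are not present.
idx <- match(NameList, names(DF))
idx <- idx[!is.na(idx)]

# rbind() stacks "previous" over "current"; as.vector() reads it column by
# column, giving previous1, current1, previous2, current2, ...
cols <- as.vector(rbind(idx - 1, idx))

result <- DF[, cols, drop = FALSE]
```

If a requested column happened to be the first one, `idx - 1` would be 0, which R's indexing silently ignores, so only the column itself would be returned.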

Determine which column name is causing 'undefined columns selected' error when using subset()

I'm trying to subset a large data frame from a very large data frame, using

    data.new <- subset(data, select = vector)

where vector is a character vector containing the column names I'm trying to isolate. When I do this I get

    Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

Is there a way to identify which specific column name in the vector is undefined? Through trial and error I've narrowed it down to about 400, but that still doesn't help.

Answer: Find the elements of your vector that are not %in% the names() of your data frame. Working example:

    dd <- data.frame(a=1,b=2)
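Continuing from the `dd` example, the check could be sketched as follows. The vector `wanted` and its contents are hypothetical stand-ins for the question's vector of column names.

```r
dd <- data.frame(a = 1, b = 2)

# Hypothetical vector of requested column names; "c" does not exist in dd.
wanted <- c("a", "b", "c")

# Names that would trigger "undefined columns selected".
missing_cols <- wanted[!wanted %in% names(dd)]

# Names that are safe to select.
present <- wanted[wanted %in% names(dd)]
data.new <- subset(dd, select = present)
```

`setdiff(wanted, names(dd))` gives the same list of offending names, with duplicates removed.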

How to select some rows with specific rownames from a dataframe? [closed]

Question: I have a data frame with several rows. I want to select some rows with specific row names (such as stu2, stu3, stu5, stu9) from this data frame. The input example data frame is as follows:

         attr1 attr2 attr3 attr4
    stu1     0     0     1     0
    stu2    -1     1    -1     1
    stu3     1    -1     0    -1
    stu4     1    -1     1    -1
    stu5    -1     1     0     1
    stu6     1    -1     1     0
    stu7    -1    -1    -1     1
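A minimal sketch of row selection by name, using only the rows of the example that are visible above (the remaining rows are cut off in this excerpt, so stu9 is absent here). Filtering the requested names against `rownames()` first is an added precaution, since indexing by a name that does not exist yields a row of NAs.

```r
# Only the rows visible in the excerpt; the header has one fewer field than
# the data rows, so read.table() uses the first column as row names.
df <- read.table(text = "attr1 attr2 attr3 attr4
stu1 0 0 1 0
stu2 -1 1 -1 1
stu3 1 -1 0 -1
stu4 1 -1 1 -1
stu5 -1 1 0 1
stu6 1 -1 1 0
stu7 -1 -1 -1 1", header = TRUE)

wanted <- c("stu2", "stu3", "stu5", "stu9")

# Drop requested names that are not actual row names (stu9 in this excerpt).
present <- wanted[wanted %in% rownames(df)]

# Character row index; drop = FALSE keeps the result a data frame.
sel <- df[present, , drop = FALSE]
```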