Subsetting a range of values from a column variable based on the values of another value

Submitted by 一世执手 on 2019-12-13 19:42:36

Question


I am trying to keep all the rows in my data frame but drop the FORM 2 rows that do not fall within 2 years of a FORM 8 visit for the same CASEID.

library(tidyverse)

forms <- tibble(
  CASEID   = c(1012, 1012, 1012, 1012, 1012, 1013, 1013, 1013, 1012),
  VISIT    = c(450, 450, 365, 365, 450, 450, 450, 450, 450),
  FORM     = c(18, 8, 7, 2, 2, 8, 18, 2, 2),
  DTYvisit = c(2006, 2006, 2003, 2003, 2004, 2003, 2003, 2003, 2009)
)

> forms
# A tibble: 9 x 4
  CASEID VISIT  FORM DTYvisit
   <dbl> <dbl> <dbl>    <dbl>
1   1012   450    18     2006
2   1012   450     8     2006
3   1012   365     7     2003
4   1012   365     2     2003
5   1012   450     2     2004
6   1013   450     8     2003
7   1013   450    18     2003
8   1013   450     2     2003
9   1012   450     2     2009

Any suggestions on how I could drop the FORM 2 rows whose DTYvisit is more than 2 years away from the FORM 8 DTYvisit?

This worked at first:

form2.matchedOnForm8 <- forms %>%
  group_by(CASEID) %>%
  filter(FORM == 8) %>%
  select(CASEID, VISIT, DTYvisit) %>%
  left_join(filter(forms, FORM == 2), by = c("CASEID", "VISIT", "DTYvisit")) %>%
  bind_rows(filter(forms, FORM != 2))

but now I am losing observations.

I need the following result:

# A tibble: 7 x 4
  CASEID VISIT  FORM DTYvisit
   <dbl> <dbl> <dbl>    <dbl>
1   1012   450    18     2006
2   1012   450     8     2006
3   1012   365     7     2003
4   1012   450     2     2004
5   1013   450     8     2003
6   1013   450    18     2003
7   1013   450     2     2003

Answer 1:


Here is a solution using outer to calculate the difference between a row's YEAR and every YEAR at which FORM 8 occurs (the year column is named YEAR in the data below, DTYvisit in the question's code).
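As a quick aside, outer(x, y, "-") builds the matrix of all pairwise differences x[i] - y[j]. The following is only a minimal sketch; the two FORM 8 years and three visit years are hypothetical values chosen to mirror the example, and the variable names are not from the original post.

```r
# Hypothetical inputs: FORM 8 years for one case, and the years of the rows to check
form8_years <- c(2006, 2003)
visit_years <- c(2003, 2004, 2009)

# Entry [i, j] is form8_years[i] - visit_years[j]
gaps <- outer(form8_years, visit_years, "-")

# Smallest absolute gap for each visit year (column-wise minimum)
nearest <- apply(abs(gaps), 2, min)

print(gaps)
print(nearest)  # 0 1 3
```

Taking abs() before min() matters: the minimum of the signed differences would pick the most negative value rather than the closest year.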

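# Smallest absolute gap between row 1's YEAR and the FORM 8 years of the same CASEID
min(abs(outer(df[df$FORM == 8 & df$CASEID == df[1, 'CASEID'], 'YEAR'], df[1, 'YEAR'], '-')))
[1] 0

# The same calculation for every row: abs() is taken before min(), and only
# FORM 8 rows belonging to the row's own CASEID are compared
df$diff <- apply(df, 1, function(x)
  min(abs(outer(df[df$FORM == 8 & df$CASEID == x['CASEID'], 'YEAR'], x['YEAR'], '-'))))

library(dplyr)
# Drop only the FORM 2 rows that are more than 2 years from the nearest FORM 8
df %>% filter(!(FORM == 2 & diff > 2))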


# Data used above (create df before running the code)
df <- read.table(text = "
  CASEID VISIT FORM YEAR
1   1012   450   18 2006
2   1012   450    8 2006
3   1012   365    7 2003
4   1012   365    2 2003
5   1012   450    2 2004
6   1013   450    8 2003
7   1013   450   18 2003
8   1013   450    2 2003
9   1012   450    2 2009
", header = TRUE, stringsAsFactors = FALSE)
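The same idea can also be sketched in base R without dplyr. This is an illustration under stated assumptions, not the answerer's code: the helper nearest_form8_gap and the max_gap tolerance are hypothetical names, the 2-year limit is treated as inclusive (which is what the desired output in the question implies), and FORM 8 years are compared only within the same CASEID.

```r
# Example data copied from the question; YEAR holds the visit year
df <- data.frame(
  CASEID = c(1012, 1012, 1012, 1012, 1012, 1013, 1013, 1013, 1012),
  VISIT  = c(450, 450, 365, 365, 450, 450, 450, 450, 450),
  FORM   = c(18, 8, 7, 2, 2, 8, 18, 2, 2),
  YEAR   = c(2006, 2006, 2003, 2003, 2004, 2003, 2003, 2003, 2009)
)

max_gap <- 2  # assumed tolerance in years, inclusive

# Hypothetical helper: distance in years from each row to the nearest
# FORM 8 visit of the same CASEID
nearest_form8_gap <- function(data) {
  gap <- rep(Inf, nrow(data))  # stays Inf when a case has no FORM 8 row
  for (id in unique(data$CASEID)) {
    in_case <- data$CASEID == id
    form8_years <- data$YEAR[in_case & data$FORM == 8]
    if (length(form8_years) > 0) {
      # Rows: visits of this case; columns: its FORM 8 years
      diffs <- abs(outer(data$YEAR[in_case], form8_years, "-"))
      gap[in_case] <- apply(diffs, 1, min)
    }
  }
  gap
}

df$gap <- nearest_form8_gap(df)

# Keep every non-FORM-2 row, plus FORM 2 rows within the tolerance
result <- df[df$FORM != 2 | df$gap <= max_gap, ]
print(result)
```

With this data the two FORM 2 rows for CASEID 1012 in 2003 and 2009 are dropped, because each is 3 years from that case's FORM 8 visit in 2006.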


Source: https://stackoverflow.com/questions/51546094/subsetting-a-range-of-values-from-a-column-variable-based-on-the-values-of-anoth
