Check if a date is within an interval in R

暗喜 2020-11-30 12:48

I have these three intervals defined:

YEAR_1  <- interval(ymd('2002-09-01'), ymd('2003-08-31'))
YEAR_2  <- interval(ymd('2003-09-01'), ymd('20


        
6 Answers
  •  野趣味 (OP)
    2020-11-30 13:35

    Everybody has their favourite tool for this; mine happens to be data.table, because of what it refers to as its dt[i, j, by] logic.

    library(data.table)
    
    dt <- data.table(date = as.IDate(pt))
    
    dt[, YR := 0.0 ]                        # I am using a numeric for year here...
    
    dt[ date >= as.IDate("2002-09-01") & date <= as.IDate("2003-08-31"), YR := 1 ]
    dt[ date >= as.IDate("2003-09-01") & date <= as.IDate("2004-08-31"), YR := 2 ]
    dt[ date >= as.IDate("2004-09-01") & date <= as.IDate("2005-08-31"), YR := 3 ]
    

    I create a data.table object, converting your times to dates for later comparison. I then set up a new column, YR, defaulting to zero.

    We then execute three conditional statements: for each of the three intervals (which I just create by hand using the endpoints), we set the YR value to 1, 2 or 3.

    This does have the desired effect as we can see from

    R> print(dt, topn=5, nrows=10)
               date YR
      1: 2003-06-11  1
      2: 2004-08-11  2
      3: 2004-06-03  2
      4: 2004-01-20  2
      5: 2005-02-25  3
     ---              
     96: 2002-08-07  0
     97: 2004-02-04  2
     98: 2006-04-10  0
     99: 2005-03-21  3
    100: 2003-12-01  2
    R> table(dt[, YR])
    
     0  1  2  3 
    26 31 31 12 
    R> 
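
    As an aside, the three hand-written assignments can also be driven from a small table of endpoints with a non-equi update join. This is only a minimal sketch: the sample dates and the yrs lookup table are made up for illustration (they stand in for pt and the three intervals), and it assumes a data.table version with non-equi join support (1.9.8 or later).

    ```r
    library(data.table)

    # Made-up sample dates standing in for `pt` from the question
    dt <- data.table(date = as.IDate(c("2002-08-07", "2002-09-01", "2003-08-31",
                                       "2004-01-20", "2005-02-25", "2006-04-10")))

    # Hypothetical lookup table: one row per interval, with inclusive endpoints
    yrs <- data.table(YR    = 1:3,
                      start = as.IDate(c("2002-09-01", "2003-09-01", "2004-09-01")),
                      end   = as.IDate(c("2003-08-31", "2004-08-31", "2005-08-31")))

    dt[, YR := 0L]   # default for dates outside every interval

    # Non-equi update join: for each row of yrs, find the rows of dt whose date
    # lies in [start, end] and overwrite YR with that interval's number
    dt[yrs, on = .(date >= start, date <= end), YR := i.YR]

    print(dt)        # YR column: 0 1 1 2 3 0
    ```

    Adding a fourth interval then only means adding a row to yrs, not another line of code.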
    

    One could have done this also simply by computing date differences and truncating down, but it is also nice to be a little explicit at times.

    Edit: A more generic form just uses arithmetic on the dates:

    R> dt[, YR2 := trunc(as.numeric(difftime(as.Date(date), 
    +                                        as.Date("2001-09-01"),
    +                                        unit="days"))/365.25)]
    R> table(dt[, YR2])
    
     0  1  2  3  4  5  6  7  9 
     7 31 31 12  9  5  1  2  1 
    R> 
    

    This does the job in one line.
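
    One caveat: dividing by 365.25 is approximate. 2002-09-01 is exactly 365 days after 2001-09-01, so it truncates to 0 rather than 1. If exact boundaries matter, one possible sketch uses base R's findInterval() on explicit start dates; the dates vector below is invented for illustration.

    ```r
    # Made-up dates sitting right on the interval boundaries
    dates <- as.Date(c("2002-08-31", "2002-09-01", "2003-08-31",
                       "2003-09-01", "2005-08-31", "2005-09-01"))

    # Start date of each year: 2002-09-01, 2003-09-01, 2004-09-01, 2005-09-01
    breaks <- seq(as.Date("2002-09-01"), by = "year", length.out = 4)

    # findInterval() counts how many breaks are <= each date, so dates before
    # the first start get 0, the first year gets 1, and so on
    yr <- findInterval(as.numeric(dates), as.numeric(breaks))

    print(yr)   # 0 1 1 2 3 4

    # The approximate formula puts the boundary date 2002-09-01 in year 0
    trunc(as.numeric(difftime(as.Date("2002-09-01"), as.Date("2001-09-01"),
                              units = "days")) / 365.25)   # 0
    ```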
