R: How to get the Week number of the month

前端 未结 9 825
孤街浪徒
孤街浪徒 2020-11-30 04:05

I am new in R.
I want the week number of the month, which the date belongs to.

By using the following code:

>CurrentDate<-Sys.Date()
>We         


        
相关标签:
9条回答
  • 2020-11-30 04:39

    I am late to the party and maybe noone is gonna read this answer...

    Anyway, why not stay simple and do it like this:

    library(lubridate)
    
    x <- ymd(20200311, 20200308)
    
    week(x) - week(floor_date(x, unit = "months")) + 1
    
    [1] 3 2
    
    0 讨论(0)
  • 2020-11-30 04:40

    You can use day from the lubridate package. I'm not sure if there's a week-of-month type function in the package, but we can do the math.

    library(lubridate)
    curr <- Sys.Date()
    # [1] "2014-08-08"
    day(curr)               ## 8th day of the current month
    # [1] 8
    day(curr) / 7           ## Technically, it's the 1.14th week
    # [1] 1.142857
    ceiling(day(curr) / 7)  ## but ceiling() will take it up to the 2nd week.
    # [1] 2
    
    0 讨论(0)
  • 2020-11-30 04:46

    I don't know any build in functions but a work around would be

    CurrentDate <- Sys.Date()
    # The number of the week relative to the year
    weeknum <- as.integer( format(CurrentDate, format="%U") )
    
    # Find the minimum week of the month relative to the year
    mindatemonth <- as.Date( paste0(format(CurrentDate, "%Y-%m"), "-01") )
    weeknummin <- as.integer( format(mindatemonth, format="%U") ) # the number of the week of the first week within the month
    
    # Calculate the number of the week relative to the month
    weeknum <- weeknum - (weeknummin - 1) # this is as an integer
    
    # With the following you can convert the integer to the same format of 
    # format(CurrentDate, format="%U")
    formatC(weeknum, width = 2, flag = "0")
    
    0 讨论(0)
  • 2020-11-30 04:55

    Issue Overview

    It was difficult to tell which answers worked, so I built my own function nth_week and tested it against the others.

    The issue that's leading to most of the answers being incorrect is this:

    • The first week of a month is often a short-week
    • Same with the last week of the month

    For example, October 1st 2019 is a Tuesday, so 6 days into October (which is a Sunday) is already the second week. Also, contiguous months often share the same week in their respective counts, meaning that the last week of the prior month is commonly also the first week of the current month. So, we should expect a week count higher than 52 per year and some months that contain a span of 6 weeks.

    Results Comparison

    Here's a table showing examples where some of the above suggested algorithms go awry:

    DATE            Tori user206 Scri Klev Stringi Grot Frei Vale epi iso coni
    Fri-2016-01-01    1     1      1   1      5      1    1    1    1   1   1
    Sat-2016-01-02    1     1      1   1      1      1    1    1    1   1   1
    Sun-2016-01-03    2     1      1   1      1      2    2    1  -50   1   2
    Mon-2016-01-04    2     1      1   1      2      2    2    1  -50 -51   2
    ----
    Sat-2018-12-29    5     5      5   5      5      5    5    4    5   5   5
    Sun-2018-12-30    6     5      5   5      5      6    6    4  -46   5   6
    Mon-2018-12-31    6     5      5   5      6      6    6    4  -46 -46   6
    Tue-2019-01-01    1     1      1   1      6      1    1    1    1   1   1
    

    You can see that only Grothendieck, conighion, Freitas, and Tori are correct due to their treatment of partial week periods. I compared all days from year 100 to year 3000; there are no differences among those 4. (Stringi is probably correct for noting weekends as separate, incremented periods, but I didn't check to be sure; epiweek() and isoweek(), because of their intended uses, show some odd behavior near year-ends when using them for week incrementation.)

    Speed Comparison

    Below are the tests for efficiency between the implementations of: Tori, Grothendieck, Conighion, and Freitas

    # prep
    library(lubridate)
    library(tictoc)
    
    kepler<- ymd(15711227) # Kepler's birthday since it's a nice day and gives a long vector of dates
    some_dates<- seq(kepler, today(), by='day')
    
    
    # test speed of Tori algorithm
    tic(msg = 'Tori')
    Tori<- (5 + day(some_dates) + wday(floor_date(some_dates, 'month'))) %/% 7
    toc()
    Tori: 0.19 sec elapsed
    
    # test speed of Grothendieck algorithm
    wk <- function(x) as.numeric(format(x, "%U"))
    tic(msg = 'Grothendieck')
    Grothendieck<- (wk(some_dates) - wk(as.Date(cut(some_dates, "month"))) + 1)
    toc()
    Grothendieck: 1.99 sec elapsed
    
    # test speed of conighion algorithm
    tic(msg = 'conighion')
    weeknum <- as.integer( format(some_dates, format="%U") )
    mindatemonth <- as.Date( paste0(format(some_dates, "%Y-%m"), "-01") )
    weeknummin <- as.integer( format(mindatemonth, format="%U") ) # the number of the week of the first week within the month
    conighion <- weeknum - (weeknummin - 1) # this is as an integer
    toc()
    conighion: 2.42 sec elapsed
    
    # test speed of Freitas algorithm
    first_day_of_month_wday <- function(dx) {
       day(dx) <- 1
       wday(dx)
     }
    tic(msg = 'Freitas')
    Freitas<- ceiling((day(some_dates) + first_day_of_month_wday(some_dates) - 1) / 7)
    toc()
    Freitas: 0.97 sec elapsed
    



    Fastest correct algorithm by about at least 5X

    require(lubridate)

    (5 + day(some_dates) + wday(floor_date(some_dates, 'month'))) %/% 7

    # some_dates above is any vector of dates, like:
    some_dates<- seq(ymd(20190101), today(), 'day')
    



    Function Implementation

    I also wrote a generalized function for it that performs either month or year week counts, begins on a day you choose (i.e. say you want to start your week on Monday), labels output for easy checking, and is still extremely fast thanks to lubridate.

    nth_week<- function(dates = NULL,
                        count_weeks_in = c("month","year"),
                        begin_week_on = "Sunday"){
    
      require(lubridate)
    
      count_weeks_in<- tolower(count_weeks_in[1])
    
      # day_names and day_index are for beginning the week on a day other than Sunday
      # (this vector ordering matters, so careful about changing it)
      day_names<- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")
    
      # index integer of first match
      day_index<- pmatch(tolower(begin_week_on),
                         tolower(day_names))[1]
    
    
      ### Calculate week index of each day
    
      if (!is.na(pmatch(count_weeks_in, "year"))) {
    
        # For year:
        # sum the day of year, index for day of week at start of year, and constant 5 
        #  then integer divide quantity by 7   
        # (explicit on package so lubridate and data.table don't fight)
        n_week<- (5 + 
                    lubridate::yday(dates) + 
                    lubridate::wday(floor_date(dates, 'year'), 
                                    week_start = day_index)
        ) %/% 7
    
      } else {
    
        # For month:
        # same algorithm as above, but for month rather than year
        n_week<- (5 + 
                    lubridate::day(dates) + 
                    lubridate::wday(floor_date(dates, 'month'), 
                                    week_start = day_index)
        ) %/% 7
    
      }
    
      # naming very helpful for review
      names(n_week)<- paste0(lubridate::wday(dates,T), '-', dates)
    
      n_week
    
    }
    



    Function Output

    # Example raw vector output: 
    some_dates<- seq(ymd(20190930), today(), by='day')
    nth_week(some_dates)
    
    Mon-2019-09-30 Tue-2019-10-01 Wed-2019-10-02 
                 5              1              1 
    Thu-2019-10-03 Fri-2019-10-04 Sat-2019-10-05 
                 1              1              1 
    Sun-2019-10-06 Mon-2019-10-07 Tue-2019-10-08 
                 2              2              2 
    Wed-2019-10-09 Thu-2019-10-10 Fri-2019-10-11 
                 2              2              2 
    Sat-2019-10-12 Sun-2019-10-13 
                 2              3 
    
    # Example tabled output:
    library(tidyverse)
    
    nth_week(some_dates) %>% 
      enframe('DATE','nth_week_default') %>% 
      cbind(some_year_day_options = as.vector(nth_week(some_dates, count_weeks_in = 'year', begin_week_on = 'Mon')))
    
                 DATE nth_week_default some_year_day_options
    1  Mon-2019-09-30                5                    40
    2  Tue-2019-10-01                1                    40
    3  Wed-2019-10-02                1                    40
    4  Thu-2019-10-03                1                    40
    5  Fri-2019-10-04                1                    40
    6  Sat-2019-10-05                1                    40
    7  Sun-2019-10-06                2                    40
    8  Mon-2019-10-07                2                    41
    9  Tue-2019-10-08                2                    41
    10 Wed-2019-10-09                2                    41
    11 Thu-2019-10-10                2                    41
    12 Fri-2019-10-11                2                    41
    13 Sat-2019-10-12                2                    41
    14 Sun-2019-10-13                3                    41
    

    Hope this work saves people the time of having to weed through all the responses to figure out which are correct.

    0 讨论(0)
  • 2020-11-30 04:56

    Using lubridate you can do

    ceiling((day(date) + first_day_of_month_wday(date) - 1) / 7)
    

    Where the function first_day_of_month_wday returns the weekday of the first day of month.

    first_day_of_month_wday <- function(dx) {
      day(dx) <- 1
      wday(dx)
    }
    

    This adjustment must be done in order to get the correct week number otherwise if you have the 7th day of month on a Monday you will get 1 instead of 2, for example. This is only a shift in the day of month. The minus 1 is necessary because when the first day of month is sunday the adjustment is not needed, and the others weekdays follow this rule.

    0 讨论(0)
  • 2020-11-30 04:57

    There is a simple way to do it with lubridate package:

    isoweek() returns the week as it would appear in the ISO 8601 system, which uses a reoccurring leap week.

    epiweek() is the US CDC version of epidemiological week. It follows same rules as isoweek() but starts on Sunday. In other parts of the world the convention is to start epidemiological weeks on Monday, which is the same as isoweek().

    Reference here

    0 讨论(0)
提交回复
热议问题