Django: Running a process each day at specified user local time

Posted by 感情迁移 on 2021-02-19 09:23:17

Question


I'm porting my site to Python/Django, and one of the main exercises involves a set of data where users can schedule an event in their local time and have it happen every day.

Currently I have a cron job (on another server) that calls a method every 5 minutes or so and checks whether anything needs to be scheduled over the next (let's say) 10 minutes.

I store a time value and the user's local timezone for each job.

What is the best way to do this?

Right now I am working on a function that:

  • Converts server time to user local time.
  • Creates a local datetime object localized "today" and the time the user specified
  • Checks to see if it is within 10 minutes of the user's alarm going off.
  • If the local time is between 23:50 and 23:59:59 and the user's set time is between 00:00 and 00:10, the localized "today" is created with tomorrow's date (e.g. if it is 2 minutes to midnight and the user wants an event at 12:01 AM, I calculate the event with tomorrow's date).
  • I set a last_scheduled field when it is scheduled and a last_fired field to ensure I don't send multiples.

If it is within 10 minutes from now, I schedule a task (thread, whatever) that will fire shortly.

I'm not really sure what the best practice is here. Should I:

  • Keep checking whether I have any events in the near future and schedule short-lived tasks for them?
  • Pre-generate all of my times ahead of time (maybe a month at a time)?
  • Do something else entirely?

I was also thinking I could always just schedule the "next" event, but my worry is that if my server went offline and I missed the "next" event, the following day would never get scheduled.

To Clarify:

  • I store the time and timezone per job (e.g. noon in US/Eastern).
  • I am correcting for DST, so when calculating UTC time I take today's date in UTC, convert it to local time, then use that to calculate the deltas. I am using pytz and normalize() to ensure I don't get any wonky DST issues.
  • I do have a last-scheduled and a last-ran time, to ensure I don't double-execute.

Looking at the solution below, I guess my only other observation is that if for whatever reason I missed a scheduled time, my "next" would never happen because it was then in the past. I suppose I could make a 2nd function to fix any missed alarms.

Edit: After grokking the answers below, I have come up with the following less-worse scenario:

I have the following fields

  • Last Event Execution Time
  • Last time event was scheduled
  • Next event execution time
  • Time of day, and timezone

I calculate and set next_run_time whenever I update the event or fire the event. This does the following:

  • If the event has a last run time, next_run_time is calculated to be at least 2 hours in the future (adding some padding to avoid DST issues).
  • If the event has never been run, it is scheduled at least 15 minutes in the future (to avoid multiple simultaneous schedules).

My scheduled job does the following:

  1. Checks all events that have a next_run_time in the next 15 minutes, and are not currently scheduled. Any matching are scheduled.
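The selection step above might be sketched like this. It is only an illustration: the `Event` class is a hypothetical stand-in for the real model, "currently scheduled" is assumed to mean "scheduled more recently than the last run", and overdue events are deliberately included so that a missed run is picked up on the next poll.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for the model; all datetimes are aware UTC values."""
    name: str
    next_run_time: datetime
    last_scheduled: Optional[datetime] = None
    last_run_time: Optional[datetime] = None

    def is_currently_scheduled(self) -> bool:
        # Assumption: "scheduled" means queued but not yet run, i.e. the
        # last_scheduled stamp is newer than the last run (or there was no run).
        if self.last_scheduled is None:
            return False
        return self.last_run_time is None or self.last_scheduled > self.last_run_time


def events_to_schedule(events: List[Event], now_utc: datetime,
                       window: timedelta = timedelta(minutes=15)) -> List[Event]:
    """Pick events due within the window (or already overdue) that are not queued yet."""
    horizon = now_utc + window
    return [e for e in events
            if e.next_run_time <= horizon and not e.is_currently_scheduled()]
```

With a real Django model the same condition would more likely be expressed as a queryset filter on next_run_time rather than a Python loop.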

Scheduling a job:

  • Schedules a task, and sets the job as scheduled "now"

When the task executes (success):

  • last_run_time is updated to "now"
  • next_run_time is recalculated

If a task fails:

  • The job is rescheduled 30 seconds in the future. If it keeps failing beyond a threshold (3 minutes overdue in my case), the task is aborted and next_run_time is recalculated for the following day. This gets logged and hopefully doesn't happen too often.

This seems to mostly work because my events are always daily, so I can afford to add some padding to the times and avoid some hairy issues.
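As a rough sketch of that failure rule (the function name and return shape are hypothetical; only the 30-second delay and 3-minute threshold come from the description above):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

RETRY_DELAY = timedelta(seconds=30)   # retry interval from the description
MAX_OVERDUE = timedelta(minutes=3)    # give up once the run is this late


def on_failure(scheduled_for: datetime, now_utc: datetime) -> Tuple[str, Optional[datetime]]:
    """Decide what to do after a failed attempt.

    Returns ("retry", when) while the run is less than MAX_OVERDUE late,
    otherwise ("abort", None); on abort the caller would log the failure and
    recalculate next_run_time for the following day.
    """
    if now_utc - scheduled_for >= MAX_OVERDUE:
        return ("abort", None)
    return ("retry", now_utc + RETRY_DELAY)
```

Measuring lateness against the originally scheduled time, rather than counting attempts, keeps the cutoff the same no matter how long each attempt takes.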


Answer 1:


I'll keep off the Python/Django specifics, since that's not my area of expertise. But in general, a task scheduler of the type you are describing should act as follows (IMHO):

  • Separate the schedule definition from the execution time
  • The schedule definition should be expressed in the user's local time and include the time zone ID.
  • The execution time should be in terms of UTC.
  • When the task executes, it should calculate the next execution time from the schedule.

Let's run through an example.

  • The user says, "Run every night at midnight, in US Eastern Time".
  • We store a schedule of "Daily, 00:00, America/New_York".
  • We calculate the first execution time to be 2013-06-30T04:00:00Z.
  • Using whatever mechanism you like, run the job at the execution time. If you are polling periodically for jobs that need running, just see if the time has passed (ExecTime <= utcnow). If you can rely on an eventing system, cron job, etc., that's probably better.
  • When the job runs, use the schedule to calculate the next execution time.
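The example above can be sketched in Python roughly as follows. This assumes Python 3.9+ and the standard-library zoneinfo module (the question uses pytz, which would work as well); `DailySchedule`, `next_execution_utc` and `is_due` are illustrative names, not part of any library.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class DailySchedule:
    """The schedule definition: a wall-clock time plus an IANA time zone ID."""
    local_time: time
    tz_name: str

    def next_execution_utc(self, after_utc: datetime) -> datetime:
        """First moment strictly after `after_utc` when the local clock reads local_time."""
        tz = ZoneInfo(self.tz_name)
        day = after_utc.astimezone(tz).date()
        while True:
            local_dt = datetime.combine(day, self.local_time, tzinfo=tz)
            candidate = local_dt.astimezone(timezone.utc)
            if candidate > after_utc:
                return candidate
            day += timedelta(days=1)


def is_due(exec_time_utc: datetime, now_utc: datetime) -> bool:
    # The polling check from the text: ExecTime <= utcnow.
    return exec_time_utc <= now_utc


schedule = DailySchedule(time(0, 0), "America/New_York")
first = schedule.next_execution_utc(datetime(2013, 6, 29, 12, 0, tzinfo=timezone.utc))
print(first.isoformat())  # 2013-06-30T04:00:00+00:00
```

Because the next execution time is always derived from the schedule and a reference instant, passing the current time instead of the last run time also recovers from a missed run: the result is simply the next occurrence still in the future.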

Why schedule in local time? Well, in the case of Eastern time, it will transition between -5 hours from UTC and -4 hours, because of Daylight Saving Time. If the schedule was strictly UTC based, then after the DST transition you'd find jobs running at what the user perceived to be the wrong time.
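To make the offset change concrete, here is a small check, again assuming Python 3.9+ with zoneinfo; the helper name is made up for illustration. The same local midnight maps to a different UTC hour depending on the season.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")


def utc_hour_of_local_midnight(year: int, month: int, day: int) -> int:
    """UTC hour at which midnight Eastern time falls on the given local date."""
    local_midnight = datetime(year, month, day, 0, 0, tzinfo=eastern)
    return local_midnight.astimezone(timezone.utc).hour


summer = utc_hour_of_local_midnight(2013, 7, 1)   # daylight time, UTC-4
winter = utc_hour_of_local_midnight(2013, 12, 1)  # standard time, UTC-5
print(summer, winter)  # 4 5
```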

Also, you should think about handling failures, retries, etc. And you don't want the job to run more than once per scheduled execution, so you might want a way to mark it as "in process" if you have more than one program checking for tasks. Sometimes you might need a more complex locking strategy to ensure multiple worker processes don't pick up the same task. This is a bit beyond the scope of what I can write here.
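One common way to implement the "in process" marker is a conditional update, so that only one worker can win the claim. The sketch below uses the standard-library sqlite3 module purely for illustration; the `jobs` table, its columns and the `claim_job` helper are all hypothetical, and a production setup with several worker processes would need a shared database and probably a timeout for stale claims.

```python
import sqlite3


def claim_job(conn: sqlite3.Connection, job_id: int, worker: str) -> bool:
    """Flip a job from 'pending' to 'in_process'; returns True only for the winner."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE jobs SET status = 'in_process', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker, job_id),
        )
    # The WHERE clause matches only while the job is still pending, so a
    # second caller updates zero rows and loses the claim.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT, claimed_by TEXT)")
conn.execute("INSERT INTO jobs (id, status) VALUES (1, 'pending')")
conn.commit()

first = claim_job(conn, 1, "worker-a")
second = claim_job(conn, 1, "worker-b")
print(first, second)  # True False
```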

You should also think about how you want to handle ambiguities in local time caused by Daylight Saving Time transitions. Thinking about "fall-back" style transitions, if the user says to run at "1:30 AM every night", but there is one night a year where 1:30 happens twice, what do you want to do? If you do nothing special, it will run at the first occurrence, which is usually the daylight time. The user might expect the standard time, so you might have to check for this. Even if you just run at midnight, you're not exempt from this decision. There are several time zones that do their transition right at the stroke of midnight (Brazil, for example).
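In Python, the two occurrences can be told apart with the `fold` attribute of `datetime` (PEP 495). The following is a sketch assuming Python 3.9+ and zoneinfo; `is_ambiguous` is a hypothetical helper, not a library function.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")

# On 2013-11-03 clocks fall back from 02:00 EDT to 01:00 EST, so 01:30 occurs twice.
first = datetime(2013, 11, 3, 1, 30, tzinfo=tz)           # fold=0: first occurrence (EDT)
second = datetime(2013, 11, 3, 1, 30, fold=1, tzinfo=tz)  # fold=1: second occurrence (EST)


def is_ambiguous(naive: datetime, zone: ZoneInfo) -> bool:
    """True if the wall-clock time occurs twice in the zone (a fall-back overlap).

    In an overlap the fold=0 offset is larger than the fold=1 offset; in a
    spring-forward gap it is the other way round, so gaps are not reported.
    """
    early = naive.replace(tzinfo=zone, fold=0).utcoffset()
    late = naive.replace(tzinfo=zone, fold=1).utcoffset()
    return early > late


print(first.astimezone(timezone.utc).isoformat())   # 2013-11-03T05:30:00+00:00
print(second.astimezone(timezone.utc).isoformat())  # 2013-11-03T06:30:00+00:00
```

A scheduler that prefers the standard-time occurrence could check `is_ambiguous` for the local time and, when it is true, build the datetime with `fold=1`.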

If all of this sounds like too much work, you might just want to look for a job scheduler that is already written. For example, Quartz on Java, or Quartz.NET on the .NET stack. I'm not directly familiar with it, but a search turned up APScheduler for Python, which looks pretty similar.




Answer 2:


(I would post this as a comment, but SO does not permit comments from new users.) Also take a look at Celery; maybe it will help: http://docs.celeryproject.org/en/latest/userguide/tasks.html



Source: https://stackoverflow.com/questions/17384596/django-running-a-process-each-day-at-specified-user-local-time
