celery

Running “unique” tasks with celery

Submitted on 2019-11-27 00:09:08
I use Celery to update RSS feeds in my news aggregation site. I use one @task for each feed, and things seem to work nicely. There's a detail I'm not sure I'm handling well, though: all feeds are updated once every minute with a @periodic_task, but what if a feed is still updating from the last periodic task when a new one is started? (For example, if the feed is really slow, or offline and the task is held in a retry loop.) Currently I store task results and check their status like this:

    import socket
    from datetime import timedelta
    from celery.decorators import task, periodic_task
    from
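
The excerpt is cut off before the actual check. A common way to keep runs of the same feed from overlapping is a cache-based lock that a new run must acquire before doing any work; the sketch below illustrates that idea, assuming Django's cache backend and a hypothetical update_feed(feed_url) task, not the code from the original question.

    from contextlib import contextmanager

    from celery import shared_task
    from django.core.cache import cache

    LOCK_EXPIRE = 60 * 5  # let the lock expire in case a worker dies mid-run

    @contextmanager
    def feed_lock(feed_url):
        lock_id = "update-feed-%s" % feed_url
        # cache.add only succeeds if the key does not already exist (atomic on most backends)
        acquired = cache.add(lock_id, "locked", LOCK_EXPIRE)
        try:
            yield acquired
        finally:
            if acquired:
                cache.delete(lock_id)

    @shared_task
    def update_feed(feed_url):
        with feed_lock(feed_url) as acquired:
            if not acquired:
                return  # the previous update of this feed is still running
            # ... fetch and parse the feed here ...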

Send log messages from all celery tasks to a single file

Submitted on 2019-11-27 00:00:44
Question: I'm wondering how to set up a more specific logging system. All my tasks use logger = logging.getLogger(__name__) as a module-wide logger. I want Celery to log to "celeryd.log" and my tasks to "tasks.log", but I have no idea how to get this working. Using CELERYD_LOG_FILE from django-celery I can route all celeryd-related log messages to celeryd.log, but there is no trace of the log messages created in my tasks.

Answer 1: Note: This answer is outdated as of Celery 3.0, where you now use get_task
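
The answer is cut off at get_task, presumably get_task_logger. A minimal sketch of routing task log messages to their own file on Celery 3.x and later, using the after_setup_task_logger signal; the broker URL and file name are placeholders, not values from the question:

    import logging

    from celery import Celery
    from celery.signals import after_setup_task_logger
    from celery.utils.log import get_task_logger

    app = Celery("proj", broker="redis://localhost:6379/0")

    # use this inside task modules instead of logging.getLogger(__name__)
    logger = get_task_logger(__name__)

    @after_setup_task_logger.connect
    def add_task_file_handler(logger=None, **kwargs):
        # attach a dedicated file handler to the task logger only
        handler = logging.FileHandler("tasks.log")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)

    @app.task
    def do_work():
        logger.info("this line ends up in tasks.log")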

Retry Celery tasks with exponential back off

Submitted on 2019-11-26 23:59:56
Question: For a task like this:

    from celery.decorators import task

    @task()
    def add(x, y):
        if not x or not y:
            raise Exception("test error")
        return self.wait_until_server_responds(

if it throws an exception and I want to retry it from the daemon side, how can I apply an exponential back-off algorithm, i.e. wait 2^2, 2^3, 2^4, etc. seconds? Also, is the retry maintained on the server side, so that if the worker happens to get killed, the next worker that spawns will take over the retry task?

Answer 1: The task
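
The answer is truncated, but one hedged way to get exponential back-off on current Celery versions is to bind the task and compute the countdown from the retry counter; this sketch is an illustration, not the accepted answer's exact code:

    from celery import shared_task

    @shared_task(bind=True, max_retries=5)
    def add(self, x, y):
        try:
            if not x or not y:
                raise Exception("test error")
            return x + y
        except Exception as exc:
            # wait 1, 2, 4, 8, 16 ... seconds between successive attempts
            raise self.retry(exc=exc, countdown=2 ** self.request.retries)

Because the retried message is re-published to the broker with an ETA, the retry itself is not tied to the worker that scheduled it; whether an in-flight attempt survives a worker crash additionally depends on settings such as task_acks_late.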

How to use Flask-SQLAlchemy in a Celery task

Submitted on 2019-11-26 23:49:45
Question: I recently switched to Celery 3.0. Before that I was using Flask-Celery in order to integrate Celery with Flask. Although it had many issues, like hiding some powerful Celery functionality, it allowed me to use the full context of the Flask app, and especially Flask-SQLAlchemy. In my background tasks I am processing data and using the SQLAlchemy ORM to store it. The maintainer of Flask-Celery has dropped support for the plugin. The plugin was pickling the Flask instance in the task so I could
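
The usual replacement for that pattern (close to the one shown in the Flask documentation) is a task base class that wraps every task call in the Flask application context, so Flask-SQLAlchemy sessions work as they do in views. A sketch, with the model import being a hypothetical placeholder:

    from celery import Celery
    from flask import Flask

    def make_celery(app):
        celery = Celery(app.import_name, broker=app.config["CELERY_BROKER_URL"])
        celery.conf.update(app.config)

        class ContextTask(celery.Task):
            # run every task inside the Flask application context
            def __call__(self, *args, **kwargs):
                with app.app_context():
                    return super().__call__(*args, **kwargs)

        celery.Task = ContextTask
        return celery

    app = Flask(__name__)
    app.config["CELERY_BROKER_URL"] = "redis://localhost:6379/0"
    celery = make_celery(app)

    @celery.task
    def process_record(record_id):
        from myapp.models import db, Record  # hypothetical Flask-SQLAlchemy model
        record = Record.query.get(record_id)
        record.processed = True
        db.session.commit()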

Retrieve list of tasks in a queue in Celery

Submitted on 2019-11-26 23:48:06
How can I retrieve a list of tasks in a queue that are yet to be processed?

semarj

EDIT: See other answers for getting a list of tasks in the queue.

You should look here: Celery Guide - Inspecting Workers. Basically this:

    >>> from celery.task.control import inspect

    # Inspect all nodes.
    >>> i = inspect()

    # Show the items that have an ETA or are scheduled for later processing
    >>> i.scheduled()

    # Show tasks that are currently active.
    >>> i.active()

    # Show tasks that have been claimed by workers
    >>> i.reserved()

Depending on what you want: if you are using RabbitMQ, use this in a terminal: sudo
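
The terminal command is cut off, but it is presumably rabbitmqctl. A sketch of both sides on current Celery versions, where inspect() hangs off the app instance; the app name and broker URL are placeholders:

    from celery import Celery

    app = Celery("proj", broker="amqp://guest@localhost//")

    i = app.control.inspect()
    print(i.scheduled())  # tasks with an ETA or countdown
    print(i.active())     # tasks currently executing on workers
    print(i.reserved())   # tasks prefetched by workers but not yet started

    # On the broker side, RabbitMQ can report queue depths directly:
    #   sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged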

Running Scrapy spiders in a Celery task

Submitted on 2019-11-26 23:38:37
I have a Django site where a scrape happens when a user requests it, and my code kicks off a Scrapy spider standalone script in a new process. Naturally, this isn't working well as the number of users increases. Something like this:

    class StandAloneSpider(Spider):
        # a regular spider

    settings.overrides['LOG_ENABLED'] = True
    # more settings can be changed...

    crawler = CrawlerProcess(settings)
    crawler.install()
    crawler.configure()

    spider = StandAloneSpider()
    crawler.crawl(spider)
    crawler.start()

I've decided to use Celery and use workers to queue up the crawl requests. However, I'm running into issues with
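
The excerpt ends before the actual problem, but the usual stumbling block with this setup is Twisted's ReactorNotRestartable error (the same issue as in the later question below). One common workaround is to run each crawl in a fresh child process from the Celery task, so the reactor starts and stops only once per process. A sketch assuming a current Scrapy project layout, using billiard (Celery's multiprocessing fork) because plain multiprocessing may refuse to fork from a daemonized pool worker:

    from billiard import Process  # ships with Celery; mirrors the multiprocessing API
    from celery import shared_task
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    def _run_spider(spider_name, **spider_kwargs):
        # runs in a fresh child process, so the Twisted reactor starts cleanly every time
        process = CrawlerProcess(get_project_settings())
        process.crawl(spider_name, **spider_kwargs)
        process.start()  # blocks until the crawl finishes

    @shared_task
    def crawl(spider_name, **spider_kwargs):
        p = Process(target=_run_spider, args=(spider_name,), kwargs=spider_kwargs)
        p.start()
        p.join()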

Cancel an already executing task with Celery?

Submitted on 2019-11-26 23:37:25
I have been reading the docs and searching but cannot seem to find a straight answer: can you cancel an already executing task? (As in: the task has started, takes a while, and halfway through it needs to be cancelled.) I found this in the Celery FAQ:

    >>> result = add.apply_async(args=[2, 2], countdown=120)
    >>> result.revoke()

But I am unclear if this will cancel queued tasks or if it will kill a running process on a worker. Thanks for any light you can shed!

mher

revoke cancels the task execution. If a task is revoked, the workers ignore the task and do not execute it. If you don't use
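
To stop a task that is already running, and not just one still sitting in the queue, revoke also accepts a terminate flag that signals the worker child process executing it. A short sketch (add being the task from the question); note that killing a task mid-run can leave its work half finished:

    from celery.result import AsyncResult

    result = add.apply_async(args=[2, 2], countdown=120)

    # Prevent execution if the task has not started yet
    result.revoke()

    # Forcibly stop it even if it is already executing, by signalling
    # the worker child process that is running it
    result.revoke(terminate=True, signal="SIGKILL")

    # The same works later from just the task id
    AsyncResult(result.id).revoke(terminate=True, signal="SIGKILL")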

Understanding celery task prefetching

Submitted on 2019-11-26 21:54:54
I just found out about the configuration option CELERYD_PREFETCH_MULTIPLIER (docs). The default is 4, but (I believe) I want prefetching off, or as low as possible. I set it to 1 now, which is close enough to what I'm looking for, but there are still some things I don't understand: Why is this prefetching a good idea? I don't really see a reason for it, unless there's a lot of latency between the message queue and the workers (in my case, they are currently running on the same host and at worst might eventually run on different hosts in the same data center). The documentation only mentions
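
For reference, a sketch of the configuration that gets closest to "no prefetching" on current Celery versions, where the old CELERYD_PREFETCH_MULTIPLIER name maps to worker_prefetch_multiplier; the app name and broker URL are placeholders:

    from celery import Celery

    app = Celery("proj", broker="redis://localhost:6379/0")

    # Reserve at most one extra message per worker process
    app.conf.worker_prefetch_multiplier = 1

    # Acknowledge messages only after the task finishes, so a worker
    # never holds more work than it is actually executing
    app.conf.task_acks_late = True

The combination trades some throughput for fairer scheduling of long-running tasks, which is the usual reason the multiplier defaults to a value above 1.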

Run a Scrapy spider in a Celery Task

Submitted on 2019-11-26 21:44:18
This is not working anymore, Scrapy's API has changed. The documentation now features a way to "Run Scrapy from a script", but I get the ReactorNotRestartable error. My task:

    from celery import Task
    from twisted.internet import reactor
    from scrapy.crawler import Crawler
    from scrapy import log, signals
    from scrapy.utils.project import get_project_settings
    from .spiders import MySpider

    class MyTask(Task):
        def run(self, *args, **kwargs):
            spider = MySpider
            settings = get_project_settings()
            crawler = Crawler(settings)
            crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
            crawler
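
Besides running each crawl in a child process (see the sketch under the earlier Scrapy question above), another hedged option is to keep a single long-lived reactor in a background thread with crochet and drive crawls through CrawlerRunner; this is an alternative pattern, not the answer from the original thread, and the spider import is a placeholder:

    import crochet
    crochet.setup()  # starts the Twisted reactor once, in a background thread

    from celery import shared_task
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.project import get_project_settings

    runner = CrawlerRunner(get_project_settings())

    @crochet.wait_for(timeout=3600)
    def _crawl(spider_cls, **kwargs):
        # returns a Deferred; crochet blocks the calling thread until it fires
        return runner.crawl(spider_cls, **kwargs)

    @shared_task
    def run_spider(**kwargs):
        from myproject.spiders import MySpider  # hypothetical project module
        _crawl(MySpider, **kwargs)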

Celery. Decrease number of processes

Submitted on 2019-11-26 20:55:48
Question: Is there any way to limit the number of worker processes in Celery? I have a small server and Celery always creates 10 processes on a 1-core processor. I want to limit this to 3 processes.

Answer 1: I tried setting concurrency to 1 and max_tasks_per_child to 1 in my settings.py file and ran 3 tasks at the same time. It just spawns 1 process as a user and the other 2 as celery. It should just run 1 process and then wait for it to finish before running the next one. I am using django-celery.
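
A sketch of capping the pool size; the modern setting name is worker_concurrency (CELERYD_CONCURRENCY in old django-celery-style settings), and the app name and broker URL here are placeholders:

    from celery import Celery

    app = Celery("proj", broker="redis://localhost:6379/0")

    # Limit the prefork pool to 3 child processes
    app.conf.worker_concurrency = 3

    # The same limit can be set per invocation on the command line:
    #   celery -A proj worker --concurrency=3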