celery

Celery Consumer SQS Messages

放肆的年华 submitted on 2019-12-04 19:02:27
I am new to Celery and SQS and would like to use them to periodically check for messages stored in SQS and then fire a consumer. The consumer and Celery both live on EC2, while the messages are sent from GAE using the boto library. Currently, I am confused about two things. First, in the message body built in creating_msg_gae.py, what task information should I put? I assume it would be the name of my Celery task. Second, in that same message body, is url the argument that will be processed by my consumer (the function do_something_url(url) in tasks.py)? Currently, I am running celery with
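
For orientation, here is a hedged sketch of how the names mentioned in the question usually fit together when Celery owns the SQS queue; only do_something_url and tasks.py come from the question, while the broker URL and example argument are assumptions. When the producer can use the Celery client, you normally do not hand-craft the SQS message body at all; you enqueue by task name and Celery serializes the message in its own format.

    # tasks.py on the EC2 consumer -- a sketch, not the asker's actual code
    from celery import Celery

    app = Celery('tasks', broker='sqs://')   # AWS credentials usually come from the environment

    @app.task(name='tasks.do_something_url')
    def do_something_url(url):
        # url is the argument carried inside the task message
        print('processing %s' % url)

    # producer side: enqueue by task name instead of writing a raw SQS body
    app.send_task('tasks.do_something_url', args=['http://example.com'])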

#SORA# Celery study notes

删除回忆录丶 submitted on 2019-12-04 19:01:06
I recently read the task section of the Celery documentation; here is a short summary.
- During actual processing, we can use a logging-like module to produce logs.
- For certain tasks, you can configure retrying, rejecting, or ignoring the task when specific conditions are met.
- When defining a task, the bind parameter in @app.task(bind=True) lets you access the request inside the task, such as its id, group, and similar information.
- @app.task(ARGS=VALUE): several parameters can be set here, such as name; some of them mirror the global settings (the CELERY_xxx_xxx options).
- You can define custom task states (the defaults are pending, started, success, failure, retry, revoked).
- When you use pickle as the serializer, you should define which exceptions can be pickled (I use json, so I simply ignore this).
- Instantiation: you can subclass Task to define a new class; its __init__ method is called only once and the instance persists afterwards. When your task uses this new class as its base, later calls to the task still see the effect of that __init__. (Use case: keep a database connection in the custom class so the task does not have to create a connection every time, but can operate on the existing attribute.) e.g.:

    from celery import Task

    class DatabaseTask(Task):
        abstract = True
        _db = None

        @property
        def db(self):
            # connect lazily and reuse the connection across task invocations
            # (Database here stands in for whatever DB client you use)
            if self._db is None:
                self._db = Database.connect()
            return self._db
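
As a small illustration of the bind=True point above, here is a hedged sketch; the app name, broker URL, and task name are made up and not from the note.

    from celery import Celery

    app = Celery('notes_demo', broker='redis://localhost:6379/0')

    @app.task(bind=True)
    def inspect_request(self, x):
        # with bind=True the first argument is the task instance,
        # so the request context (id, group, retries, ...) is available
        print(self.request.id, self.request.group, self.request.retries)
        return x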

Airflow + celery or dask. For what, when?

為{幸葍}努か submitted on 2019-12-04 17:44:17
Question: I read the following in the official Airflow documentation: What does this mean exactly? What do the authors mean by scaling out? That is, when is it not enough to use Airflow alone, and when would anyone use Airflow in combination with something like Celery? (The same question applies to dask.) Answer 1: In Airflow terminology, an "Executor" is the component responsible for running your task. The LocalExecutor does this by spawning threads on the computer Airflow runs on and letting a thread execute the task. Naturally your
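
To make the "scaling out" distinction concrete, the executor is chosen in Airflow's configuration file; the snippet below is a hedged sketch of the relevant airflow.cfg entry, not a quote from the documentation the question refers to.

    [core]
    # single machine: tasks run in local subprocesses on the host Airflow itself runs on
    executor = LocalExecutor

    # scaling out: hand task execution to a pool of Celery workers on other machines
    # executor = CeleryExecutor

With CeleryExecutor, the broker and result backend are configured in the [celery] section of the same file, and worker processes are started on the additional machines with Airflow's worker command.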

Problems with the new style celery api

半城伤御伤魂 submitted on 2019-12-04 17:38:30
I have a class that extends Celery's Task. It runs just fine with the old-style API, but I am having problems converting it to the new API.

    # In app/tasks.py
    from celery import Celery, Task

    celery = Celery()

    @celery.task
    class CustomTask(Task):
        def run(self, x):
            try:
                # do something
                pass
            except Exception, e:
                self.retry(args=[x], exc=e)

And then I run the task like so:

    CustomTask().apply_async(args=[x], queue='q1')

And I get the error:

    TypeError: run() takes exactly 2 arguments (1 given)

This SO answer seems to do the same thing, and it was accepted, so presumably it works. Can anyone help me out and
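
Not taken from the question or its accepted answer, but as a point of comparison: with the newer function-based API the same retry logic is usually written as a bound task, roughly as below; the task name and do_something are placeholders.

    from celery import Celery

    celery = Celery()

    @celery.task(bind=True)
    def custom_task(self, x):
        try:
            do_something(x)          # placeholder for the real work
        except Exception as exc:
            raise self.retry(args=[x], exc=exc)

    # dispatching looks the same, but on the function object rather than a new instance
    custom_task.apply_async(args=[x], queue='q1')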

Mixing django-celery and standalone celery

穿精又带淫゛_ submitted on 2019-12-04 17:07:51
We are running a website built with Django and Piston, and I want to implement Celery to offload tasks to an external server. I don't really want to run Django on the secondary server and would like to simply run a pure Python Celery worker. Is it possible for me to write simple function stubs on the Django server and put the actual function logic on the secondary server? i.e.:

Django side:

    from celery import task

    @task
    def send_message(fromUser=None, toUser=None, msgType=None, msg=None):
        pass

Server side:

    from celery import Celery

    celery = Celery('hello', broker='amqp://guest@localhost//')

    @celery
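
One common pattern for this split, offered here as a hedged sketch rather than as the thread's accepted answer: dispatch by task name from Django with send_task, so the Django side never needs the implementation at all. The task name and broker URL below are assumptions.

    # Django side: no task code at all, just a client pointed at the remote broker
    from celery import Celery

    client = Celery('stubs', broker='amqp://guest@secondary-host//')

    def send_message(fromUser=None, toUser=None, msgType=None, msg=None):
        # enqueue by name; the worker on the secondary server owns the real logic
        client.send_task('worker.send_message',
                         kwargs={'fromUser': fromUser, 'toUser': toUser,
                                 'msgType': msgType, 'msg': msg})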

Use Python standard logging in Celery

纵然是瞬间 submitted on 2019-12-04 17:04:12
Question: I have to implement Celery in a pre-existing system. The previous version of the system already used Python standard logging. My code is similar to the code below; process_one and process_two are non-Celery functions that log everywhere. We use the logging to track data loss if something bad happens.

    @task
    def add(x, y):
        process_one(x, y)
        process_two(x, y)

How can I implement Celery and use Python standard logging instead of Celery logging, so our old logging setup is not lost?
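
A hedged sketch of one commonly cited approach, not necessarily the answer given on the original thread: stop Celery from hijacking the root logger, or take over logging configuration yourself via the setup_logging signal. The setting name follows the Celery 3.x convention and the logging.conf path is a placeholder.

    # in the Celery configuration: keep Celery's hands off the root logger
    CELERYD_HIJACK_ROOT_LOGGER = False

    # or configure logging yourself; when this signal has a handler connected,
    # Celery skips its own logging setup entirely
    from celery.signals import setup_logging

    @setup_logging.connect
    def configure_logging(**kwargs):
        import logging.config
        logging.config.fileConfig('logging.conf', disable_existing_loggers=False)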

Celery SQS + Duplication of tasks + SQS visibility timeout

你说的曾经没有我的故事 submitted on 2019-12-04 17:01:33
Most of my Celery tasks have an ETA longer than the maximum visibility timeout defined by Amazon SQS. The Celery documentation says: This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop. So you have to increase the visibility timeout to match the time of the longest ETA you're planning to use. At the same time it also says that: The maximum visibility timeout supported by AWS as of this writing is 12 hours (43200 seconds). What should I do to avoid multiple execution of tasks in
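
For reference, raising the visibility timeout as the first quoted passage suggests is a one-line broker transport option; the snippet below is a sketch using the Celery 3.x-era setting name and the 12-hour AWS cap mentioned in the question.

    # push the visibility timeout up to the AWS maximum quoted above (12 hours)
    BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 43200}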

Can I use Luigi with Python Celery?

雨燕双飞 submitted on 2019-12-04 16:52:53
I am using Celery for my web application. Celery executes parent tasks, which then execute a further pipeline of tasks. My issues with Celery: I can't get the dependency graph and visualizer I get with Luigi, so I can't see what the status of my parent task is, and Celery does not provide a mechanism to restart a failed pipeline from where it failed. Both of these I can easily get from Luigi. So I was thinking that once Celery runs the parent task, inside that task I execute the Luigi pipeline. Is there going to be any issue with that, given that I need to autoscale the Celery workers based on queue size?
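
The arrangement described above, phrased as a sketch rather than a recommendation: a Celery task that simply kicks off a Luigi build with the local scheduler. The app name, broker URL, and MyPipelineRoot are made up for illustration.

    import luigi
    from celery import Celery

    app = Celery('pipelines', broker='amqp://guest@localhost//')

    @app.task
    def run_parent(date):
        # run the Luigi pipeline synchronously inside the Celery worker process;
        # Luigi handles the dependency graph and resumes from completed targets
        luigi.build([MyPipelineRoot(date=date)], local_scheduler=True)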

Celery: stuck in infinitely repeating timeouts (Timed out waiting for UP message)

a 夏天 submitted on 2019-12-04 16:23:13
I defined some tasks with a time limit of 1200:

    @celery.task(time_limit=1200)
    def create_ne_list(text):
        c = Client()
        return c.create_ne_list(text)

I'm also using the worker_process_init signal to do some initialization each time a new process starts:

    @worker_process_init.connect
    def init(sender=None, conf=None, **kwargs):
        init_system(celery.conf)
        init_pdf(celery.conf)

This initialization function takes several seconds to execute. Besides that, I'm using the following configuration:

    CELERY_RESULT_SERIALIZER = 'json'
    CELERY_TASK_SERIALIZER = 'json'
    CELERY_ACCEPT_CONTENT = ['json']
    BROKER_URL =
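
Offered as an assumption rather than the thread's answer: the "Timed out waiting for UP message" error is the parent worker deciding that a freshly forked child never came up, so a slow worker_process_init handler like the one above is a plausible trigger. On Celery 4.x and later there is a dedicated setting for how long the parent waits; the value below is a sketch, chosen to exceed the initialization time.

    # Celery 4.x setting: how long the master waits for a new child process
    # to report it is up before giving up (the default is only a few seconds)
    worker_proc_alive_timeout = 30.0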

The distributed task queue Celery: introduction and beyond

这一生的挚爱 submitted on 2019-12-04 16:05:19
1. Introduction

Celery is a simple, flexible, and reliable distributed task queue developed in Python. At its core it is a producer-consumer model: producers send tasks to a message queue and consumers are responsible for processing them. Celery focuses on real-time operation but also has good support for scheduling, and it can process millions of tasks per day. Features:
- Simple: once you are familiar with Celery's workflow, it is simple to configure and use.
- Highly available: when a task fails or the connection is interrupted during execution, Celery automatically tries to re-execute the task.
- Fast: a single Celery process can handle over a million tasks per minute.
- Flexible: almost every Celery component can be extended and customized.

Example application scenarios:
1. Web applications: when an operation a user triggers on the site takes a long time to complete, we can hand that operation to Celery and return to the user immediately, notifying the user once Celery has finished. This greatly improves the site's concurrency and the user experience.
2. Operational tasks: for example, in an ops scenario where some command or task must be executed in bulk on hundreds of machines, Celery handles this easily.
3. Scheduled tasks: scenarios such as exporting data reports or sending notifications on a schedule. Linux cron jobs can do this, but they are hard to manage, whereas Celery provides a management interface and a rich API.

2. Architecture & how it works

Celery consists of the following three parts: the message middleware (Broker), the task execution unit (Worker), and the result store (Backend), as shown in the figure below.

How it works: the task module (Task) contains asynchronous tasks and scheduled tasks. Among them
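
To ground the producer/consumer description above, here is a minimal hedged sketch of a Celery application; the broker and backend URLs and the add task are placeholders, not taken from the article.

    from celery import Celery

    # the Broker carries the task messages, the Backend stores the results
    app = Celery('demo',
                 broker='redis://localhost:6379/0',
                 backend='redis://localhost:6379/1')

    @app.task
    def add(x, y):
        # executed by a worker process (the consumer)
        return x + y

    # producer side: enqueue the task and fetch the result later
    result = add.delay(4, 6)
    print(result.get(timeout=10))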