Celery: How to limit the number of tasks in the queue and stop feeding when it is full?

Question:

I am very new to Celery and here is the question I have:

Suppose I have a script that is supposed to constantly fetch new data from the DB and send it to workers using Celery.

tasks.py

# Celery Task
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def process_data(x):
    # Do something with x
    pass

fetch_db.py

# Fetch new data from DB and dispatch to workers.
from time import sleep

from tasks import process_data

while True:
    # Run DB query here and store the new data in fetched_data
    fetched_data = None  # placeholder: replace with the real query result

    process_data.delay(fetched_data)

    sleep(30)

Here is my concern: the data is fetched every 30 seconds, but process_data() could take much longer. Depending on the number of workers (especially if there are too few), the queue might get throttled, as I understand it.

  1. I cannot increase the number of workers.
  2. I can modify the code so that it refrains from feeding the queue when it is full.

The question is: how do I set the queue size, and how do I know when it is full? In general, how should I deal with this situation?

You can set the RabbitMQ x-max-length queue argument using kombu.

Example:

import time
from celery import Celery
from kombu import Queue, Exchange

class Config(object):
    BROKER_URL = "amqp://guest@localhost//"

    CELERY_QUEUES = (
        Queue(
            'important',
            exchange=Exchange('important'),
            routing_key="important",
            queue_arguments={'x-max-length': 10}
        ),
    )

app = Celery('tasks')
app.config_from_object(Config)


@app.task(queue='important')
def process_data(x):
    pass

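Or use a policy (this one caps the total size of message bodies in the queue named one-meg at 1,000,000 bytes):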

rabbitmqctl set_policy Ten "^one-meg$" '{"max-length-bytes":1000000}' --apply-to queues
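
Note that neither setting tells the producer that the queue is full: by default RabbitMQ drops the oldest messages once the limit is reached. To stop feeding instead, one option is to check the queue depth before each dispatch. The following is only a sketch under assumptions: it presumes the amqp (RabbitMQ) transport, and the names queue_depth, has_room and feed_forever are hypothetical helpers, not part of Celery. The reported count covers ready messages only, so tasks already prefetched by workers are not included.

```python
import time


def queue_depth(app, queue_name):
    """Return the number of ready messages waiting in `queue_name`.

    `app` is assumed to be a Celery app connected to RabbitMQ.
    A passive declare only inspects the queue; it does not create or
    modify it, and it raises an error if the queue does not exist.
    """
    with app.connection_or_acquire() as conn:
        info = conn.default_channel.queue_declare(queue=queue_name, passive=True)
        return info.message_count


def has_room(depth, max_length):
    """True while the queue holds fewer than `max_length` ready messages."""
    return depth < max_length


def feed_forever(app, task, fetch, queue_name="important",
                 max_length=10, interval=30):
    """Hypothetical producer loop: skip a cycle whenever the queue is full.

    `fetch` is assumed to be a callable that runs the DB query and
    returns the new data.
    """
    while True:
        if has_room(queue_depth(app, queue_name), max_length):
            task.delay(fetch())
        time.sleep(interval)
```

In fetch_db.py this could replace the bare loop, for example feed_forever(app, process_data, fetch) with fetch wrapping the DB query. The check and the publish are not atomic, so with several producers the queue can still briefly exceed the limit; keeping x-max-length as a safety net covers that case.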