Handling long running tasks in pika / RabbitMQ

Question

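We're trying to set up a basic directed queue system where a producer generates several tasks and one or more consumers grab a task at a time, process it, and acknowledge the message.

The problem is that processing can take 10-20 minutes, and we're not responding to messages during that time, which causes the server to disconnect us.

Here's some pseudocode for our consumer: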

#!/usr/bin/env python
import pika
import time

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print ' [*] Waiting for messages. To exit press CTRL+C'

def callback(ch, method, properties, body):
    long_running_task(connection)
    ch.basic_ack(delivery_tag = method.delivery_tag)

channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback,
                      queue='task_queue')

channel.start_consuming()

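After the first task completes, an exception is thrown somewhere deep inside BlockingConnection, complaining that the socket was reset. In addition, the RabbitMQ logs show that the consumer was disconnected for not responding in time (why it resets the connection rather than sending a FIN is strange, but we won't worry about that).

We searched around a lot because we believed this was a normal use case for RabbitMQ (having many long-running tasks that should be split up among many consumers), but it seems nobody else really had this issue. Finally we stumbled upon a thread that recommended using heartbeats and spawning long_running_task() in a separate thread.

So the code has become: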

#!/usr/bin/env python
import pika
import time
import threading

connection = pika.BlockingConnection(pika.ConnectionParameters(
        host='localhost',
        heartbeat_interval=20))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print ' [*] Waiting for messages. To exit press CTRL+C'

def thread_func(ch, method, body):
    long_running_task(connection)
    ch.basic_ack(delivery_tag = method.delivery_tag)

def callback(ch, method, properties, body):
    threading.Thread(target=thread_func, args=(ch, method, body)).start()

channel.basic_qos(prefetch_count=1)
channel.basic_consume(callback,
                      queue='task_queue')

channel.start_consuming()

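This seems to work, but it's very messy. Are we sure the ch object is thread-safe? Also, imagine that long_running_task() uses the connection parameter to add a task to a new queue (that is, the first part of this long process is done, so let's send the task on to the second part). So the thread is using the connection object. Is that thread-safe?

More to the point, what's the preferred way of doing this? I feel this is messy and possibly not thread-safe, so maybe we're not doing it right. Thanks!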

Answer 1

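For now, your best bet is to turn off heartbeats; this will keep RabbitMQ from closing the connection if you block for too long. I am experimenting with running pika's core connection management and IO loop in a background thread, but it's not stable enough to release.

In pika v1.1.0 this is ConnectionParameters(heartbeat=0).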

Answer 2

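I ran into the same problem. My solution is:

1. Turn off the heartbeat on the server side.
2. Estimate the maximum time a task can possibly take.
3. Set the client heartbeat timeout to the time obtained in step 2.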

Why this?

I tested the following cases:

case one

Server heartbeat turned on, 1800 s
Client heartbeat unset

I still get the error when a task runs for a very long time (more than 1800 s).

case two

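Server heartbeat turned off
Client heartbeat turned off

There is no error on the client side, except for one problem: when the client crashes (my OS restarted after some faults), the TCP connection can still be seen in the RabbitMQ Management plugin, which is confusing.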

case three

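Server heartbeat turned off
Client heartbeat turned on, set to the longest foreseeable run time

In this case, I can change the heartbeat dynamically for each individual client. In fact, I set the heartbeat on the machines that crash frequently. Moreover, I can see offline machines through the RabbitMQ Management plugin.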

Environment

OS: CentOS x86_64
pika: 0.9.13
rabbitmq: 3.3.1

Answer 3

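Please don't disable heartbeats!

As of Pika 0.12.0, please use the technique described in this example code to run your long-running task on a separate thread and then acknowledge the message from that thread.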
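The example code that answer links to is not reproduced in this thread, so the following is only a minimal sketch of the general idea, assuming a BlockingConnection on Pika 0.12.0 or later, where `connection.add_callback_threadsafe()` is available. The helper names `do_work_and_ack` and `make_on_message` and the `workers` list are hypothetical and exist only for illustration.

```python
import functools
import threading


def do_work_and_ack(connection, channel, delivery_tag, task):
    """Run `task` on the worker thread, then hand the ack to the connection thread."""
    task()
    # The channel is not used directly from this thread. The ack is wrapped in a
    # callable and scheduled on the thread that owns the connection instead.
    ack = functools.partial(channel.basic_ack, delivery_tag=delivery_tag)
    connection.add_callback_threadsafe(ack)


def make_on_message(connection, task, workers):
    """Build a consumer callback that starts one worker thread per message.

    `task` is any callable taking the message body; `workers` is a list the
    caller can use to join the threads on shutdown (both are assumptions).
    """
    def on_message(channel, method, properties, body):
        worker = threading.Thread(
            target=do_work_and_ack,
            args=(connection, channel, method.delivery_tag,
                  functools.partial(task, body)),
        )
        worker.start()
        workers.append(worker)

    return on_message
```

With pika 1.x this could be registered as `channel.basic_consume(queue='task_queue', on_message_callback=make_on_message(connection, long_running_task, workers))`. Because the consumer callback returns immediately, `start_consuming()` keeps servicing heartbeats while the task runs, and `prefetch_count=1` still limits each consumer to one unacknowledged message.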

Note: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Answer 4

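Don't disable heartbeats. The best solution is to run the task in a separate thread and set prefetch_count to 1, so that the consumer only gets one unacknowledged message at a time, like this: channel.basic_qos(prefetch_count=1)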

Answer 5

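You can periodically call connection.process_data_events() inside long_running_task(connection). When it is called, this function sends heartbeats to the server and keeps the pika client from being closed.
Set the heartbeat value on your pika BlockingConnection to something greater than the interval between calls to connection.process_data_events().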
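A minimal sketch of that suggestion, assuming the long task can be split into smaller work items so there is a natural point at which to service the connection. The parameters `work_items`, `handle_item`, and `service_every` are hypothetical and not part of pika; only `connection.process_data_events()` is a BlockingConnection method.

```python
import time


def long_running_task(connection, work_items, handle_item, service_every=5.0):
    """Process items one by one, servicing the connection between items.

    `service_every` is the minimum number of seconds between calls to
    process_data_events(); it should stay well below the heartbeat timeout.
    """
    last_serviced = time.monotonic()
    for item in work_items:
        handle_item(item)
        if time.monotonic() - last_serviced >= service_every:
            # Gives pika a chance to read and write pending frames,
            # heartbeats included, while the task is still running.
            connection.process_data_events()
            last_serviced = time.monotonic()
```

This only helps if no single work item takes longer than the heartbeat timeout, since the connection is not serviced while `handle_item` is running.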

Answer 6

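You can also set up a new thread, process the message in that thread, and call .sleep on the connection while the thread is alive to avoid missing heartbeats. Here is a sample code block taken from @gmr on GitHub, along with a link to the issue for future reference.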

import re
import json
import threading

from google.cloud import bigquery
import pandas as pd
import pika
from unidecode import unidecode

def process_export(url, tablename):
    # Read from the `url` argument; the original referenced an undefined name, csvURL.
    df = pd.read_csv(url, encoding="utf-8")
    print("read in the csv")
    columns = list(df)
    ascii_only_name = [unidecode(name) for name in columns]
    cleaned_column_names = [re.sub("[^a-zA-Z0-9_ ]", "", name) for name in ascii_only_name]
    underscored_names = [name.replace(" ", "_") for name in cleaned_column_names]
    valid_gbq_tablename = "test." + tablename
    df.columns = underscored_names

    # try:
    df.to_gbq(valid_gbq_tablename, "some_project", if_exists="append", verbose=True, chunksize=10000)
    # print("Finished Exporting")
    # except Exception as error:
    #     print("unable to export due to: ")
    #     print(error)
    #     print()

def data_handler(channel, method, properties, body):
    body = json.loads(body)

    thread = threading.Thread(target=process_export, args=(body["csvURL"], body["tablename"]))
    thread.start()
    while thread.is_alive():  # Loop while the thread is processing
        channel._connection.sleep(1.0)
    print('Back from thread')
    channel.basic_ack(delivery_tag=method.delivery_tag)


def main():
    params = pika.ConnectionParameters(host='localhost', heartbeat=60)
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue="some_queue", durable=True)
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(data_handler, queue="some_queue")
    try:
        channel.start_consuming()
    except KeyboardInterrupt:
        channel.stop_consuming()
    channel.close()

if __name__ == '__main__':
    main()

Link: https://github.com/pika/pika/issues/930#issuecomment-360333837