We're trying to set up a basic directed queue system where a producer will generate several tasks and one or more consumers will grab a task at a time, process it, and acknowledge the message.
You can periodically call connection.process_data_events() in your long_running_task(connection); this function sends a heartbeat to the server when it is called and keeps the pika client from being closed. Set the heartbeat value greater than the interval between connection.process_data_events() calls in your pika BlockingConnection.
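A minimal sketch of that approach (the queue name, the work loop, and the heartbeat value are placeholders, and the pika 1.x basic_consume keyword signature is assumed):

import time
import pika


def long_running_task(connection):
    # Hypothetical work loop: do a chunk of the real job, then give pika a
    # chance to service the connection so heartbeat frames keep flowing.
    for _ in range(100):
        time.sleep(10)                    # stand-in for a chunk of real work
        connection.process_data_events()  # services heartbeats, keeps the connection alive


def make_handler(connection):
    def on_message(channel, method, properties, body):
        long_running_task(connection)
        channel.basic_ack(delivery_tag=method.delivery_tag)
    return on_message


params = pika.ConnectionParameters(host='localhost', heartbeat=600)  # larger than the gap between calls
connection = pika.BlockingConnection(params)
channel = connection.channel()
channel.queue_declare(queue='task_queue', durable=True)
channel.basic_consume(queue='task_queue', on_message_callback=make_handler(connection))
channel.start_consuming()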
Please don't disable heartbeats!
As of Pika 0.12.0, please use the technique described in this example code to run your long-running task on a separate thread and then acknowledge the message from that thread.
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
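That technique, modeled on pika's threaded-consumer example, looks roughly like the sketch below (the queue name and the fake workload are placeholders; BlockingConnection.add_callback_threadsafe is what makes acknowledging from another thread safe):

import threading
import time

import pika


def work_then_ack(connection, channel, delivery_tag):
    # Runs on a worker thread: do the slow work without blocking pika's I/O loop.
    time.sleep(300)  # stand-in for the long-running task
    # Channels are not thread-safe, so schedule the ack on the connection's own thread.
    connection.add_callback_threadsafe(
        lambda: channel.basic_ack(delivery_tag=delivery_tag))


def main():
    params = pika.ConnectionParameters(host='localhost')
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue='task_queue', durable=True)
    channel.basic_qos(prefetch_count=1)

    def on_message(channel, method, properties, body):
        thread = threading.Thread(
            target=work_then_ack,
            args=(connection, channel, method.delivery_tag),
            daemon=True)
        thread.start()

    channel.basic_consume(queue='task_queue', on_message_callback=on_message)
    channel.start_consuming()  # keeps servicing heartbeats while the worker runs


if __name__ == '__main__':
    main()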
Don't disable heartbeats. The best solution is to run the task in a separate thread and set prefetch_count to 1, so that the consumer only has one unacknowledged message at a time, using something like channel.basic_qos(prefetch_count=1).
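For example, a bare-bones consumer showing where the prefetch setting goes (the queue name and handler are placeholders; the pika 1.x basic_consume signature is assumed):

import pika


def handle_message(channel, method, properties, body):
    # Hand the work off to a separate thread here (as described above), then ack.
    channel.basic_ack(delivery_tag=method.delivery_tag)


connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
channel = connection.channel()
channel.queue_declare(queue='task_queue', durable=True)
channel.basic_qos(prefetch_count=1)  # at most one unacknowledged message per consumer
channel.basic_consume(queue='task_queue', on_message_callback=handle_message)
channel.start_consuming()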
I ran into the same problem you had. I arrived at my solution after testing the following cases:
Case one: I still got errors when a task ran for a very long time (more than 1800 seconds).
Case two: There were no errors on the client side, except for one problem: when the client crashed (my OS restarted on some faults), the TCP connection could still be seen in the RabbitMQ management plugin, which was confusing.
Case three: In this case, I could dynamically change the heartbeat on each individual client; in practice, I set a heartbeat on the machines that crashed frequently. Moreover, I could see offline machines through the RabbitMQ management plugin.
OS: CentOS x86_64
pika: 0.9.13
rabbitmq: 3.3.1
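A sketch of setting the heartbeat per client, as in case three (the host and the 30-second value are made-up; note that the pika 0.9.x release listed above named the parameter heartbeat_interval, while current pika calls it heartbeat):

import pika

# Hypothetical per-client setting: give machines that crash frequently a short
# heartbeat so the broker notices a dead connection and drops it promptly.
params = pika.ConnectionParameters(
    host='localhost',
    heartbeat=30,  # use heartbeat_interval on pika 0.9.x
)
connection = pika.BlockingConnection(params)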
For now, your best bet is to turn off heartbeats; this will keep RabbitMQ from closing the connection if you're blocking for too long. I am experimenting with pika's core connection management and I/O loop running in a background thread, but it's not stable enough to release.
In pika v1.1.0 this is ConnectionParameters(heartbeat=0)
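For example (a minimal sketch; the host is a placeholder):

import pika

# Request that heartbeats be disabled for this connection (pika 1.x keyword).
params = pika.ConnectionParameters(host='localhost', heartbeat=0)
connection = pika.BlockingConnection(params)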
You can also set up a new thread, process the message in that thread, and call .sleep on the connection while the thread is alive to prevent missed heartbeats. Here is a sample code block taken from @gmr on GitHub, and a link to the issue for future reference.
import re
import json
import threading

from google.cloud import bigquery
import pandas as pd
import pika
from unidecode import unidecode


def process_export(url, tablename):
    # Long-running work: download the CSV, clean the column names, and push to BigQuery.
    df = pd.read_csv(url, encoding="utf-8")
    print("read in the csv")
    columns = list(df)
    ascii_only_name = [unidecode(name) for name in columns]
    cleaned_column_names = [re.sub("[^a-zA-Z0-9_ ]", "", name) for name in ascii_only_name]
    underscored_names = [name.replace(" ", "_") for name in cleaned_column_names]
    valid_gbq_tablename = "test." + tablename
    df.columns = underscored_names
    try:
        df.to_gbq(valid_gbq_tablename, "some_project", if_exists="append", verbose=True, chunksize=10000)
        print("Finished Exporting")
    except Exception as error:
        print("unable to export due to: ")
        print(error)
        print()


def data_handler(channel, method, properties, body):
    body = json.loads(body)
    # Run the export on a worker thread so this callback doesn't block pika's I/O.
    thread = threading.Thread(target=process_export, args=(body["csvURL"], body["tablename"]))
    thread.start()
    while thread.is_alive():  # Loop while the thread is processing
        channel._connection.sleep(1.0)  # lets pika service heartbeats while we wait
    print('Back from thread')
    channel.basic_ack(delivery_tag=method.delivery_tag)


def main():
    params = pika.ConnectionParameters(host='localhost', heartbeat=60)
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue="some_queue", durable=True)
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(data_handler, queue="some_queue")  # pika < 1.0 signature (callback first)
    try:
        channel.start_consuming()
    except KeyboardInterrupt:
        channel.stop_consuming()
    channel.close()


if __name__ == '__main__':
    main()
The link: https://github.com/pika/pika/issues/930#issuecomment-360333837