I am testing the following scenario in Spring AMQP v1.4.2 and it fails to reconnect after a network disruption:
We are also facing this issue in our production environment as well, might be because of Rabbit nodes running as VMs on different ESX racks etc. The workaround that we found was to have our client app continuously keep trying to reconnect if it gets disconnected from cluster. Below are the settings we applied and it worked:
false
This attribute changes the global behavior of Spring AMQP when declaring queues fails for fatal errors (broker not available etc). By default container tries 3 times only (see log message showing "retries left=0").
Ref: http://docs.spring.io/spring-amqp/reference/htmlsingle/#containerAttributes
In addition, we added recovery-interval so that container recovers from non-fatal errors. However, the same config is also used when global behavior is to retry for fatal errors too (like missing queues).
....