Problem: getifaddrs can fail with ECONNREFUSED

getifaddrs() can fail transiently with ECONNREFUSED on Linux.
This has been observed with Linux 3.10 when multiple processes
call zmq::tcp_address_t::resolve_nic_name() simultaneously.

Before asserting in this case, make up to 10 attempts with exponential
backoff: after failed attempt i (counting from 0), sleep 1 msec * 2^i.

Fixes #2051

Author: Jim Garlick
Date: 2016-07-20 10:15:55 -07:00
Parent: e8116b1f1c
Commit: 565892f3cd

@@ -179,6 +179,7 @@ int zmq::tcp_address_t::resolve_nic_name (const char *nic_, bool ipv6_, bool is_
     && defined ZMQ_HAVE_IFADDRS)
 #include <ifaddrs.h>
+#include <unistd.h>
 // On these platforms, network interface name can be queried
 // using getifaddrs function.
@@ -186,7 +187,15 @@ int zmq::tcp_address_t::resolve_nic_name (const char *nic_, bool ipv6_, bool is_
 {
     // Get the addresses.
     ifaddrs *ifa = NULL;
-    const int rc = getifaddrs (&ifa);
+    int rc;
+    const int max_attempts = 10;
+    const int backoff_msec = 1;
+    for (int i = 0; i < max_attempts; i++) {
+        rc = getifaddrs (&ifa);
+        if (rc == 0 || (rc < 0 && errno != ECONNREFUSED))
+            break;
+        usleep ((backoff_msec << i) * 1000);
+    }
     errno_assert (rc == 0);
     zmq_assert (ifa != NULL);
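
The retry loop in this change can be pulled out into a standalone sketch to make the backoff schedule easier to check. This is an illustration only: retry_on_econnrefused and its sleep_msec parameter are hypothetical names, not part of libzmq, and the helper assumes the wrapped call returns 0 on success and sets errno on failure, as getifaddrs() does.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of libzmq). Retries `call` while it fails
// with ECONNREFUSED, sleeping base_msec * 2^i after failed attempt i
// (i counted from 0). Like the loop in the diff, it also sleeps after the
// final failed attempt before returning.
// `sleep_msec` is injectable so the schedule can be inspected without waiting.
inline int retry_on_econnrefused (
    const std::function<int ()> &call,
    int max_attempts = 10,
    int base_msec = 1,
    const std::function<void (int)> &sleep_msec = [] (int ms) {
        std::this_thread::sleep_for (std::chrono::milliseconds (ms));
    })
{
    int rc = -1; // returned unchanged if max_attempts <= 0
    for (int i = 0; i < max_attempts; i++) {
        rc = call ();
        // Stop on success, or on any error other than ECONNREFUSED.
        if (rc == 0 || (rc < 0 && errno != ECONNREFUSED))
            break;
        sleep_msec (base_msec << i);
    }
    return rc;
}
```

With the values used in the diff (10 attempts, 1 msec base), the delays are 1, 2, 4, ..., 512 msec, so a call that keeps returning ECONNREFUSED is given up on after roughly 1023 msec of sleeping. A caller would wrap the real function, for example retry_on_econnrefused ([&] { return getifaddrs (&ifa); }), and assert on the result afterwards.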