[Debian-ha-maintainers] Bug#701913: cluster-agents: /usr/lib/ocf/resource.d/heartbeat/SendArp starts send_arp once, doesn't monitor.

Tim Small tim at seoss.co.uk
Thu Feb 28 14:41:30 UTC 2013


Package: cluster-agents
Version: 1:1.0.3-3.1
Severity: normal

The behaviour of

/usr/lib/ocf/resource.d/heartbeat/SendArp

is such that when the $ARP_BACKGROUND option is given, the return code
of the send_arp binary is never checked on start.  The monitor action
unconditionally returns $OCF_SUCCESS.
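
To illustrate what a monitor that actually checks something might look
like, here is a minimal sketch. It is not the attached script: the
function name sendarp_monitor and the pidfile path are hypothetical, and
it assumes the start action records the PID of the backgrounded send_arp
process in that pidfile. The return codes are the standard OCF values
(0 for success, 7 for not running).

```shell
#!/bin/sh
# Illustrative sketch only; not the attached script.
# Standard OCF return codes.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical pidfile location; a real agent would derive its own path.
SENDARP_PIDFILE="${SENDARP_PIDFILE:-/tmp/sendarp-example.pid}"

sendarp_monitor() {
    # No pidfile: the resource was never started, or was stopped cleanly.
    [ -f "$SENDARP_PIDFILE" ] || return $OCF_NOT_RUNNING

    pid=$(cat "$SENDARP_PIDFILE" 2>/dev/null)

    # Empty pidfile: nothing to check, so treat it as not running.
    [ -n "$pid" ] || return $OCF_NOT_RUNNING

    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

With something along these lines, the cluster manager would notice when
the background send_arp process has died, rather than always being told
the resource is healthy.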

I've partly re-written the script (attached) to address these issues.

I've used start-stop-daemon for expediency (and correctness!), but I'm
not sure whether that makes the resulting script unsuitable for upstream.

The main motivation for these fixes was the following two cases:

1. Node A claims IP address, starts SendArp resource.
2. Node B claims same IP address (erroneously).
3. Node B error is fixed, and IP address removed from node B.

Result with old script: node A doesn't receive any IP traffic for the IP
address in question until router ARP cache times out (can be many hours
with some router brands e.g. some Ciscos).


1. Node A claims IP address.
2. Node A goes away (e.g. power failure etc.).
3. Network connectivity between Node B and router is temporarily
interrupted.
4. Node B claims IP address, starts SendArp resource.
5. Network connectivity between Node B and router is restored.

Result with old script: node B doesn't receive any IP traffic for the IP
address in question until router ARP cache times out (can be many hours
with some router brands).


In both cases improved behaviour is observed with the new script.

-- System Information:
Debian Release: 6.0.6
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


