[Babel-users] RTT-aware branch of Babel

Baptiste Jonglez baptiste.jonglez at ens-lyon.fr
Thu Jun 20 13:51:44 UTC 2013


Dear Babel users and hackers,

I'm pleased to announce the availability of my RTT-aware branch of
babeld, which I've developed as part of an internship mentored by
Juliusz and Matthieu. The code is available in the 'babeld-rtt' branch
of the main repository:

  git clone -b babeld-rtt git://git.wifi.pps.univ-paris-diderot.fr/babeld.git

The motivation behind this feature is the following: when running
babeld on an overlay network, using tunnels or VPNs, there is
currently no way to distinguish between "good" links and "bad" links.
Unless configured manually, all links are seen as wired and have a
fixed metric (96 by default).  For instance, babeld may use multiple
long-range links to reach geographically close nodes, if the hop count
is lower.

A natural metric in such an environment is the RTT: while relatively
cheap to measure, it makes it easy to spot "long" virtual links over
the Internet, which are likely to be less reliable and more expensive.

My implementation computes the RTT between neighbours by piggybacking
timestamps on Hello and IHU messages.  This achieves a very low
overhead on control traffic: about 5 additional bytes per message
(Hello or IHU), which is negligible in most cases.  Also, the time at
which messages are sent on the wire is not modified.  More
specifically, we are able to compute the RTT even if a neighbour
doesn't reply immediately, using an algorithm due to Dave Mills, which
was used in the HELLO protocol and also served as a basis for NTP.
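As a rough illustration of the timestamp exchange described above,
here is a minimal Python sketch.  It is not taken from babeld: the
function and parameter names are hypothetical, and which message
carries which timestamp is an assumption.  The point is that each node
only subtracts timestamps taken on its own clock, so the two clocks
need not be synchronised, and the time the neighbour waits before
replying is removed from the result.

```python
def rtt_from_timestamps(hello_sent, hello_received, reply_sent, reply_received):
    """Estimate the RTT to a neighbour from four timestamps (same unit).

    hello_sent     -- our clock, when we sent the Hello
    hello_received -- neighbour's clock, when it received that Hello
    reply_sent     -- neighbour's clock, when it sent its reply
    reply_received -- our clock, when we received the reply
    (all four names are illustrative, not babeld identifiers)
    """
    elapsed_locally = reply_received - hello_sent      # measured on our clock
    held_by_neighbour = reply_sent - hello_received    # measured on its clock
    return elapsed_locally - held_by_neighbour


# The neighbour waits 3000 ms before replying; its clock is unrelated to ours.
print(rtt_from_timestamps(1000, 50400, 53400, 4020))
```

With these made-up values, 3020 ms elapse locally, of which 3000 ms
were spent waiting at the neighbour, leaving an RTT of 20 ms.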

The RTT samples are smoothed using an exponential moving average,
and then used to compute a metric, which is added to the regular
metric associated with a neighbour.
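For readers unfamiliar with this kind of smoothing, a minimal sketch
follows.  The weight 'alpha' and the function name are assumptions
chosen for illustration; they are not the values or names used in
babeld.

```python
def smooth_rtt(previous, sample, alpha=0.125):
    """Exponential moving average of RTT samples.

    previous -- the current smoothed RTT, or None before the first sample
    sample   -- the new RTT sample
    alpha    -- weight given to the new sample (assumed value)
    """
    if previous is None:
        return sample
    return (1 - alpha) * previous + alpha * sample


smoothed = None
for sample in [20, 20, 100]:     # a single 100 ms spike after two 20 ms samples
    smoothed = smooth_rtt(smoothed, sample)
print(smoothed)
```

With this weight, the isolated 100 ms spike only moves the smoothed
value from 20 to 30 ms, which is what keeps the resulting metric from
jumping around.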


From a theoretical point of view, there is a known issue with dynamic
metrics, such as the RTT: stability.  Since a dynamic metric depends
on the real-time condition of the network, it can lead to persistent
routing oscillations.  For instance, if a link is highly loaded, the
RTT will increase, which will eventually lead installed routes to move
away from the link.  But the RTT will then decrease, and the best
routes will use the link again, and so on.

However, our early tests show that the smoothing (both on RTT samples
and then on the metric itself, introduced in the 1.4.x branch) is
enough to prevent such oscillations from being a real issue.  I'm
currently working on more thorough tests and theory, along with
more sophisticated techniques to further avoid stability issues.


How to use in practice
======================

The additional metric is computed from the RTT using three parameters:
'rtt-min', 'rtt-max' and 'max-rtt-penalty'.  The metric is piecewise
affine between 'rtt-min' and 'rtt-max'.  Below 'rtt-min', the value is 0;
above 'rtt-max', the value is 'max-rtt-penalty'.

An example configuration might look like the following:

  default rtt-min 10            # in milliseconds
  default rtt-max 120           # in milliseconds
  default max-rtt-penalty 150   # Maximum additional metric

which would, for each interface, increase the metric by up to 150,
depending on the RTT to a neighbour.  For an RTT lower than 10 ms,
nothing is added; for an RTT higher than 120 ms, the metric associated
with the neighbour is increased by 150.
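To make the shape of the function concrete, here is a small Python
sketch of the penalty with the example values above as defaults.  It
reads "piecewise affine" as plain linear interpolation between the two
thresholds; the function name is hypothetical, and babeld's actual
units and rounding may differ.

```python
def rtt_penalty(rtt_ms, rtt_min=10, rtt_max=120, max_penalty=150):
    """Additional metric for a neighbour, given its smoothed RTT in ms."""
    if rtt_ms <= rtt_min:
        return 0
    if rtt_ms >= rtt_max:
        return max_penalty
    # Linear interpolation between (rtt_min, 0) and (rtt_max, max_penalty).
    return max_penalty * (rtt_ms - rtt_min) / (rtt_max - rtt_min)


for rtt in (5, 65, 200):
    print(rtt, rtt_penalty(rtt))
```

An RTT of 65 ms sits halfway between the two thresholds and therefore
gets half of the maximum penalty, that is 75.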


A maximum penalty of 150 is reasonable in a wired environment, as it
will favour lower-RTT paths in the general case; in extreme cases,
babeld will even accept an additional hop in order to avoid a high-RTT
link, since an extra wired hop costs 96 by default, which is less than
the maximum penalty.
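The extra-hop case can be checked with a little arithmetic.  The
sketch below assumes that the cost of a path is the sum of its link
costs, that every link has the default wired metric of 96, and that
the penalty is 150 at most; the helper name is made up for the
example.

```python
WIRED_COST = 96      # default metric of a wired link
MAX_PENALTY = 150    # 'max-rtt-penalty' from the example configuration


def path_cost(penalties):
    """Sum of link costs along a path, given each link's RTT penalty."""
    return sum(WIRED_COST + penalty for penalty in penalties)


direct_high_rtt = path_cost([MAX_PENALTY])   # one hop over a high-RTT link
two_low_rtt_hops = path_cost([0, 0])         # two hops over low-RTT links
print(direct_high_rtt, two_low_rtt_hops)
```

Under these assumptions the direct link costs 246 and the two-hop path
192, so the longer path wins.  With a maximum penalty below 96, the
direct link would always be preferred over a two-hop detour.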


Please experiment and report any issue, or interesting real-world
behaviour, that you might encounter.

Baptiste