[Babel-users] About split horizon in Babel, and scaling the number of neighbours

Fri Jun 20 14:28:05 UTC 2014

> As far as I know, split horizon is usually done by routing protocols in
> order to prevent the most common cases of routing loops.
>
> Babel also implements split horizon, but the goal doesn't seem to be loop
> prevention — probably because it has more sophisticated ways of preventing
> routing loops.  Instead, split horizon is an optimisation performed when
> an interface is known to be transitive.

Exactly.  Consider the following topology:

     Big network  .....  GW  ----  Small network

where the link between the big network and the gateway is very slow.
Without split horizon, the whole routing table of the big network is
echoed back over the slow link.  With split horizon, only the routing
table of the small network is sent over the slow link.

Note that this does not apply to split horizon with poison reverse.
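
A minimal sketch of the filtering in Python, to make the saving concrete.
This is not babeld's code: the Route type, the routes_to_announce function
and the interface names slow0 and lan0 are made up for illustration, and
metrics are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_on: str  # interface on which the selected route was learned


def routes_to_announce(table, out_interface, split_horizon):
    """Return the prefixes to include in an update sent on out_interface."""
    if not split_horizon:
        return [r.prefix for r in table]
    # Split horizon: skip routes that were learned on the very interface
    # the update is about to be sent on.
    return [r.prefix for r in table if r.learned_on != out_interface]


# Hypothetical gateway table: 200 routes learned over the slow link,
# 3 routes learned from the small network.
gw_table = [Route(f"10.0.{i}.0/24", "slow0") for i in range(200)]
gw_table += [Route(f"192.168.{i}.0/24", "lan0") for i in range(3)]

without_sh = routes_to_announce(gw_table, "slow0", split_horizon=False)
with_sh = routes_to_announce(gw_table, "slow0", split_horizon=True)
print(len(without_sh), len(with_sh))
```

With these made-up numbers, the update sent over the slow link shrinks from
203 routes to 3.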

> Thus my question: how well does Babel perform when having many neighbours
> on the same interface, with and without split horizon?

Not very well.  If there are n Babel speakers on a single LAN segment, the
routing table will be sent n times, once by each speaker, on every
periodic update.  While you can work around this issue with some smart
filtering, this is an intrinsic limitation of distance vector routing.

(OSPF has a complex algorithm where a single "designated router" (DR) is
elected on each network segment, and only the DR performs flooding.  Of
course, then the DR is a single point of failure, so OSPF has another
complex algorithm where a "backup designated router" (BDR) is elected,
monitors the DR, and becomes caliph in place of the caliph when the DR
fails.)

> - control traffic as a function of the number of neighbouring nodes on the
> same interface (with and without split horizon).  I would expect it to
> be linear with split horizon and quadratic without split horizon, does
> that sound right?

No, it's linear in both cases: number of neighbours * size of the FIB.
Recall that only installed routes are announced.

Split horizon saves up to 1/2 the traffic in extreme cases such as the one
described above.  Typical savings are much less.
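
As a rough worked example of both claims, here is a small Python sketch.
It assumes every speaker announces its whole table once per update period,
and, for the extreme case, a link with exactly two speakers where every
route is learned from one side or the other.  The function names and the
figures are illustrative only.

```python
def segment_load(speakers, fib_size):
    """Route announcements on one segment per update period
    (no split horizon): one full table per speaker."""
    return speakers * fib_size


def two_speaker_link(big, small, split_horizon):
    """Route announcements crossing a link between two speakers.

    One side originates `big` routes, the other `small` routes, and
    both speakers end up with big + small routes in their tables.
    """
    if not split_horizon:
        # Each side sends its whole table, including what it just
        # learned from the other side.
        return 2 * (big + small)
    # Each side only sends what it did not learn over this link.
    return big + small


# Linear: doubling the number of speakers doubles the load.
print(segment_load(10, 1000), segment_load(20, 1000))

# Extreme case: split horizon halves the traffic on the link.
no_sh = two_speaker_link(1000, 10, split_horizon=False)
sh = two_speaker_link(1000, 10, split_horizon=True)
print(no_sh, sh, 1 - sh / no_sh)
```

In a meshed network most routes are learned over some other interface and
are announced either way, which is why the typical saving is much smaller.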

> - hard limit on the number of neighbours, with and without split horizon
> (what would be the bottleneck?  CPU usage?  network throughput?
> possible implementation limitations?)

With wifi in ad-hoc mode, the link layer is going to collapse around 100
neighbours.

Over ethernet, we apply enough jitter to Babel messages to avoid most
collisions, so I'd expect the link layer to survive well into the
thousands.  Babel is able to fit roughly 60 routes in one Ethernet frame
(assuming a 50-50 split between IPv4 and IPv6), so with 1000 routes and
100 neighbours you'll see around 1500 packets every 20 seconds.
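
The back-of-the-envelope arithmetic can be reproduced with a few lines of
Python.  This sketch assumes that "100 neighbours" means 100 speakers on
the segment, that each sends its full table once per 20-second period, and
that frames are filled with exactly 60 routes; with those round figures it
lands in the same ballpark as the estimate above.

```python
import math

ROUTES_PER_FRAME = 60   # rough figure quoted above
UPDATE_PERIOD = 20      # seconds, as in the estimate above


def packets_per_update(routes, speakers, routes_per_frame=ROUTES_PER_FRAME):
    """Frames needed for every speaker to announce its full table once."""
    per_speaker = math.ceil(routes / routes_per_frame)
    return per_speaker * speakers


total = packets_per_update(routes=1000, speakers=100)
print(total, total / UPDATE_PERIOD)  # frames per period, frames per second
```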

You'll see issues with packet bursts when a route disappears and nodes
start sending requests.  You'll probably also have issues with the
neighbour table being a flat list.  Both should be fixable.

> On the other hand, what could go wrong if split horizon is enabled?
> Will some routes fail to propagate?

As mentioned above, split horizon will not gain you much in a richly
meshed topology.  Split horizon on a non-transitive network will cause
some routes to be missing; however, except for the resulting blackholes,
nothing bad should happen.

Consider the following topology:

      B
      |
  A --+-- C

A and B can communicate, B and C can communicate, but C doesn't hear A.
With split horizon, the routes announced by A are not reannounced by B, so
C never learns the routes to A.
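
The effect can be reproduced with a toy simulation in Python.  This is a
deliberately simplified model, not Babel: there are no metrics, every node
has a single shared interface, and the learned_routes function and the
"hears" map are invented for the example.

```python
def learned_routes(hears, split_horizon, rounds=4):
    """Flood routes between nodes that can hear each other.

    hears maps each node to the set of nodes that receive its updates.
    Returns, for each node, a dict: destination -> neighbour the route
    was learned from (None for the node's own route).
    """
    tables = {node: {node: None} for node in hears}
    for _ in range(rounds):
        for sender in hears:
            for dest, via in list(tables[sender].items()):
                if split_horizon and via is not None:
                    # Learned on the shared interface: not reannounced.
                    continue
                for receiver in hears[sender]:
                    tables[receiver].setdefault(dest, sender)
    return tables


# A and B hear each other, B and C hear each other, A and C do not.
hears = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

print(sorted(learned_routes(hears, split_horizon=True)["C"]))
print(sorted(learned_routes(hears, split_horizon=False)["C"]))
```

With split horizon C only ever learns B; without it, B relays A's route and
C learns A as well.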

If your topology is rich enough, then C probably has an alternate route to
A, so the missing route is not an issue.  However, I would recommend
disabling split horizon.  If you run into excessive routing traffic,
either splitting your links manually or adding manual filtering rules is
likely to yield better results.

-- Juliusz