[Babel-users] Losing seqnos at boot [was: Babel questions]

Juliusz Chroboczek Juliusz.Chroboczek at pps.jussieu.fr
Wed Oct 20 15:42:01 UTC 2010


>> How many is ``several minutes''?  More than 200 seconds?  Can you
>> reproduce that after decreasing SOURCE_GC_TIME in source.h?  (Note that
>> decreasing this value below 30 seconds or so will most likely cause
>> routing loops.)
>
> You're exactly right. I just saw this again a couple hours ago, saw
> 204 seconds of missed pings. A-ha! So... this would be because that
> node's sequence number is reset between boots, while its router-id
> remains unchanged.

Looks like it.  The node will only be unreachable if the new, randomly
chosen seqno is smaller than the previous one, which happens half the
time on average.
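
To illustrate why a smaller seqno makes the rebooted node unreachable,
here is a minimal sketch of the feasibility test, assuming the rule from
the Babel draft (newer seqno, or equal seqno and strictly smaller
metric).  The function names are made up for illustration and are not
necessarily babeld's.  Seqnos are compared modulo 2^16, which is why a
random new seqno lands behind the old one about half the time.

```c
#include <stdint.h>

/* Compare two 16-bit seqnos modulo 2^16.
   Returns <0 if a is older than b, 0 if equal, >0 if a is newer. */
static int
seqno_cmp(uint16_t a, uint16_t b)
{
    if(a == b)
        return 0;
    /* a is newer if it is less than half the seqno space ahead of b. */
    return ((uint16_t)(a - b) < 0x8000) ? 1 : -1;
}

/* An update (seqno, metric) is feasible with respect to the stored
   feasibility distance (fd_seqno, fd_metric) if it carries a newer
   seqno, or the same seqno and a strictly smaller metric. */
static int
update_feasible(uint16_t seqno, uint16_t metric,
                uint16_t fd_seqno, uint16_t fd_metric)
{
    int c = seqno_cmp(seqno, fd_seqno);
    return c > 0 || (c == 0 && metric < fd_metric);
}
```

If the neighbours remember seqno 1000 for a node and it comes back with
seqno 200, update_feasible(200, 0, 1000, 96) is false, so its updates
are rejected until the stale source entry is garbage-collected.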

> Unfortunately, I don't have a classically writeable
> filesystem. I could serialize the babel-state file into nonvolatile
> storage myself every shutdown, but that would require mixing runtime
> with configuration data. I'm not sure how many write cycles I get, and
> rewriting the board's config all the time generally makes me nervous.

Nah, don't bother.

> A booting node's neighbors likely know its last-boot sequence
> number -- would it make sense to solicit it from one of them,

Nah, don't bother.

One workaround would be to reduce SOURCE_GC_TIME to 45 seconds; I'm fairly
positive that this will not cause any routing loops in practice.
I don't recommend changing it to below twice the update time.
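
As a rough illustration of that rule of thumb, the change could look
like the fragment below.  It assumes SOURCE_GC_TIME is a plain integer
number of seconds defined in source.h; ASSUMED_UPDATE_INTERVAL is
a made-up name and value standing in for whatever update interval your
nodes actually use.

```c
/* Sketch only: in babeld this macro lives in source.h. */
#define SOURCE_GC_TIME 45   /* seconds; lowered from the stock value */

/* Hypothetical stand-in for your network's update interval, in seconds. */
#define ASSUMED_UPDATE_INTERVAL 20

/* Refuse to build if the GC time drops below twice the update time. */
_Static_assert(SOURCE_GC_TIME >= 2 * ASSUMED_UPDATE_INTERVAL,
               "SOURCE_GC_TIME should be at least twice the update time");
```

The compile-time check (C11) is optional; it only guards against
lowering the value too far by accident.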

Another possibility would be to draw a new router-id at every boot,
which will cause the stale loop-avoidance data in the network not to
apply to the new incarnation.  (You can do that by commenting out lines
363 through 374 in babeld.c.)  The main flaw of this approach is that it
will make it more difficult to administer and debug your network, since
nodes will be using unstable node-ids rather than stable ids derived
from a MAC address.
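
For illustration, drawing a fresh router-id at boot could look roughly
like the sketch below.  This is not babeld's code: random_router_id is
a hypothetical helper, and it assumes a system that provides
/dev/urandom.  The 8-octet size matches the router-id carried in the
Router-Id message.

```c
#include <stdio.h>

/* Hypothetical helper: fill id[8] with random bytes from /dev/urandom.
   Returns 0 on success, -1 on failure. */
static int
random_router_id(unsigned char id[8])
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t n;

    if(f == NULL)
        return -1;
    n = fread(id, 1, 8, f);
    fclose(f);
    return n == 8 ? 0 : -1;
}
```

Since the id differs on every boot, source entries that other nodes
hold for the previous incarnation simply never match the new one.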

Finally, if the above workarounds are not satisfactory for you, we could
consider extending the protocol with a flag that says ``I've rebooted
recently, please flush any loop-avoidance data you may have for me.''
The Reserved field of the Router-Id message is just the right place to
put such a flag (Section 4.4.7 in the Babel draft).
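
To make the idea concrete, here is a sketch of how such a flag might be
encoded.  The layout (Type 6, Length, a 16-bit Reserved field, then the
8-octet router-id) is assumed from the draft's description of the
Router-Id message; the flag name and bit position are invented here and
are not part of the protocol.

```c
#include <stdint.h>
#include <string.h>

#define MESSAGE_ROUTER_ID 6             /* Router-Id TLV type */
#define ROUTER_ID_FLAG_REBOOTED 0x8000  /* hypothetical flag bit */

/* Write a Router-Id TLV into buf (at least 12 bytes).  If rebooted is
   nonzero, set the hypothetical flag in the otherwise-zero Reserved
   field.  Returns the number of bytes written. */
static int
format_router_id(unsigned char *buf, const unsigned char id[8], int rebooted)
{
    uint16_t reserved = rebooted ? ROUTER_ID_FLAG_REBOOTED : 0;

    buf[0] = MESSAGE_ROUTER_ID;
    buf[1] = 10;                        /* body length: 2 + 8 */
    buf[2] = reserved >> 8;             /* network byte order */
    buf[3] = reserved & 0xFF;
    memcpy(buf + 4, id, 8);
    return 12;
}
```

A receiver that sees the flag would drop its source entries for that
router-id; one that does not know the flag should just ignore the
Reserved field, so the extension would stay backwards compatible.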

                                        Juliusz


