[Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates

Juliusz Chroboczek jch at pps.univ-paris-diderot.fr
Fri Jun 10 18:47:34 UTC 2016


Dear Kirill,

Thank you very much for the detailed analysis.

If I read you correctly, this looks like a kernel bug: incorrect
invalidation of the route cache.  While we have seen some similar bugs in
earlier kernel versions, they were not triggered by something that
simple -- you needed to do some non-trivial rule manipulation in order to
trigger them.

What is more -- I believe that babeld is using the same procedure as
Quagga and Bird.  Do you understand why Quagga and Bird are not seeing the
same issues ?

While I have no objection to switching to a different API for manipulating
routes, I'd like to first make sure that we understand what's going on here.

Oh -- and are you running a stock kernel, or one locally patched?  Can you
reproduce the issue on a pristine, recent kernel?

Thanks again for your help,

-- Juliusz







More information about the Babel-users mailing list