[Babel-users] [BUG] Route "deadlocks" under load due to non-atomic kernel route updates

Dave Taht dave.taht at gmail.com
Sat Jun 11 18:26:48 UTC 2016


On Fri, Jun 10, 2016 at 11:47 AM, Juliusz Chroboczek
<jch at pps.univ-paris-diderot.fr> wrote:
> Dear Kirill,
>
> Thank you very much for the detailed analysis.
>
> If I read you correctly, this looks like a kernel bug: incorrect
> invalidation of the route cache.  While we have seen some similar bugs in
> earlier kernel versions, they were not triggered by something that
> simple -- you needed to do some non-trivial rule manipulation in order to
> trigger them.
>
> What is more -- I believe that babeld is using the same procedure as
> Quagga and Bird.  Do you understand why Quagga and Bird are not seeing the
> same issues?

Quagga, at least, switched to atomic updates some time ago, I think.

http://patchwork.quagga.net/patch/1234/

>
> While I have no objection to switching to a different API for manipulating
> routes, I'd like to first make sure that we understand what's going on here.

I strongly approve of atomic updates and fixing what, if anything,
that breaks...
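
To make the difference concrete, here is a toy sketch. It is not babeld's
or Quagga's code, and ToyRouteTable, change_non_atomic, change_atomic and
run are hypothetical names made up for illustration. It assumes the
non-atomic update is a delete followed by an add, and models a packet
lookup that lands between the two steps. On Linux the atomic form
corresponds to a single RTM_NEWROUTE request carrying NLM_F_REPLACE, which
is what "ip route replace" sends.

```python
class ToyRouteTable:
    """Toy stand-in for a kernel routing table (hypothetical, not a real API)."""

    def __init__(self):
        self._routes = {}  # prefix -> next hop

    def add(self, prefix, nexthop):
        if prefix in self._routes:
            raise KeyError(f"route to {prefix} already exists")
        self._routes[prefix] = nexthop

    def delete(self, prefix):
        del self._routes[prefix]

    def replace(self, prefix, nexthop):
        # Single step: the prefix always maps to either the old or the new hop.
        self._routes[prefix] = nexthop

    def lookup(self, prefix):
        # None models "no route": the packet would be rejected as unreachable.
        return self._routes.get(prefix)


def change_non_atomic(table, prefix, nexthop, observe):
    """Delete, then add; observe() runs in the gap, like a packet mid-update."""
    table.delete(prefix)
    observe()
    table.add(prefix, nexthop)


def change_atomic(table, prefix, nexthop, observe):
    """Replace in one step; observe() can only run before or after it."""
    table.replace(prefix, nexthop)
    observe()


def run(change):
    """Apply one next-hop change and report what a concurrent lookup saw."""
    prefix = "10.0.0.0/24"
    table = ToyRouteTable()
    table.add(prefix, "fe80::1")
    seen = []
    change(table, prefix, "fe80::2", lambda: seen.append(table.lookup(prefix)))
    return seen, table.lookup(prefix)


if __name__ == "__main__":
    print(run(change_non_atomic))  # ([None], 'fe80::2')
    print(run(change_atomic))      # (['fe80::2'], 'fe80::2')
```

Both variants end with the same route installed; only the delete-then-add
variant exposes a moment where the lookup finds nothing.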

I have seen oddities with unreachable p2p routes for years now. I've
suspected a variety of causes - notably getting an ICMP route
unreachable before babel could make the switch - but have never tracked
it down. Some of the work I'm doing now could be leveraged to try to
make it happen more often, but a few more pieces on top of this

https://www.mail-archive.com/netdev@vger.kernel.org/msg114172.html

need to land before I can propagate all the right pieces to the testbed.

>
> Oh -- and are you running a stock kernel, or one locally patched?  Can you
> reproduce the issue on a pristine, recent kernel?
>
> Thanks again for your help,
>
> -- Juliusz



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org


