[Babel-users] Babel for use in a dense disconnected mesh network

Jason Palmer Jason.Palmer at trutest.co.nz
Fri Jun 21 01:43:51 UTC 2013


Comments inline. Apologies for the wall of text!
 
Regards,
 
JASON

> > The system consists of a number of portable devices, one acting as a
> > webserver/database server, one or more hand-held smart phones or
> > tablets, and 10-100 measurement devices
> 
> 10 or 100?
> 
> With 10 devices, ad-hoc mode will work fine.  With 100 devices all
> within range, the amount of beaconing going on in ad-hoc could very
> well make your mesh unstable.
> 
That will vary depending on the installation. Some installations may only have 10 measurement devices, some may have 100. Typical numbers will be 20-40 devices per installation - we just need to be able to accommodate the larger installations. 


> Have you checked whether the handheld devices support ad-hoc?
> 
Yes, I have looked into this, and it seems that they don't support it particularly well. I have some ideas skimmed from the Byzantium distribution that would allow two vlans to be created on the Wi-Fi network adapters and bridged together: one to act as an access point for the hand-helds, and the other for the mesh. I would only set up a few of the measurement devices as soft-APs; otherwise it would get rather confusing for the hand-held devices attempting to roam across the APs, and it would increase traffic and interference.


> > - The measurement devices would be placed 1-2 metres apart, either
> >   in one or two rows or in a loop.
> 
> I'm curious about the application.
> 
Secret squirrel stuff. The application is for portable test equipment to go into milking sheds, one measurement device per set of cups. The devices are set up for a test (pm test and am test), and then moved to the next farm for testing. Our current devices use proprietary 434 MHz or 2.4 GHz wireless comms, though it is only point to point (and involves a lot of walking and manual key presses on the devices to communicate between them). I am looking to add some flexibility to the system by having remote control and real-time monitoring from a hand-held device. 802.11 on Linux appeals more than a proprietary comms stack on a proprietary embedded OS. Babel seems like a solid routing protocol to overlay on a mesh network. I have tried batman-adv and had some instabilities: nodes were addressable at layer 2 but not layer 3, and vice versa. Still experimenting with infrastructure mode as a fallback.


> > At this point I am not sure whether an infrastructure mode or a mesh
> > network would be better suited, so I have been experimenting with
> > babel. The main issue with infrastructure mode would be that range
> > could be an issue with only one access point.
> 
> I would use a backbone+access topology, using two frequencies.  More
> about this below.
> 

> > I have had some success with a limited number of devices (9-12) in
> > close proximity, but recent experiments with more devices (24-30) have
> > caused an instability and now I have managed to break the mesh more
> > frequently than I have in the past. I have a feeling that I have not
> > configured babel or the wlan adapter appropriately for this number of
> > devices, or that I may be running into the 802.11 ad-hoc limitations.
> 
> Probably the latter.
> 
> > "Warning: bucket full, dropping packet to wlan0"
> > "Warning: bucket full, dropping unicast packet to "Interface wlan0 has
> > no link-local address"
> > "setsocketopt(IPV6_LEAVE_GROUP): Cannot assign requested address"
> > "send(unicast): Cannot assign requested address"
> > "send: Cannot assign requested address"
> 
> Are you running network manager or something similar?  These messages
> indicate that your interfaces are going up/down at a tremendous rate.
> Babel works best when the interface remains up even when connectivity
> is lost.
> 
I have a systemd service manually bringing up the interface, and then starting babeld. It is among the first system scripts that I have written, so it may be part of the problem. I have netcfg installed, but as far as I am aware I am not using it; for this networking experimentation it has caused more headaches than it has resolved. Are the ExecStop lines there only to control what happens when the service is stopped? I assume they do not run at the completion of the ExecStart lines.

-------------------------------------------------------
[Unit]
Description=Mesh Network
Before=network.target
Before=babeld.service
BindsTo=sys-subsystem-net-devices-wlan0.device
After=sys-subsystem-net-devices-wlan0.device
After=sys-devices-platform-bcm2708_usb-usb1-1\x2d1-1\x2d1.3-1\x2d1.3:1.0-net-wlan0.device

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/ip link set wlan0 down
ExecStart=/usr/sbin/iwconfig wlan0 mode ad-hoc channel 1 essid "babelmesh" 
ExecStart=/sbin/ip addr add 172.16.0.11 dev wlan0
ExecStart=/sbin/ip link set wlan0 up
ExecStart=/usr/bin/sleep 3

ExecStop=/sbin/ip addr flush dev wlan0
ExecStop=/sbin/ip link set dev wlan0 down

[Install]
WantedBy=network.target
-------------------------------------------------------
[Unit]
Description=babeld routing daemon
Requires=mesh.service
BindTo=mesh.service
After=mesh.service
After=network.target

[Service]
ExecStart=/usr/bin/babeld -D

[Install]
WantedBy=multi-user.target
-------------------------------------------------------
Babel is run only against the wlan0 interface.


> > Are there any obvious settings that are required for babel to work in
> > dense mesh networks? Would increasing the hello interval reduce the
> > load on the network (though babel already appears to have a relatively
> > low overhead).
> 
> I doubt it.  If killing network manager doesn't help, I'd suggest
> switching to a different topology.  Let S be the server, M be the
> measurement nodes, and R two-radio routers.
> 
>                R---M
>               /
>              /
>             S--R---M
>              \  \
>               \  M
>                \
>                 R---R---M
> 
> Grab two or three two-radio nodes, either a nice router such as the
> WNDR3800, or simply old laptops with two radios.  Pick two distinct
> channels, call them the backbone frequency and the access frequency.
> 
> Set your server to ad-hoc mode, backbone frequency.  Set your two-radio
> routers to ad-hoc mode, backbone frequency on one radio, and AP mode,
> access frequency on the other radio.  Set your measurement nodes to
> station mode, access frequency.
> 
> The server and the routers are communicating over ad-hoc at the
> backbone frequency, while the measurement devices have associated with
> one of the routers at the access frequency.
> 
> Now run Babel on all your nodes, and you're done.  Whenever a
> measurement device switches to a different AP, Babel will reconverge
> within two or three hello intervals.  Should an area of your building
> have poor coverage, you just pop in an extra two-radio router.  Should
> your backbone become congested, you add a third radio at a different
> frequency and run Babel-Z.
> 
> -- Juliusz
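
The backbone+access layout described above can be summarised as a small table of roles. This is only a minimal sketch, assuming Python; the channel numbers (1 and 11), the role names and the 4 s hello interval are illustrative assumptions rather than values from this thread, which only asks for two distinct channels and says reconvergence takes "two or three hello intervals".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Radio:
    mode: str     # "ad-hoc", "ap" or "station"
    channel: int  # 2.4 GHz channel number


# Hypothetical channel choices: any two distinct channels would do.
BACKBONE_CHANNEL = 1
ACCESS_CHANNEL = 11

# One entry per node type in the suggested topology (S, R, M).
ROLES = {
    "server": [Radio("ad-hoc", BACKBONE_CHANNEL)],
    "router": [Radio("ad-hoc", BACKBONE_CHANNEL), Radio("ap", ACCESS_CHANNEL)],
    "measurement": [Radio("station", ACCESS_CHANNEL)],
}


def reconvergence_window(hello_interval_s):
    """Return (low, high) in seconds for 'two or three hello intervals'."""
    return (2 * hello_interval_s, 3 * hello_interval_s)


if __name__ == "__main__":
    for role, radios in ROLES.items():
        summary = ", ".join(f"{r.mode} on channel {r.channel}" for r in radios)
        print(f"{role}: {summary}")
    # 4.0 s is an assumed hello interval, used here only for illustration.
    low, high = reconvergence_window(4.0)
    print(f"reconvergence after an AP switch: roughly {low:.0f}-{high:.0f} s")
```

Only the routers carry two radios; the server and the measurement nodes each need one, which is why extra coverage can be added by dropping in another router.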

Thank you, that may be a better solution. I was hoping to avoid a solution which required multiple APs or two-radio nodes, but looking at the number of devices that may need to be connected, it would be wiser to design a system to nominally work with 50 devices and have "expansion packs" for the installations that have a larger number of devices or poor signal propagation.
I know hanging that many devices off one or two APs is against convention, but the data requirements are relatively low: status updates every 2 s (<100 bytes/device), animal records every 3 to 5 min (<1500 bytes/device/animal), and end-of-session records (<5000 bytes/device/session). The main concern is issuing control signals from the hand-held devices over the top of the base traffic without suffering huge latencies.
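
To put rough numbers on those data requirements, here is a minimal back-of-the-envelope sketch in Python. It uses the upper bounds quoted above and assumes the worst case of one animal record every 3 minutes per device. The function names are made up for illustration, and the figures cover application payload only: IP/802.11 framing, retransmissions and Babel's own traffic are not included.

```python
def offered_load_bps(devices, status_bytes=100, status_period_s=2.0,
                     record_bytes=1500, record_period_s=180.0):
    """Upper bound on steady-state application payload, in bits per second."""
    per_device_bytes_per_s = (status_bytes / status_period_s
                              + record_bytes / record_period_s)
    return devices * per_device_bytes_per_s * 8


def session_burst_bytes(devices, session_bytes=5000):
    """Total end-of-session payload if every device reports at once."""
    return devices * session_bytes


if __name__ == "__main__":
    for n in (10, 50, 100):
        load_kbit = offered_load_bps(n) / 1000
        burst_kb = session_burst_bytes(n) / 1000
        print(f"{n:3d} devices: {load_kbit:.1f} kbit/s steady, "
              f"{burst_kb:.0f} kB end-of-session burst")
```

Even at 100 devices the steady payload comes to under 50 kbit/s, so raw throughput is unlikely to be the limit; contention between many small frames is the more plausible source of latency for control signals.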


