[Nut-upsdev] [PATCH] disable nonblocking mode on serial port

Jim Paris jim at jtan.com
Fri Sep 26 15:37:19 UTC 2008


> >I got a new Cyberpower 1500AVR UPS and nut wouldn't work.  It failed
> >to detect the UPS and a strace showed that all writes to the serial
> >port were failing with -EAGAIN.  The attached patch disabled
> >nonblocking mode on the serial port and now it works fine.
> 
> What operating system are you using? Which NUT driver? If you're using  
> the 'cyberpower' driver, switch to the 'powerpanel' driver and you can  
> probably skip the remainder of this message.

This is Linux 2.6.24 on amd64 with 'cyberpower'.  The serial
connection is through a USB HID->COM converter driven by cypress_m8.
I expect that any difference in behavior between my system and others
is due to the serial driver, although I do not believe that the driver
behavior is at fault here.

I can try powerpanel next time I'm around that machine (I'm out of
town at the moment and the UPS is not connected).


> The thing you should ask yourself is why the connection is timing out  
> in the first place. Running the driver in debug mode might reveal what  
> is going on. If it is because we are sending it too much data and the  
> UPS can't keep up with the data flow, this should be fixed. The same  
> may happen if synchronization is lost, which is a frequent problem  
> with drivers that don't properly flush in- and outputs.

It's not a timeout, and I already know what's going on.  It's the very
first _write_ to the device failing immediately with -EAGAIN.  I
verified this with strace.  This is independent of whether the device
is connected or not and has nothing to do with whether the device is
responding.

> Using the sledgehammer approach of opening the device in blocking mode  
> and just sit it out until it can handle whatever we throw at it is  
> definitely not what we want and certainly not something that we will  
> integrate into NUT without ruling out the problems already mentioned.

Sure, blocking mode has its own problems when trying to read from a
device that isn't sending anything back.  But this is a write over
rs232, so it's independent of the device behavior.


> >Is there a reason the port was put in non-blocking mode?
> 
> Of course. :-)
> 
> >There's no
> >code I could find to deal with the inevitable -EAGAIN that you'd
> >receive on writes.  Maybe the delays introduced by ser_send_pace are
> >why it might work for some people?
> 
> Opening a port in non-blocking mode allows us to deal with errors that  
> we may encounter (in a timely manner). If you open a device in  
> blocking mode, it may take a relatively long time before we notice  
> that something is wrong for some error conditions. The timeout depends  
> on the system settings, but 15 minutes is not uncommon.

Are you talking about during the open() itself?  I'm aware of the
reasons for using O_NONBLOCK during an open, which is why my patch
retains O_NONBLOCK there and switches it off later.


> While this will be dealt with while reading data (through select()  
> calls, so these never risk blocking), this is not the case when  
> writing to the device. If the UPS happens to be on battery at that  
> time, it will probably be empty by the time the call times out and the  
> system will be shutdown hard.

But writes to a rs232 serial port will complete in finite time
regardless of whether the device is paying attention to them or not.
What timeout do you mean?


> Your patch makes little sense, since just removing O_NONBLOCK from the  
> flags in the call to open() should be sufficient already.

Using O_NONBLOCK during open is different than using O_NONBLOCK for a
write.  I'm not trying to change the open() behavior.


> However, if  
> we should ever decide that in order to use the Cyberpower 1500AVR UPS  
> we need to open it in blocking mode, clearing the O_NONBLOCK flag  
> should be done by the driver. By doing it here, you risk breaking all  
> other drivers that use this library.

It's wrong to open a device with O_NONBLOCK and then write to it and
expect it to always succeed.  One needs to at deal with EAGAIN.  If we
can agree on that we can figure out a solution that we're both happy
with :)

One solution would be to get rid of O_NONBLOCK as I did.  Another
would be to put the write() in a loop checking for EAGAIN.  A third is
to put a select() before the write that checks for writability.  I can
whip up a patch to do any of these, they would all fix my issue.

-jim



More information about the Nut-upsdev mailing list