[Nut-upsdev] Getting 'Data stale' error with bcmxcp_usb for a PowerWare 5115 on OSX

Charles Lepple clepple at gmail.com
Mon Mar 22 13:15:14 UTC 2010


On Mar 22, 2010, at 8:53 AM, Charlie Garrison wrote:

> Good evening,
>
> On 22/03/10 at 7:55 AM -0400, Charles Lepple <clepple at gmail.com>  
> wrote:
>
>> On Mon, Mar 22, 2010 at 7:12 AM, Charlie Garrison <garrison at zeta.org.au>
>> wrote:
>>
>> For the portion of the log quoted above, I admit I am not familiar
>> enough with this driver to say whether it has failed at this point.
>> Maybe someone from Eaton can comment on that.
>
> Is that a hint that I should be contacting them, or do we know they  
> have devs on this list?

Arnaud works for Eaton, but I don't specifically know who has access  
to PowerWare devices for testing.

>>> 454959.198654   Warning: excessive comm failures, limiting error reporting
>>> 454959.198672   Communications with UPS lost: Error executing command
>>
>> The preceding two lines are generated from nutusb_comm_fail() in
>> bcmxcp_usb.c. What do you get from 'grep "Communications with UPS
>> lost" name-of-logfile' ?
>
> $ grep "Communications with UPS lost" /var/log/nut-driver.log
> 10578.415713    Communications with UPS lost: get_answer: checksum error!
> 11180.245313    Communications with UPS lost: get_answer: checksum error!
> 38876.625676    Communications with UPS lost: get_answer: checksum error!
> 46843.166524    Communications with UPS lost: get_answer: checksum error!
> 47954.281275    Communications with UPS lost: get_answer: checksum error!
> 52548.325592    Communications with UPS lost: get_answer: checksum error!
> 55334.000485    Communications with UPS lost: get_answer: checksum error!
> 69408.920215    Communications with UPS lost: get_answer: checksum error!
> 73109.467953    Communications with UPS lost: get_answer: checksum error!
> 81290.330831    Communications with UPS lost: get_answer: checksum error!
> 205872.380255   Communications with UPS lost: get_answer: checksum error!
> 281913.375308   Communications with UPS lost: get_answer: checksum error!
> 394369.162435   Communications with UPS lost: get_answer: checksum error!
> 454959.198672   Communications with UPS lost: Error executing command
> 454979.199631   Communications with UPS lost: Error executing command
>
> Note, the above possibly includes entries from the previous
> kill/restart, not just the last one. Although I'm pretty sure I rotated
> the log file last time, so that should be from one run of the driver.

So the checksum errors were occurring while other aspects of the  
driver seemed to be working properly?

>>> Does anyone have suggestions on how I can get the driver working on my
>>> system? IOW, any ideas on how it can recover without me having to
>>> dis/connect the USB cable and kill/restart the driver?
>>
>> So if I remember from your previous emails, killing and restarting the
>> driver without reconnecting the USB cable does /not/ solve the
>> problem?
>
> That is correct. And this time I was testing whether *only*
> dis/connecting the cable would allow the driver to recover. It did allow
> the loop processing to continue, but it didn't properly recover. I
> had to kill the driver daemon as well.
>
>> That sounds like an issue with the firmware on the UPS
>> itself. There is a function to reset the device that we could try, but
>> I think we may need to add some more debugging to figure out what
>> error codes should trigger this:
>>
>> http://libusb.sourceforge.net/doc/function.usbreset.html
>
> Sorry, I'm missing the relevance here. Are you suggesting a one-time  
> reset? Or should I add usb_reset somewhere in bcmxcp_usb.c? (My C  
> skills aren't good enough to add that command and then open the  
> device again.)

I guess that's more-or-less a reminder to myself for later: once we can
tell the occasional benign errors (maybe including the checksum errors
mentioned above) apart from the non-recoverable ones, we can decide
which of them should trigger a reset.
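
A minimal sketch of how that benign-versus-fatal distinction might be
encoded, purely for discussion. Everything here is an assumption: the
enum names, next_action(), and both thresholds are hypothetical and do
not exist in bcmxcp_usb.c. The real driver would first have to map the
actual usb_interrupt_read() return codes onto categories like these, and
only the ACT_RESET_DEVICE case would lead to something like usb_reset()
followed by reopening the device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result categories; not taken from bcmxcp_usb.c. */
enum comm_result { COMM_OK, COMM_CHECKSUM_ERROR, COMM_EXEC_ERROR };

/* What the caller should do next. */
enum comm_action { ACT_CONTINUE, ACT_RETRY, ACT_RESET_DEVICE };

/* Tracks how many failures have occurred in a row. */
struct comm_state {
    int consecutive_failures;
};

/* Assumed thresholds: checksum errors are tolerated longer because the
 * log suggests they can occur while the driver otherwise keeps working. */
#define CHECKSUM_ERROR_LIMIT 5
#define EXEC_ERROR_LIMIT     2

/* Decide the next step after one communication attempt. */
enum comm_action next_action(struct comm_state *st, enum comm_result r)
{
    int limit;

    assert(st != NULL);

    if (r == COMM_OK) {
        /* Any success ends the current failure streak. */
        st->consecutive_failures = 0;
        return ACT_CONTINUE;
    }

    st->consecutive_failures++;
    limit = (r == COMM_CHECKSUM_ERROR) ? CHECKSUM_ERROR_LIMIT
                                       : EXEC_ERROR_LIMIT;

    if (st->consecutive_failures >= limit) {
        /* Streak is long enough to treat as non-recoverable. */
        st->consecutive_failures = 0;
        return ACT_RESET_DEVICE;
    }

    /* Isolated failure: treat as benign and try again. */
    return ACT_RETRY;
}
```

With these assumed limits, a lone checksum error followed by a good
reply never triggers a reset, while two "Error executing command"
failures in a row would.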

>> Did you have a debug statement around lines 150-160 in your code? I
>> would have thought we would see the error codes from
>> usb_interrupt_read().
>
> I did, but I removed them to reduce verbosity. I'll uncomment &  
> recompile and run again.

At some point, I would like to reorganize the debug levels so that  
it's just a matter of passing fewer "-D" options to the driver. Right  
now, the debug levels seem a bit haphazard to me (but maybe that's  
just because I am not as familiar with this driver).
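
As a rough illustration of what level-based debugging could look like,
here is a small sketch. It is not the driver's current code: the
debug_level variable and dbg_printf() are hypothetical names, and the
idea that each "-D" raises the level by one is the assumption being
illustrated.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global: assumed to be incremented once per "-D" option. */
static int debug_level = 0;

/* Print a message only if its level is enabled.
 * Returns 1 if the message was printed, 0 if it was suppressed. */
int dbg_printf(int level, const char *fmt, ...)
{
    va_list ap;

    if (level > debug_level)
        return 0;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return 1;
}
```

Under this scheme, summary messages such as a comm failure could sit at
level 1 and raw USB return codes at level 3, so running with fewer "-D"
options would hide the noisy output without recompiling.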



