[Nut-upsdev] Spurious messages on start

Mon Nov 20 10:17:04 CET 2006

Arjen de Korte wrote:
> 
> Peter Selinger wrote:
> 
> > I don't understand the issues that are involved in the start-up
> > sequence very well. Could you summarize the current mechanism?
> 
> Currently, upsd connects to the driver(s), sends the DUMPALL command and
> waits for it to complete (up to MAXINIT seconds). For UPSes that are slow
> or have a large amount of data to dump, this may take a considerable amount
> of time. During that time, upsmon will not be running, as it should not be
> started before upsd forks to the background. Monitoring will therefore not
> be available for any UPS, not even for the ones that have already
> completed the DUMPALL command. I don't think that is a good idea.
> 
> > From what I understand, the driver, upsd, and upsmon all fork to the
> > background, but only after checking that they have started up without
> > errors. The purpose is so that start-up scripts can check their return
> > value and react to or report any errors as necessary.
> 
> This is good. Permissions problems for instance, or missing drivers can be
> reported much better when we still have the console to write errors to.
> 
> > Wouldn't sending a fake INIT status shortcut the ability to detect any
> > errors during driver startup?
> 
> Not really. We do check if the driver socket is available and readable.
> This should cover most configuration problems. 

A readable socket doesn't guarantee that there is a UPS attached, or
that the UPS is the correct one for the driver. It may not be
necessary to wait for a dump of all the data, but one should perhaps
wait until the driver has talked to the UPS. This would catch
configuration problems for users who have set up NUT incorrectly,
while the driver is still in the foreground. Is that what the "fake"
INIT command would do?

I don't quite see what this has to do with upsd, though. Why is upsd
supposed to wait for something from the driver, rather than simply
starting upsd after the driver forks? It seems to me that any startup
error handling should be done in upsdrvctl and/or the driver itself,
rather than upsd. 

What does a typical startup sequence look like? Is it something like
this? And if so, where do MAXINIT, MAXAGE, and DUMPALL come into
play?

time 0: upsdrvctl start
time 1: driver1 tries to connect to UPS
time 2: driver1 has connected to UPS, driver forks to background
time 3: driver2 tries to connect to UPS
time 4: driver2 has connected to UPS, driver forks to background
time 5: upsdrvctl returns
time 6: upsd -c start
time 7: upsd tries to connect to driver1 and driver2
time 8: successfully connected
time 9: upsd forks to background
time 10: upsmon -c start
time 11: upsmon tries to connect to upsd
time 12: successfully connected
time 13: upsmon forks to background
time 14 (perhaps): upsd reads status from driver
time 15 (perhaps): upsmon reads status from upsd
...

Errors could occur at time 2, 4, 8, or 12, in each case while the
process concerned is still in the foreground.
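
To make the "check in the foreground, then fork" idea concrete, here is
a minimal sketch in C. It is not taken from the NUT sources: the names
start_daemon(), run_startup(), check_ok() and check_fail() are
hypothetical, and the checks are stand-ins for whatever a real program
would verify (a reachable UPS, a readable socket, and so on).

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical startup checks: 0 means success, nonzero means failure. */
static int check_ok(void)   { return 0; }
static int check_fail(void) { return -1; }

/*
 * Runs in the launcher process. The check is done first, while the
 * process is still in the foreground, so a failure shows up in the
 * exit status seen by the caller. Only then does it fork.
 */
static void start_daemon(int (*startup_check)(void))
{
    pid_t pid;

    if (startup_check() != 0)
        _exit(EXIT_FAILURE);    /* reported while still in the foreground */

    pid = fork();
    if (pid < 0)
        _exit(EXIT_FAILURE);    /* could not go to the background */

    if (pid == 0) {
        /* Background child: a real daemon would enter its main loop here. */
        _exit(EXIT_SUCCESS);
    }

    _exit(EXIT_SUCCESS);        /* foreground parent reports success */
}

/*
 * Plays the role of the start-up script: launches the program and
 * returns its exit status, or -1 if it could not be determined.
 */
static int run_startup(int (*startup_check)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        start_daemon(startup_check);    /* never returns */

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this arrangement, run_startup(check_fail) yields a failure status
that a script can act on, while run_startup(check_ok) returns success
as soon as the background process has been forked.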

> I'm not proposing to change
> this, I just don't want to wait for the completion of the DUMPALL command.
> If we can send data to the driver and the socket is readable, I'm fairly
> confident that we can also receive the replies back, so there is no need
> to wait for that.
> 
> > Perhaps the driver should stay in the foreground until it has read valid
> > data from the UPS (with a timeout), so that upsd does not have to guess
> > how long this will take?
> 
> This is the current situation. And while making the timeout (user)
> configurable improves this a little, I think it would be much better to
> background immediately after sending the DUMPALL command.
> 
> > But perhaps this would slow down startup too much. Perhaps it's
> > sufficient to wait until *some* data has been read from the UPS. Is that
> > what is currently happening?
> 
> Currently, the DUMPALL command must have completely finished for each
> connected UPS. Therefore, if one driver is slow, it will delay all others too.
> 
> Yesterday evening I found that all that is needed is adding
> 
> 	state_setinfo(&ups->inforoot, "ups.status", "INIT");
> 
> to (almost) the end of sstate_connect() and we're done. No more need for
> MAXINIT (or initial_dump_wait() for that matter), the timeout for
> communication with the driver will be handled by MAXAGE. Which is much
> more intuitive for users as well, rather than having a separate timeout
> parameter for startup.
> 
> I'll commit the patch to the trunk later today.
> 
> Best regards, Arjen
>
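
If I read the proposal correctly, the idea can be modelled roughly as
below. This is only a sketch under my own assumptions, not the actual
upsd code: struct ups_model, model_connect(), model_update(),
model_is_stale() and MODEL_MAXAGE are made-up names, and the 15 second
value is just an illustrative setting. On connect the status is seeded
with "INIT", real data overwrites it whenever it arrives, and a single
age limit decides when the data counts as stale.

```c
#include <string.h>
#include <time.h>

#define MODEL_MAXAGE 15     /* seconds; illustrative value only */

/* Hypothetical, much simplified view of what the server keeps per UPS. */
struct ups_model {
    char   status[32];      /* last known ups.status */
    time_t last_heard;      /* when the driver was last heard from */
};

/* On connect: seed a placeholder status instead of waiting for a dump. */
static void model_connect(struct ups_model *ups, time_t now)
{
    strcpy(ups->status, "INIT");
    ups->last_heard = now;
}

/* Whenever the driver sends a status, replace the placeholder. */
static void model_update(struct ups_model *ups, const char *status, time_t now)
{
    strncpy(ups->status, status, sizeof(ups->status) - 1);
    ups->status[sizeof(ups->status) - 1] = '\0';
    ups->last_heard = now;
}

/* One age limit covers both startup and normal operation. */
static int model_is_stale(const struct ups_model *ups, time_t now)
{
    return difftime(now, ups->last_heard) > MODEL_MAXAGE;
}
```

In this model a driver that never answers after the connect is treated
exactly like one that stops answering later on: once the age limit has
passed, its data is stale, and no separate startup timeout is needed.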