This software is designed around a layered scheme with drivers, a data server, and clients. These layers communicate with text-based protocols for easier maintenance and diagnostics.
The NUT driver(s) and the data server run on the same system, which has some communications media connected to the power device (e.g. a serial or USB link, a local IPMI interface, or a network interface in the engineering VLAN). While each driver program talks the device vendor-defined protocol over such media, it also talks the local NUT Socket protocol to the local data server.
Clients connect to the data server using the common NUT Network protocol
over TCP, whether on localhost
or remotely.
One most notable client is upsmon
, which is responsible for shutdown
of the system it runs on, when the power situation becomes critical.
Design-wise, it normally splits into two daemons to minimize security
risks: one remains with root privileges and is only used to start
the configured SHUTDOWNCMD
when the time comes, and the other half
drops privileges and does the bulk of work.
There were requests for enhancement to also implement connectivity using the common NUT Network protocol using local sockets, so clients running on the same machine as the data server would not have to always use the TCP/IP stack; however this is currently not implemented.
DRIVERS talk to the EQUIPMENT and receive updates. For most hardware this is polled (DRIVER asks EQUIPMENT about a variable), but forced updates are also possible. The exact method is not important, as it is abstracted by the driver.
The core of all DRIVERS maintains internal storage for every variable that is known along with the auxiliary data for those variables. It sends updates to this data to any process which connects to the Unix domain socket.
The DRIVERS will also provide a full atomic copy of their internal knowledge upon receiving the "DUMPALL" command on the socket. The dump is in the same format as updates, and is followed by "DUMPDONE". When "DUMPDONE" has been received, the view is complete.
The SERVER will connect to the socket of each DRIVER and will request a dump at that time. It retains this data in local storage for later use. It continues to listen on the socket for additional updates.
This protocol is documented in sock-protocol.txt.
The SERVER’s internal storage maintains a complete copy of the data which is in the DRIVER, so it is capable of answering any request immediately. When a request for data arrives from a CLIENT, the SERVER looks through the internal storage for that UPS and returns the requested data if it is available.
The format for requests from the CLIENT is documented in protocol.txt.
"Instant commands" is the term given to a set of actions that result in
something happening to the UPS. Some of the common ones are
test.battery.start
to initiate a battery test and test.panel.start
to
test the front panel of the UPS.
They are passed to the SERVER from a CLIENT using an authenticated network connection. The SERVER first checks to make sure that the instant command is valid for the DRIVER. If it’s supported, a message is sent via a socket to the DRIVER containing the command and any auxiliary information.
At this point, there is no confirmation to the SERVER of the command’s
execution. This is (still) planned for a future release. This has been
delayed since returning a response involves some potentially interesting
timing issues. Remember that upsd
services clients in a round-robin
fashion, so all queries must be lightweight and speedy.
FIXME: Wasn’t "TRACKING" mechanism for "INSTCMD/SET VAR" introduced to address just this? See https://github.com/networkupstools/nut/pull/659
Some variables in the DRIVER or EQUIPMENT can be changed, and carry the FLAG_RW flag. Upon receiving a SET command from the CLIENT, the SERVER first verifies that it is valid for that DRIVER in terms of writability and data type. If those checks pass, it then sends the SET command through the socket, much like the instant command design.
The DRIVER is expected to commit the value to the EQUIPMENT and update its internal representation of that variable.
Like the instant commands, there is currently no acknowledgement of the command’s completion from the DRIVER. This, too, is planned for a future release.
FIXME: Wasn’t "TRACKING" mechanism for "INSTCMD/SET VAR" introduced to address just this? See https://github.com/networkupstools/nut/pull/659
Here’s the path a piece of data might take through this architecture. The event is a UPS going on battery, and the final result is a pager delivering the alpha message to the admin.
ups.status
variable as
OB. This update gets pushed out to any listeners via the sockets.
upsd
sees activity on the socket, reads it, parses it, and
commits the new data to its local version of the status variable.
upsmon
does a routine poll of SERVER for ups.status
and
gets OB
.
upsmon
then invokes its NOTIFYCMD
which is upssched
.
upssched
starts up a daemon to handle a timer which will expire about
30 seconds into the future.
upssched
calls the CMDSCRIPT
which is upssched-cmd
.
upssched-cmd
parses the args and calls sendmail
.
This scenario requires some configuration, obviously:
upsd
has a valid UPS entry in ups.conf for this UPS.
[myups] driver = nutupsdrv port = /dev/ttySx
upsd
has a valid user for upsmon
in upsd.users file.
[monuser] password = somepass upsmon primary
upsmon
is set to monitor this UPS with this user in upsmon.conf file.
MONITOR myups@localhost 1 monuser somepass primary
upsmon
is set to EXEC
the NOTIFYCMD
for the ONBATT
condition in
upsmon.conf file.
NOTIFYFLAG ONBATT EXEC
upsmon
calls upssched
as the NOTIFYCMD
in upsmon.conf file.
NOTIFYCMD /path/to/upssched
upssched
has a 30 second timer for ONBATT
in upssched.conf file.
AT ONBATT * START-TIMER upsonbatt 30
upssched
calls upssched-cmd
as the CMDSCRIPT
in upssched.conf.
CMDSCRIPT /path/to/upssched-cmd
upssched-cmd
knows what to do with upsonbatt
keyword as its first
argument (a quick case..esac
construct, see the examples)
The oldest versions of this software (1998) had no separation between the driver and the network server, and only supported the latest APC Smart-UPS hardware as a result. The network protocol used brittle binary structs. This had numerous bad implications for compatibility and portability.
After the driver and server were separated, data was shared through the
state file concept. Status was written into a static array (the "info
array") by drivers, and that array was stored on disk. The upsd
would
periodically read that file into a local copy of that array.
Shared memory mode was added a bit later, and that removed some of the lag from the status updates. Unfortunately, it didn’t have any locking originally, and the possibility for corruption due to races existed.
mmap()
support was added at some point after that, and became the
default. The drivers and upsd
would mmap()
the file into memory and
read or write from it. Locking was done using the state file as the
token, so contention problems were avoided. This method was relatively
quick, but it involved at least 3 copies of the data (driver, disk/mmap,
server) and a whole lot of locking and unlocking. It could occasionally
delay the driver or server when waiting for a lock.
In April 2003, the entire state management subsystem was removed and replaced with a single local socket. The drivers listen for connections and push updates asynchronously to any listeners. They also recognize a few commands. Drivers also dampen the flow of updates, and only push them out when something actually changes.
As a result, upsd
no longer has to poll any files on the disk, and can
just select()
all of its file descriptors (fds) and wait for activity.
When one of them is active, it reads the fd and parses the results.
Updates from the hardware now get to upsd
about as fast as they possibly
can.
Drivers used to call setinfo()
to change the local array, and then would
call writeinfo()
to push the array onto the disk, or into the
mmap/shared memory space. This introduced a lag since many drivers poll
quite a few variables during an update.
By 2013 much of the work on NUT for Windows branch (based off the NUT v2.6.5 release) was completed, adding named pipes as the equivalent to local sockets as well as to cross-program signals. This work got a face-lift and was merged into the main code base about a decade later, in 2022.
In April 2023 (eventually released with NUT v2.8.1 and enhanced/fixed in later releases), a new use-case was added: interactions of two instances of a driver program over the local socket, as an alternative to signals for the already-running driver to reload configuration, exit and make way for a new instance of the driver daemon, or command the UPS to kill power without the overhead of a new connection made by such new instance.