Introduction
The log files for each module are in the folder %SiteController%/log. Log files are rotated according to your configuration, so whenever you run into a problem, collect the log files as soon as possible. The internal service name is useful when troubleshooting: each module writes to a text file named after its internal service name, with the extension ".log".
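Because rotation eventually discards old files, it can help to snapshot a module's log together with its rotated backups. The following is a minimal sketch, not part of the product: the function name `collect_logs` and its parameters are hypothetical, and it assumes rotated backups carry a numeric suffix such as `.log.1`, which is how the standard Python rotating handler names them.

```python
import glob
import os
import tarfile


def collect_logs(log_dir, service_name, archive_path):
    """Pack <service_name>.log and its rotated backups into a .tar.gz archive.

    log_dir is the resolved %SiteController%/log folder (passed in by the caller).
    Returns the list of files that were added to the archive.
    """
    # Matches MyService.log as well as MyService.log.1, MyService.log.2, ...
    pattern = os.path.join(log_dir, service_name + ".log*")
    files = sorted(glob.glob(pattern))
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in files:
            # Store only the file name, not the full directory structure.
            archive.add(path, arcname=os.path.basename(path))
    return files
```

Calling `collect_logs(log_dir, "MyService", "MyService-logs.tar.gz")` right after a problem occurs preserves the current file and all backups that still exist.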
Framework
- Our implementation is based on the standard Python logging framework: https://docs.python.org/2/library/logging.html
- Loading of the configuration and related setup is consolidated in azeti_logging.py
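The interface of azeti_logging.py is not shown here, so the following is only a sketch of how such a loader could be built on the standard library's `logging.config.fileConfig`. The function name `setup_logging`, the use of the root logger, and the format string are assumptions; only the handler section name, `simpleFormatter`, and the `logfilename` placeholder are taken from this article.

```python
import io
import logging
import logging.config

# Hypothetical, reduced configuration in the fileConfig format.
CONFIG_TEXT = """
[loggers]
keys = root

[handlers]
keys = SizeRotatingFileHandler

[formatters]
keys = simpleFormatter

[logger_root]
level = INFO
handlers = SizeRotatingFileHandler

[handler_SizeRotatingFileHandler]
formatter = simpleFormatter
class = handlers.RotatingFileHandler
args = (['%(logfilename)s', 'a', 10490000, 5])

[formatter_simpleFormatter]
format = %(asctime)s %(levelname)s %(name)s: %(message)s
"""


def setup_logging(config_text, logfilename):
    """Configure logging from config text and return the root logger.

    The 'logfilename' default fills the %(logfilename)s placeholder in args.
    """
    logging.config.fileConfig(
        io.StringIO(config_text),
        defaults={"logfilename": logfilename},
        disable_existing_loggers=False,
    )
    return logging.getLogger()
```

A module would call `setup_logging(CONFIG_TEXT, "/path/to/MyService.log")` once at startup and then log through `logging.getLogger(...)` as usual.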
Configuration
- See SiteController.cfg (formerly in azeti_logging.cfg)
Standard settings:
- logger=azeti_file
- level=INFO by default
- SizeRotatingFileHandler
- Rotation after 10490000 bytes, with 5 backup files kept
Other useful handlers:
Rotating by file size: https://docs.python.org/2/library/logging.handlers.html#logging.handlers.RotatingFileHandler
See the official documentation for examples of file-size-based rotation.
Changing the size rotation parameters
The defaults for the file-size-dependent parameters in SiteController.cfg are:

    [handler_SizeRotatingFileHandler]
    # Size based rotation of log files
    formatter = simpleFormatter
    class = handlers.RotatingFileHandler
    # Rotate if a file exceeds 10490000 bytes (about 10 MiB) and keep 5 old files
    args = (['%(logfilename)s', 'a', 10490000, 5])

Default parameters do not need to be present in a SiteController.cfg file. Default values are used whenever SiteController.cfg has no corresponding entry.
To change the number of backup files kept per log file from the default (5) to 15, add the following snippet to the specific SiteController.cfg:

    [handler_SizeRotatingFileHandler]
    # Rotate if a file exceeds 10490000 bytes (about 10 MiB) and keep 15 old files
    args = (['%(logfilename)s', 'a', 10490000, 15])

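To make the meaning of the four `args` values explicit, the sketch below builds the same kind of handler directly in Python. The positional values correspond to the `filename`, `mode`, `maxBytes` and `backupCount` parameters of `logging.handlers.RotatingFileHandler`. The helper name `make_size_rotating_handler` and the format string are hypothetical.

```python
import logging
import logging.handlers


def make_size_rotating_handler(logfilename, max_bytes=10490000, backup_count=5):
    """Programmatic equivalent of args=([logfilename, 'a', max_bytes, backup_count])."""
    handler = logging.handlers.RotatingFileHandler(
        logfilename,
        mode="a",                  # append to an existing log file
        maxBytes=max_bytes,        # roll over before the file would exceed this size
        backupCount=backup_count,  # keep logfilename.1 .. logfilename.<backup_count>
    )
    # Hypothetical format; the real simpleFormatter definition is not shown here.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```

With `backup_count=15`, the handler keeps up to 15 old files next to the active one and deletes the oldest on each rollover.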
Levels
Level | Numeric value | Intended purpose |
---|---|---|
CRITICAL | 50 | Used to log events so grave that the process cannot continue running (e.g. a runtime error that is not specifically handled). |
ERROR | 40 | Used to log events that indicate an erroneous condition which the process can ignore in order to continue processing (e.g. ignoring a contradictory configuration entry). |
WARNING | 30 | Used to log events that indicate erroneous or probably unwanted conditions for which the process takes workaround measures to continue processing (e.g. a file is missing and the process uses default values instead). |
INFO | 20 | Used to log seldom (!) informative events like starting and stopping of processes or configuration changes. |
DEBUG | 10 | Used to log anything else, especially any detailed log messages to debug a certain condition. |
NOTSET | 0 | This is not a real log level but causes the logger to determine the actual log level by looking somewhere else (refer to the Python documentation). |
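The numeric values in the table are those defined by the Python standard library. The short sketch below reads them from the `logging` module and shows what NOTSET means in practice: a logger without its own level takes its effective level from the nearest ancestor that has one. The logger names are hypothetical.

```python
import logging

# Numeric values as defined by the standard library.
LEVEL_VALUES = {
    name: getattr(logging, name)
    for name in ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")
}

parent = logging.getLogger("level_demo")       # hypothetical logger name
parent.setLevel(logging.INFO)                  # INFO is the documented default level
child = logging.getLogger("level_demo.child")  # own level stays NOTSET

# A NOTSET logger resolves its level by walking up to the parent logger.
effective = child.getEffectiveLevel()
debug_enabled = child.isEnabledFor(logging.DEBUG)      # below INFO: dropped
warning_enabled = child.isEnabledFor(logging.WARNING)  # above INFO: emitted
```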
General rules:
No log level (with the sole exception of DEBUG) is allowed to log messages periodically!
Example:
- If a problem state persists, e.g. a connection keeps failing to recover, log it once as WARNING and, after a while, a second time as ERROR. As long as the state has not changed in between, stop repeating that error message in the logs.
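One way to follow this rule is to track each problem state and log only on transitions. The sketch below is an illustration under assumptions, not an existing helper: the class name `StateLogger`, the 300 second escalation delay, and logging the recovery as INFO are all hypothetical choices.

```python
import logging
import time


class StateLogger(object):
    """Logs a persistent problem once as WARNING, once as ERROR, then stays silent."""

    def __init__(self, logger, escalate_after=300.0, clock=time.time):
        self._logger = logger
        self._escalate_after = escalate_after  # seconds until WARNING becomes ERROR
        self._clock = clock
        self._first_seen = {}    # problem key -> time the problem was first reported
        self._escalated = set()  # problem keys already logged as ERROR

    def report_failure(self, key, message):
        now = self._clock()
        if key not in self._first_seen:
            # New problem state: log it once as WARNING.
            self._first_seen[key] = now
            self._logger.warning(message)
        elif (key not in self._escalated
              and now - self._first_seen[key] >= self._escalate_after):
            # Problem has persisted: log it one more time, as ERROR.
            self._escalated.add(key)
            self._logger.error(message)
        # Otherwise the state is unchanged and nothing is logged.

    def report_recovery(self, key, message):
        if key in self._first_seen:
            # State changed back to normal: reset so a later failure is logged again.
            del self._first_seen[key]
            self._escalated.discard(key)
            self._logger.info(message)
```

A polling loop can call `report_failure("broker", "...")` on every failed attempt without flooding the log, because only the first call and the escalation produce output.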
INFO
Every module should log when it starts and when it finishes, using a common structure (meaning we agree on a sentence like "Module ... starting").
Examples:
- Start/Stop of module
- A readable overview of configuration changes (that is, not just a JSON dump)
- Other things that may be interesting for a user to know but aren't warnings or errors
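A common start/stop structure is easiest to keep consistent when it lives in one place. The following sketch wraps a module's main routine in a context manager. The name `module_lifecycle` and the exact wording of the stop message are assumptions; only the "Module ... starting" pattern comes from the text above.

```python
import contextlib
import logging


@contextlib.contextmanager
def module_lifecycle(logger, module_name):
    """Log a uniform INFO message when a module starts and when it finishes."""
    logger.info("Module %s starting", module_name)
    try:
        yield
    finally:
        # Runs on normal exit and when an exception propagates out of the block.
        logger.info("Module %s stopped", module_name)
```

Usage would look like `with module_lifecycle(logging.getLogger("MyService"), "MyService"): run()`, where `run` stands for the module's own entry point.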
WARNING
A circumstance that is not expected but does not affect the functionality itself. The result may not be what the user expects, or the condition is likely just temporary.
Examples:
- No mosquitto is found at the configured mosquitto path, but searching the path finds another one, and the module uses that location instead
- There is no such configured serial port, but since the system has only one of them, the daemon takes it instead.
- A host address is suddenly not reachable anymore (assuming a temporary network issue).
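The first example can be sketched as a lookup with a fallback: use the configured path if it is valid, otherwise search the path and log a WARNING about the substitution. The function `resolve_executable` and its parameters are hypothetical. Logging CRITICAL when nothing is found follows the "No mosquitto broker found" example in the CRITICAL section.

```python
import logging
import os
import shutil


def resolve_executable(configured_path, name, logger, search_path=None):
    """Return a usable path for an executable, falling back to a path search.

    search_path is passed to shutil.which; None means the PATH environment variable.
    """
    if os.path.isfile(configured_path) and os.access(configured_path, os.X_OK):
        return configured_path  # configuration is fine, nothing to log
    fallback = shutil.which(name, path=search_path)
    if fallback is not None:
        # Unexpected, but the module can work around it: WARNING.
        logger.warning("%s not found at configured path %s, using %s instead",
                       name, configured_path, fallback)
        return fallback
    # No workaround available: the module cannot do its job.
    logger.critical("%s not found at %s or on the search path", name, configured_path)
    return None
```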
ERROR
Any circumstance that is not expected, affects the proper functioning of the module, and may lead to the user receiving wrong information or a false impression that something is working. The system is not able to resolve it on its own. The module cannot continue checking that particular part, but the rest of the module is not affected and continues with its other checks.
Examples:
- The configured serial interface is wrong but there is more than one of them in the system.
- A host address is unreachable and has never been seen alive before
- A host that stays offline for a longer period of time
CRITICAL
An error from which the module is not able to recover, and which is expected to affect unrelated checks, too. This might lead to information loss.
Examples:
- Out of memory
- Corrupted database file
- No mosquitto broker found
- Unhandled exceptions
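For the last example, unhandled exceptions can be routed into the log through the standard `sys.excepthook` mechanism so that they appear as CRITICAL entries with a traceback. This is a sketch under assumptions: the function name `install_critical_excepthook` is hypothetical, and whether the modules install such a hook is not stated here.

```python
import logging
import sys


def install_critical_excepthook(logger):
    """Log any exception that reaches the top level as CRITICAL, with traceback."""

    def hook(exc_type, exc_value, exc_traceback):
        if issubclass(exc_type, KeyboardInterrupt):
            # Leave Ctrl+C to the default handler instead of logging it.
            sys.__excepthook__(exc_type, exc_value, exc_traceback)
            return
        logger.critical("Unhandled exception",
                        exc_info=(exc_type, exc_value, exc_traceback))

    sys.excepthook = hook
    return hook
```

Note that `sys.excepthook` only covers the main thread's top level; exceptions in worker threads need separate handling.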
DEBUG
Information that the developer considers necessary to debug the module. In this mode we show as much information as is required to see exactly what is happening at every point.