Logging
The following verbosity levels are available:
- quiet: display only errors
- normal: display informational messages and warnings, as well as errors
- verbose: everything above plus debug information
Logging always goes to the console, and its verbosity can be adjusted in log/console_level.
Logging to a file is enabled when a log directory with proper permissions is given in log/file/directory. Its verbosity can be adjusted in log/file/level.
Scheduling and Processing
The spooling/watch_path directory is watched (using inotify).
If spooling/file_name_pattern is set to a regex, then the file name must match it, otherwise the file is rejected. If the regex contains named capture groups, then the captured data is associated with the file for later pattern replacement.
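As an illustration, the matching step could look like the following Ruby sketch. The pattern, the group names (`customer`, `job_id`) and the helper `match_file_name` are hypothetical and not taken from the program itself.

```ruby
# Hypothetical pattern; a real one would come from spooling/file_name_pattern.
FILE_NAME_PATTERN = /\A(?<customer>[a-z]+)_(?<job_id>\d+)\.xml\z/

# Returns a Hash of the named captures, or nil when the name does not
# match (in which case the file would be rejected).
def match_file_name(file_name, pattern)
  m = pattern.match(file_name)
  m && m.named_captures
end

match_file_name("acme_42.xml", FILE_NAME_PATTERN)
# => {"customer"=>"acme", "job_id"=>"42"}
match_file_name("notes.txt", FILE_NAME_PATTERN)
# => nil
```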
If spooling/file_content_patterns is set to a list of regexes, then the file is read and each regex is tested against each line of the file. Unlike spooling/file_name_pattern, these regexes are not required to match; they are only used to fetch data through named capture groups. If file_content_patterns_case_insensitive is set, then the regexes ignore case when matching.
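A minimal sketch of this content scan is shown below. The helper name `fetch_content_data` is hypothetical, and letting a later match overwrite an earlier one for the same group name is an assumption, not documented behaviour.

```ruby
# Test every pattern against every line and collect the named captures.
# A pattern that never matches is simply ignored.
def fetch_content_data(lines, patterns, case_insensitive: false)
  options = case_insensitive ? Regexp::IGNORECASE : 0
  regexes = patterns.map { |source| Regexp.new(source, options) }
  data = {}
  lines.each do |line|
    regexes.each do |regex|
      m = regex.match(line)
      # Assumption: a later match overwrites an earlier one.
      data.merge!(m.named_captures) if m
    end
  end
  data
end

lines = ["Subject: Invoice", "priority: HIGH"]
patterns = ['^subject: (?<subject>.+)$', '^priority: (?<priority>\w+)$', '^owner: (?<owner>\w+)$']

fetch_content_data(lines, patterns, case_insensitive: true)
# => {"subject"=>"Invoice", "priority"=>"HIGH"}
```

The `owner` pattern matches no line, so it contributes nothing and the file is still accepted.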
A file is considered unique based on its key, which is by default its file name. If you need a more flexible method, you can override the spooling/file_keys list of strings, with each %<var>% being replaced by previously fetched data. The first resulting string in the list to be fully determined (all %<var>% are defined) will be the final key; this allows for fallbacks when the file content does not always contain all patterns.
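The fallback rule can be sketched as follows. This assumes a placeholder is written as a variable name between percent signs (for example `%customer%`); the helper `resolve_key` and its `nil` result when no template resolves are hypothetical.

```ruby
# Return the first template whose placeholders are all defined in data,
# with the placeholders substituted; nil if no template is fully determined.
def resolve_key(templates, data)
  templates.each do |template|
    missing = false
    key = template.gsub(/%(\w+)%/) do
      value = data[Regexp.last_match(1)]
      missing = true if value.nil?
      value.to_s
    end
    return key unless missing
  end
  nil
end

templates = ["%customer%-%job_id%", "%customer%"]

resolve_key(templates, { "customer" => "acme", "job_id" => "42" })  # => "acme-42"
resolve_key(templates, { "customer" => "acme" })                    # => "acme"
```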
If two files have the same key, they are considered identical, and the second file is deleted instead of being scheduled. You may instead want a second file to cancel a pending operation, which can be done by setting spooling/cancel_key to a string built like the spooling/file_keys entries. In this case, if both files have the same key but different cancel keys, both are deleted (if the cancel keys are identical, the new file is considered a duplicate and deleted as usual).
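The decision table above can be written down as a short sketch. The `scheduled` Hash, the helper `handle_incoming` and the returned symbols are illustrative names only; the program's internal bookkeeping may differ.

```ruby
# scheduled maps a file key to the cancel key of the file already
# scheduled under it (nil when spooling/cancel_key is not used).
def handle_incoming(scheduled, key, cancel_key = nil)
  unless scheduled.key?(key)
    scheduled[key] = cancel_key
    return :scheduled
  end

  if scheduled[key] == cancel_key
    :duplicate_deleted            # same key, same cancel key: drop the new file
  else
    scheduled.delete(key)         # same key, different cancel key: drop both
    :both_deleted
  end
end

scheduled = {}
handle_incoming(scheduled, "acme-42", "create")  # => :scheduled
handle_incoming(scheduled, "acme-42", "create")  # => :duplicate_deleted
handle_incoming(scheduled, "acme-42", "cancel")  # => :both_deleted
```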
If spooling/min_age is set to a number of seconds, then a new file won't be considered for scheduling until it is old enough. If spooling/max_age is set to a number of seconds, then a file this old will be rejected and deleted.
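A possible reading of the two age limits is sketched below. Measuring the age from the file's modification time, the exact boundary comparisons, and the helper `age_status` are assumptions.

```ruby
# Classify a file by age (in seconds). Either limit may be nil (unset).
def age_status(mtime, now, min_age: nil, max_age: nil)
  age = now - mtime
  return :rejected if max_age && age >= max_age    # too old: reject and delete
  return :too_young if min_age && age < min_age    # not yet eligible for scheduling
  :ready
end

now = Time.at(1_000)
age_status(Time.at(995), now, min_age: 10, max_age: 3600)  # => :too_young
age_status(Time.at(900), now, min_age: 10, max_age: 3600)  # => :ready
```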
Each time a file comes in or finishes processing, and every spooling/reschedule_interval seconds, the program tries to process scheduled files. It runs at most spooling/max_concurrency file-processing jobs at the same time.
File processing is done by invoking spooling/command. The fetched file data can be used to provide arguments to the command using %<var>% patterns as seen previously.
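Building the command line from the fetched data might look like this sketch. The command template and the helper `build_command` are hypothetical, placeholders are assumed to be variable names between percent signs, and shell-escaping the values is a precaution added here; whether the program itself escapes them is not stated.

```ruby
require "shellwords"

# Substitute each placeholder with its shell-escaped value.
# Raises KeyError when a placeholder has no fetched value.
def build_command(template, data)
  template.gsub(/%(\w+)%/) do
    Shellwords.escape(data.fetch(Regexp.last_match(1)))
  end
end

build_command("/usr/local/bin/process --customer %customer% --job %job_id%",
              { "customer" => "acme corp", "job_id" => "42" })
# => "/usr/local/bin/process --customer acme\\ corp --job 42"
```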
Load Check
If you want processing to depend on an external condition, for example to process files only when the system load is not too high, you can specify either an external script or a Ruby plugin in load_check/script. The method (exec or plugin) must be set in load_check/type. The check is invoked every load_check/interval seconds (defaults to 120s).
An external script must echo "0" or "1", and a Ruby plugin must return a boolean. "1" or true means everything is all right and files will be processed, while "0" or false means no job will be scheduled. If the script returns a non-zero exit code or echoes an unexpected string (or nothing), or if the Ruby plugin raises an exception, the load status is not updated. The load is considered to be OK at startup.
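The rules for an external script can be summarised in a small sketch. The helper names are hypothetical, and stripping surrounding whitespace (such as the newline that `echo` appends) before comparing is an assumption.

```ruby
require "open3"

# Translate a script run into a new load status.
# The previous status is kept on failure or unexpected output.
def interpret_script_result(stdout, exit_success, current_status)
  return current_status unless exit_success
  case stdout.strip
  when "1" then true
  when "0" then false
  else current_status
  end
end

# Run the script given by path and update the status accordingly.
def run_load_check(script_path, current_status)
  stdout, status = Open3.capture2(script_path)
  interpret_script_result(stdout, status.success?, current_status)
end

load_ok = true                                          # OK at startup
load_ok = interpret_script_result("0\n", true, load_ok) # => false
load_ok = interpret_script_result("", true, load_ok)    # => false (unchanged)
```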
A Ruby plugin must provide a 'check' method in the top namespace. It will be loaded into an anonymous Module and turned into a module function. You are free to load external libraries and to create other methods or inner classes/modules, but the simpler the better.
The script or plugin may run for a long time without endangering file processing. Nevertheless, no more than one check runs at the same time, so if it never ends the load status will remain unchanged forever.
Updated by Marc Dequènes over 6 years ago · 2 revisions