Although duration of each stage could be calculated from the start and
end timestamps, it is more convenient to have it directly in the log
entry. This way we can avoid calculating it in the client code.
When fully cached manifest is used, the duration used to print an
incorrect value of 55 years:
python3 -m osbuild --libdir . ./test/data/manifests/fedora-boot.json
⏱ Duration: 1753191578s
This patch fixes the duration calculation to use the correct timestamp
from the manifest by using monotonic timer instead. Additionally, it
prints nothing when there was no module executed. Finally, it improves
the formatting of the duration output.
This commit limits the output in the json pipeline to a "reasonable"
length. We ran into issues (e.g. [0]) from a combination of a stage
that produce tons of output (dracut, ~256 kb, see issue#1976) and
the consumer ("images" osbuild/monitor.go) that used a golang scanner
with a max default buffer of 64kb before erroring. So limit it
here.
The stage result from via json is mostly for information and any error
will most likely at the end. Plus consumers can collect the individual
log lines on their own if desired via the "log()" messages that are
stream in "real-time" with the added benefit that e.g. timestamps
can be added to the logs etc.
[0] https://issues.redhat.com/browse/RHEL-77988
This commit fixes a race/threading issue with the way the monitor
works. The osbuild monitor can be called from multiple threads,
e.g. in buildroot.py:run() monitor.log() is called but also
in host.py:_stdout_ready(). This can lead to out-of-order writes
when many messages need to be processed.
We did not notice this so far because we were lucky and also
log was just used for information. But now it is used to transmit
the jsonseq data which means out-of-order communication results
in broken json.
Closes: https://github.com/osbuild/image-builder-cli/issues/110
This commit adds error reporting from source download errors
to the monitor. It reuses the `BuildResult` for symmetry but
we probably want to refactor this a bit to make source handling
a bit more similar to stages.
In order to avoid having to rely on the output of `osbuild --json`
when using `--progress=JSONSeqMonitor` the monitor needs to include
the `osbuild.pipeline.BuildResult` for each individual stage.
This commit adds those to the montior.
The existing code to record progress was a bit too naive. Instead
of just counting the number os pipelines in a manifest to get the
total steps we need to look at the resolved pipelines.
with this fix `bib` will report the correct number of steps left
when doing e.g. a qcow2 image build. Right now the number of
steps is incorrect because the osbuild manifest contains pipelines
for qcow2,vdmk,raw,ami and all are currently considered steps
that need to be completed. With this commit this is fixed.
Since we support python3.6 we cannot assume that dicts are ordered
in any way. To ensure the `id` is still always valid we pass
sort_keys=True to json.dump().
Thanks to Simon!
When an alternative monitor like JSONSeqMonitor is used there is
still non json output printed to stdout. This was a TODO but
this commit removes it because it's okay, there is the
"--monitor-fd" that should be used when using the json-seq monitor.
This commit is somewhat poor, sorry for that. It mostly adds
workaround so that the osbuild sources can emit some progress
reporting as well. Without that the user experience is rather poor
and there is a long delay before any sort of progress can be
reported (even before the normal stages run).
With it the user experience is still not good but slightly better,
i.e. the progress monitor will report that the sources have
started downloading and curl will generated some log output. No
real progress unfortunately (sources subprogress will jump from
zero to 100%).
Generate log messages with origin "org.osbuild.main" when
pipelines/stages start and finish. This way a higher level
frontend can display high level progress coming from this
origin and filter out e.g. stages based log messages (that
are usually quite technical as they are just stdout/stderr
from the stages).
The existing JSONSeqMonitor was saving/restoring the "origin"
when generating a new log-entry. This allows logging from
different origins (e.g. "org.osbuild.main") in a kind of
"out-of-band" fashion.
But this save/restore feels slightly inelegant because
JSONSeqMonitor feels like the wrong layer to deal with this.
This is why a new `with_origin()` helper is introduced that
will either reuse the existing context or create a new one
with the requested origin.
Tweak the Progress class to be simpler. Given that progress does
not need to support arbitrary depth but only has a single level
the class now just exposes "sub_progress" to the caller.
When the main progress is advanced the sub_progress is now fully
deleted instead of just reset. The rational is that when the main
progress is done and advances a step it is very likely that a
new sub_progress is required and it's most likely an error if
the same sub_progress will get re-used.
This means that `reset()` can be removed as it's not used anymore
(and YAGNI). We can add it back when we have a use-case.
It also change the code so that "total" starts with 0 instead
of `None` (principle of least surprise). This means that now
`progress.incr()` is called in the JSONSeqMonitor() for
`finish()` and `result()` to indicate that the pipeline/stage
is finished.
This commit tweaks Context a bit so that any write will automatically
reset the `_id`. This ensures that we do not forget to reset `_id`
when the code changes.
It also tweaks the naming a bit, before there was a "setter" for
origin and functions to set "pipeline" and "stage". They are all
functions now with a "set_" prefix for symetry mostly.
The class LogLine() is purely used as a dataclass with no state
and the only function on it is `as_dict()`. This got refactored
into a new function `log_entry()` because there is no need for
this to be a class. The function that takes the same inputs.
This changes the way monitors are initialized to always include
a `manifest` and now they will always log a `start` message.
This makes creation of a monitor symetric accross all monitors
again and no special cases for JSONSeqMonitor are required.
It means one needs to run the json-seq monitor with:
```
$ python3 -m osbuild --monitor=JSONSeqMonitor
```
Fwiw, I'm not 100% confident this is a win but it feels slightly
more right then the special cases in `main_cli.py` that is
replaces.
New monitor type that emits a JSON object for each log message.
Unlike other monitors:
- The constructor takes a build manifest as argument to initialise the
Pipeline and Stage counts for the Progress part of the report.
- It doesn't print a 'result' at the end of the build.
- The log() method that prints a log message supports specifying an
origin to override the default that's set by the constructor.
Although the Logline supports reporting errors separately, this isn't
used yet.
Signed-off-by: Achilleas Koutsou <achilleas@koutsou.net>
Foundation for new monitor type that will emit a JSON object for each
log message. The following classes are defined:
- LogLine: The top-level object that can be serialised into a single
object containing a message and associated metadata.
- Context: Contextual information for a log line message. Describes the
origin of the message and the current pipeline and stage.
Automatically deduplicates this information using a hash/ID: keeps a
history of IDs and omits the context when the context is not new.
- Progress: Information on the progress of the build. The object is
recursive: can contain a sub-progress for nested progress reporting
(pipelines > stages > ...).
Signed-off-by: Achilleas Koutsou <achilleas@koutsou.net>
Now that assemblers are represented via the `Stage` class, the
Assembler class is not needed anymore. Adjust the monitor method
to take an `pipeline.Stage` for the `assembler` method as well.
Allow a user to see the duration for each step in the osbuild pipeline.
This allows a user to optimize the build system for the best performance
and identify performance bottlenecks.
Signed-off-by: Major Hayden <major@redhat.com>
Introduce the concept of pipeline monitoring: A new monitor class is
passed to the pipeline.run() function. The main idea is to separate
the monitoring from the code that builds pipeline. Through the build
process various methods will be called on that object, representing
the different steps and their targets during the build process. This
can be used to fully stream the output of the various stages or just
indicate the start and finish of the individual stages.
This replaces the 'interactive' argument throughout the pipeline
code. The old interactive behavior is replicated via the new
`LogMonitor` class that logs the beginning of stages/assembler,
but also streams all the output of them to stdout.
The non-interactive behavior of not reporting anything is done by
using the `NullMonitor` class, which in turn outputs nothing.