Now that assemblers are represented via the `Stage` class, the
Assembler class is not needed anymore. Adjust the monitor method
to take an `pipeline.Stage` for the `assembler` method as well.
Instead of using the `Assemblers` class to represent assemblers,
use the `Stage` class: The `Pipeline.add_assembler` method will
now instantiate and `Stage` instead of an `Assembler`. The tree
that the pipeline built is converted to an Input (while loading
the manifest description in `format/v1.py`) and all existing
assemblers are converted to use that input as the tree input.
The assembler run test is removed as the Assembler class itself
is not used (i.e. run) anymore.
Instead of creating the temporary directory for the BuildRoot and
the sources output directly at the store root, create them inside
the store's temporary directory.
Support for inputs. Before the stage is executed all inputs of the
stage are run. The returned path is mapped inside the sandbox and
pass, along with the returned data, as part of the arguments to the
stage via new "inputs" dictionary. They keys represent the input
keys as given in the manifest.
Inputs are modules like Stages, Assemblers and Sources. Add them
as a new module klass to the various functions. Include them in
the schema test, so the schema of all inputs is validated.
Also sort the module classes alphabetically in the class mapping
and class list.
A pipeline input provides data in various forms to a `Stage`, like
files, OSTree commits or trees. The content can either be obtained
via a `Source` or have been built by a `Pipeline`. Thus an `Input`
is the bridge between various types of content that originate from
different types of sources.
The acceptable origin of the data is determined by the `Input`
itself. What types of input are allowed and required is determined
by the `Stage`.
To osbuild itself this is all transparent. The only data visible to
osbuild is the path. The input options are just passed to the
`Input` as is and the result is forwarded to the `Stage`.
The StoreServer and corresponding Client provide access to small
subset of the store methods to other process than the main osbuild one.
Currently it can be used to read trees of objects given their id and
create temporary directories within the store's tmp path.
The lifetime of the result of both operations are bound to the Server.
Now that the `Stage` contains the `ModuleInfo`, which contains the
path the to executable, this can directly be used to execute the
stage. To do so, the path to the executable is bind-mounted to a
well known path inside the sandbox (`/run/osbuild/bin/$id`) and
this is then supplied to the build root as executable to run.
Add a new `info` property that holds the `meta.ModuleInfo` info
for the stage. This gives each instance of a stage access to
meta (or class) information about it, i.e. its schema, docs but,
more importantly, also its name and path to the executable.
Thefore the `name` property is coverted into a transient property
which access the `name` member of `info`.
Change the `formats/v1` load mechanism to carry a new `index`
argument which is used to load the `ModuleInfo` for each stage.
Adapt all tests to load the info as well when creating stages.
Add the path to the executable for that module to the ModuleInfo.
This can then later be used to actually execute said module. The
information is already readily available since we used the path
to load the information from the file in the first place.
Instead of carrying around the `sources_options` parameter
through the recursive `load` and `load_build` calls, set
the sources options after loading has completed by iterating
through all stages of all pipelines.
All tests and invocations of `add_stage` actually pass a valid
options dictionary. Thefore move the `options` args before
the `sources` arg and remove the default value (`None`).
Instead of having build pipelines nested within the pipeline it is
the build pipeline for, the nested structure is transferred into a
flat list of pipelines. As a result the recursion is gone and all
the pipelines and trees are build one after the other. This is now
possible since floating objects are kept alive by the store itself
and all trees that are being built are transparently via them.
The immediate result dictionary changed accordingly. To keep the
JSON output of osbuild the same, the result is now routed through
a format specific converter.
Additionally, the v1 format module gained a function to retrieve
the global tree_id and output_id. With the new models those global
ids will go away eventually and thus need to go through the format
specific code.
This is a step towards generic pipelines, i.e. replacing assemblers
with pipelines, thus creating an acyclic graph of pipelines. There
the pipeline id will be what is now the tree_id. For now though the
generic id is either the output_id or the tree_id.
The objectstore always tracked all objects that were returned from
it, but it did so via weak references, which means it did not keep
the objects alive itself. With the introduction of identifiers for
temporary objects (floating objects), it makes sense to keep all
created objects alive so that they can in fact be used.
A "floating" object is a temporary object that is identified, i.e.
has an `id` and is thus also locked, but is not committed to the
store.
The `contains` and `get` methods of ObjectStore will now return such
floating objects as if they were committed ones, provind transparent
access to object that have been built during the exectuin of osbuild.
The current pipeline code used to set a base for a tree object
that might or might not exist. Depending on it it would either
use that object or reset its base. Avoid doing that because it
prohibits us from properly interpreting the `id` of an object
if the latter is also set when `base_id` is assigned, since
that base might not exist and thus the `id` would not actually
mean that the the contents of tree associated with the object.
Therefore we use `ObjectStore.get` and return the result if it
is not None or a fresh Object otherwise.
Every time a stage has been successfully built, the contents of
the tree now corresponds to the stage and can thus be identified
via the id of the stage.
When the tree is being written to, i.e. on consecutive attempts
of stage builds, the `id` of the tree object will automatically
be reset.
This adds a new `id` property to the ObjectStore.Object, that is
meant to reflect the identifer of the Stage to build the contents
of it. This will help to transparently access objects that have
been built but not committed to the store.
Setting the `base_id` of an object will also set its `id`. When
the object is then modified via write() the `id` will be set to
None, since no the content and the id are out of sync. In the
same way, restting an object will reset its `id` to None.
The object in question will be cleaned when the store goes out of
context, which happens soon after the manual cleanup anyway and
the eager cleanup does not gain us much.
More importantly, it removes the special case for the assembler
output object, since trees build by the stages are not cleaned
up manually already.
Instead of passing the store directory to Pipeline.run, pass an
already initialized ObjectStore object. This binds the lifetime
of the store and its (temporary) objects to the run of osbuild
not the run of the pipeline.
This prepares re-using the stores with multiple (non-nested)
pipelines.
Instead of a pipeline, describe now takes a Manifest instance.
The reason is that a manifest fully describes the build, which
includes the sources. Now that the describe function takes the
manifest, the sources can be included as well.
Adapt the tests to refelect that change.
The 'Manifest' class represents what to build and the necessary
sources to do so. For now thus it is just a combination of the
pipeline the source options.
The description of a pipeline is format dependent and thus needs
to be located at the specific format module.
Temporarily remove two tests; they should be added back to a format
specific test suit.
Instead of having the pipeline and the source option as separate
arguments, the load function now takes the full manifest, which
has those two items combined.
Instead of importing the load, load_build functions into the osbuild
namespace and using it via that, use the load function via the module
that provides them, i.e. the formats.v1 module.
The validation of the manifest descritpion is eo ipso format
specific and thus belongs into the format specific module.
Adapt all usages throughout the codebase to directly use the
version 1 specific function.
Extract the code that loads a pipeline from a pipeline description,
i.e. a manifest, into a new module inside a new 'formats' package.
The idea is to have different descriptions, i.e. different formats,
for the same internal representation. This allows changing the
internal representation, i.e. data structures, but still having the
same external description.
Later a new description might be added that better matches the new
internal representation.
This was deprecated in favor of always having the source in the
manifest. Remove the command line option and the corresponding
code that would override the sources definitions.
Update the docs accordingly.
The "/run/osbuild" path is used as the default runpath by the
BuildRoot, which creates it on demand. The only other place
is the API (`BaseAPI`) to create the socket directories in,
but that is now also created on-demand. Additionally, the
API are only run after the build root has been set up so that
directory would already exist.
When creating the socket directory, i.e. in the case that it was
not specified directly, ensure the parent directories exist.
Make it possible to override that parent directory.
Explicit re-raise the BufferError exception in recv from the orignal
JSONDecodeError, so the latter gets recorded as the underlying cause.
Uncovered by pylint 2.6.0: W0707: "Not using raise from makes the
traceback inaccurate, because the message implies there is a bug in
the exception-handling code itself, which is a separate situation
than wrapping an exception."
All runners stopped calling `api.setup_stdio` (commit c40b414), and
thus all output of runners and also modules is now redirected to a
pipe (created via Popen and subprocess.PIPE for stdout).
Text was read from that pipe via `stdout.read(4096)`, which means
that it is now buffered in chunks of 4096, where it previously was
line buffered in the case that osbuild was run in the terminal and
--json was not specified. This is very annoying for anyone wanting
to follow osbuild's output in real-time.
Restore the previous behavior by using `os.read`, which should be
a small wrapper around read(3), which does not block until all the
requested data is available but returns early (short reads). This
means, new text will be forwarded as soon is it is available in the
pipe. Increase the read buffer to 32768 while at it, which is what
Popen is using in Python 3.9.
osbuild_cli() sometimes returned an exit code, but at the end called
sys.exit() directly. The idea was probably to always return the code
with which the executable should exit.
Make this consistent and call sys.exit() in __main__.py, with the value
returned by osbuild_cli().
Metadata information can easily become very big, like in the case
of the package metadata of the org.osbuild.rpm stage, quite likely
exceeding the configured maximum package length of the underlying
socket. To avoid potential issues here, transfer the actual data
by writing it to a temporary file and sending a open fd over.
Extract the existing code that creates the runner for the host
build container into a small helper method, so it can be re-used
in other places, like the tests.
Use `traceback.print_tb()` to serialize the exceptions' backtrace.
The previously used expression `str(e.__traceback__)` will just
give `<traceback object at 0x…>`, which is not very helpful.
Add a test to check that the method name that raises the exception,
also called `exception`, is in the traceback.
When using `str(type(exception))` this ends up to be something like
`<class 'ValueError'>` for a `ValueError` exception. Get the vanilla
name of the exception type via `type(exception).__name__`.
Add a test to ensure that we encode this properly.