Commit graph

120 commits

Author SHA1 Message Date
Christian Kellner
1ed85dc790 inputs: convert to host service
Create an `InputService` class with an abstract method called `map`,
meant to be implemented by all inputs. An `unmap` method may
optionally be overridden by inputs to clean up resources.
Instantiate a `host.ServiceManager` in the `Stage.run` section and
pass it to the host-side input code so it can be used to spawn the
input services.
Convert all existing inputs to the new service framework.
2021-06-09 18:37:47 +01:00
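The class layout described in this entry can be sketched as follows. This is only an illustration: the `map` signature (`target`, `refs`) and the `EchoInput` subclass are hypothetical, and the real base class in osbuild may look different.

```python
import abc


class InputService(abc.ABC):
    """Illustrative stand-in for the base class described above."""

    @abc.abstractmethod
    def map(self, target, refs):
        """Make the requested resources available under `target`."""

    def unmap(self):
        """Optional hook to release resources; a no-op by default."""


class EchoInput(InputService):
    """Hypothetical input that only reports what it was asked to map."""

    def map(self, target, refs):
        return {"path": target, "refs": list(refs)}
```

Because `map` is abstract, `InputService` itself cannot be instantiated, while a subclass that does not override `unmap` simply inherits the no-op.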
Christian Kellner
08bc9ab7d8 inputs: pre-defined input paths
Instead of bind-mounting each individual input into the container,
create a temporary directory that is used by all inputs and bind-
mount this to the well known location ("/run/osbuild/inputs"). The
temporary directory is then passed to the input so that it can
make the requested resources available relative to that directory.
This is enforced by the common input handling code.
Additionally, pass the well known input path via a new "paths" key
to the arguments dictionary passed to the stage.
2021-06-09 18:37:47 +01:00
Christian Kellner
ef5e9364bb inputs: make inputs aware of their names
The name of the input here refers to its id within the manifest. This
is unique per stage and thus identifies an input for a given stage.
2021-06-09 18:37:47 +01:00
Christian Kellner
c9327a7a79 pipeline: remove left-over temp directory
The source temporary directory was left over from the time when
stages were using the source server API.
2021-06-09 18:37:47 +01:00
Christian Kellner
f1b406a774 pipeline: remove sources server
All sources are now pre-fetched before any pipeline and thus any
stage is being built. Additionally, in the version 1 format, all
stages that were using sources are converted to use inputs when
the manifest is loaded. Thus, nothing should use `source.get`,
and therefore the sources API (`SourcesServer`), anymore.
2021-04-29 12:58:01 +02:00
Christian Kellner
47a81ff3ed pipeline: ability to checkpoint by pipeline name
Since pipelines can now be uniquely addressed via their names,
add the ability to checkpoint via the pipeline name. This will
effectively checkpoint the last stage of a pipeline.
For format v1 manifests, the build pipeline is called "build",
the main pipeline is called "tree" and the pipeline for the
assembler is called "assembler".
2021-02-19 14:42:32 +00:00
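A minimal sketch of the checkpoint rule described in this entry, assuming a hypothetical representation where each pipeline is just a name mapped to an ordered list of stage ids (this is not osbuild's actual data model):

```python
def mark_checkpoints(pipelines, checkpoints):
    """Return the set of stage ids that should be checkpointed.

    `pipelines` maps a pipeline name to its ordered list of stage ids;
    `checkpoints` holds stage ids and/or pipeline names.
    """
    marked = set()
    for name, stages in pipelines.items():
        # A stage named directly by its id is checkpointed as before.
        marked.update(stage for stage in stages if stage in checkpoints)
        # Naming a pipeline checkpoints the last stage of that pipeline.
        if name in checkpoints and stages:
            marked.add(stages[-1])
    return marked
```

For example, `mark_checkpoints({"build": ["s1", "s2"], "tree": ["s3"]}, {"build"})` marks only `s2`.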
Christian Kellner
931eac23c3 sources: introduce source items
All sources fetch various types of `items`, the specific nature
of which is dependent on the source type, but they are all
identifiable by an opaque identifier. In order for osbuild to
check that all the inputs that a stage needs are indeed
contained in the manifest description, osbuild must learn what
ids are fetched by what source. This is done by standardizing
the common "items" part, i.e. the "id" -> "options for that id"
mapping that is common to all sources.
For the version 1 of the format, extract the files and ostree
item information from the respective options.
Adapt the sources (files, ostree) so that they use the new items
information, but also fall back to the old style; the latter is
needed since the sources tests still use the SourceServer.
2021-02-10 15:44:24 +01:00
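The check that this standardization enables can be sketched like this. The dictionary layout (a per-source `"items"` mapping) follows the description above, but the function name and the sample ids are made up for illustration:

```python
def missing_items(sources, references):
    """Return the referenced ids that no source declares in its items."""
    known = set()
    for source in sources.values():
        # "items" is the common "id" -> "options for that id" mapping.
        known.update(source.get("items", {}))
    return sorted(set(references) - known)


# Hypothetical source descriptions with made-up item ids.
sources = {
    "files": {"items": {"sha256:aaa": {"url": "https://example.com/a"}}},
    "ostree": {"items": {"commit-1": {"remote": "example"}}},
}
```

With these sample sources, a stage referencing `sha256:aaa` and `sha256:bbb` would be reported as missing `sha256:bbb`.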
Christian Kellner
9e8b254687 stage: introduce add_input method
In much the same way as `Pipeline` has `add_stage` and `Manifest`
has `add_pipeline`, introduce an `add_input` method to `Stage` to
be able to add `Inputs` to stages.
2021-02-06 12:04:30 +01:00
Christian Kellner
12c9173faf manifest: use newly introduced Source class
Add a new `add_source` method that will add an individual `Source`
to a `Manifest` given its `ModuleInfo` and options. The dictionary
of source options in the manifest is replaced with a list of such
`Sources` and `add_source` will append to it. Adapt the version 1
format code to use `add_source` and reconstruct the source options
from the list of sources on `describe`.
Remove the `sources_options` constructor parameter for `Manifest`
and adapt the whole source base for this.
2021-02-06 12:04:30 +01:00
Christian Kellner
0fba88d7a9 stage: include inputs in id calculation
Include all inputs of a stage during the calculation of its id,
since they determine, very much like options, the content the
stage produces; thus different inputs should lead to different
ids.
2021-01-22 15:03:19 +01:00
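The idea can be illustrated with a small hashing sketch. The exact fields and the serialization osbuild uses are not given here, so the payload layout below (name, base, options, inputs, serialized as sorted JSON and hashed with SHA-256) is an assumption:

```python
import hashlib
import json


def stage_id(name, base, options, inputs):
    """Derive an id from everything that determines the stage's output."""
    payload = {
        "name": name,
        "base": base,
        "options": options,
        "inputs": inputs,  # included so that different inputs change the id
    }
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```

Two stages with identical options but different inputs then get different ids, while identical descriptions always hash to the same id.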
Christian Kellner
c6432d0adb pipeline: replace tree_id with id
Now that `Pipelines` have no assemblers anymore and thus only one
identifier, i.e. the one corresponding to the tree (`tree_id`),
the `id` and `tree_id` are now the same. Therefore replace the
usage of `tree_id` with `id` and drop the former. Add some extra
documentation including some caveats about the uniqueness of `id`.
2021-01-22 15:03:19 +01:00
Christian Kellner
53e9ec850b osbuild: assemblers are pipelines now
Convert the assembler phase of the main pipeline in the old format
into a new Pipeline that has the assembler as a stage, where the
input of that stage is the main pipeline. This removes the need of
having "assemblers" as special concepts and thus the corresponding
code in `Pipeline` is removed. The new assembler pipeline is marked
as exported, but the pipeline that builds the tree is not anymore.
Adapt the `describe` and `output` functions of the `v1` format to
handle the assembler pipeline. Also change the tests accordingly.

NB: The id reported for the assembler via `--inspect` and the result
will change as a result of this, since the assembler stage is now
the first and only stage of a new pipeline and thus has no base
anymore.
2021-01-22 15:03:19 +01:00
Christian Kellner
91aa0c6e88 manifest: add __contains__ method
Since `__iter__` returns an iterator over the `Pipeline` objects,
the `"name" in manifest` check would not work for names or ids. Thus
provide an implementation of `__contains__` that does exactly that.
2021-01-22 15:03:19 +01:00
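The underlying Python behaviour: when a class defines `__iter__` but not `__contains__`, the `in` operator iterates and compares each element against the operand, so a string is never found among `Pipeline` objects. The classes below are simplified stand-ins, not the actual osbuild classes:

```python
class Pipeline:
    def __init__(self, name, id):
        self.name = name
        self.id = id


class IterOnlyManifest:
    """Only iterable: `in` compares the operand to Pipeline objects."""

    def __init__(self, pipelines):
        self.pipelines = {p.name: p for p in pipelines}

    def __iter__(self):
        return iter(self.pipelines.values())


class Manifest(IterOnlyManifest):
    """Adds a membership test that matches on name or id."""

    def __contains__(self, name_or_id):
        return any(name_or_id in (p.name, p.id)
                   for p in self.pipelines.values())
```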
Christian Kellner
18671686ee manifest: add get method to lookup pipelines
Add a new helper, `Manifest.get`, that will return a
pipeline given a name or an id, or `None` if no pipeline could
be found with either. The implementation is taken from the
existing `__getitem__` method and the latter is now based on
the new `get` method.
2021-01-22 15:03:19 +01:00
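A minimal sketch of the relationship between `get` and `__getitem__`, using simplified stand-in classes. The name-first, id-second lookup order follows the description in the "identify pipelines by name" entry; the storage layout and the `KeyError` message are assumptions:

```python
class Pipeline:
    def __init__(self, name, id):
        self.name = name
        self.id = id


class Manifest:
    def __init__(self):
        self.pipelines = {}

    def add_pipeline(self, pipeline):
        self.pipelines[pipeline.name] = pipeline
        return pipeline

    def get(self, name_or_id):
        """Return the pipeline for a name or an id, or None."""
        pipeline = self.pipelines.get(name_or_id)
        if pipeline is not None:
            return pipeline
        for pipeline in self.pipelines.values():
            if pipeline.id == name_or_id:
                return pipeline
        return None

    def __getitem__(self, name_or_id):
        # Same lookup as `get`, but a miss raises instead of returning None.
        pipeline = self.get(name_or_id)
        if pipeline is None:
            raise KeyError(f"'{name_or_id}' not found")
        return pipeline
```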
Christian Kellner
569345cc72 pipeline: identify pipelines by name
Every pipeline that gets added to the `Manifest` now needs to have
a unique name by which it can be identified. The version 1 format
loader is changed so that the main pipeline that builds the tree
is always called `tree`. The build pipeline for it will be called
`build` and further recursive build pipelines `build-build`, where
the number of repetitions of `build` corresponds to their level of
nesting. An assembler, if it exists, will be added as `assembler`.
The `Manifest.__getitem__` helper is changed so it will first try
to access pipeline via its name and then fall back to an id based
search. NB: in the degenerate case of multiple pipelines that have
exactly the same `id`, i.e. same stages, with the same options and
same build pipeline, only the first one will be returned; but only
the first one here will be built as well, so this is in practice
not a problem.
The formatter uses this helper to get the tree pipeline via its
name wherever it is needed.
This also adds an `__iter__` method to `Manifest` to ease iterating
over just the pipeline values, a la `for pipeline in manifest`.
2021-01-22 15:03:19 +01:00
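The naming rule for nested build pipelines can be written down in a couple of lines. The helper name `build_pipeline_name` is hypothetical; only the resulting names (`build`, `build-build`, ...) come from the description above:

```python
def build_pipeline_name(level):
    """Name of the build pipeline at nesting `level` (1 = direct build)."""
    if level < 1:
        raise ValueError("level must be at least 1")
    # One repetition of "build" per level of nesting, joined by dashes.
    return "-".join(["build"] * level)
```

So the build pipeline of `tree` is `build`, and the build pipeline of that build pipeline is `build-build`.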
Christian Kellner
88acd7bb00 pipeline: make them "exportable"
Add a new `export` property to the `Pipeline` object that indicates
whether the result, i.e. the tree after the pipeline has been
built, should be exported, i.e. copied to the output directory.
In the current format (v1), the main pipeline gets marked as such
by the corresponding loader.
2021-01-22 15:03:19 +01:00
Christian Kellner
42dd3c1e2d manifest: add and use add_pipeline method
Instead of passing all pre-created pipelines to the Manifest
constructor, add an `add_pipeline` method, analogous to the
existing `Pipeline.add_{stage, assembler}` methods. Convert
the format loading code to use that and remove the constructor
parameter.
2021-01-22 15:03:19 +01:00
Christian Kellner
6c02002cbd pipeline: remove Assembler class
Now that assemblers are represented via the `Stage` class, the
Assembler class is not needed anymore. Adjust the monitor method
to take a `pipeline.Stage` for the `assembler` method as well.
2021-01-19 10:42:26 +01:00
Christian Kellner
8ccc73d1c3 pipeline assemblers are stages now
Instead of using the `Assemblers` class to represent assemblers,
use the `Stage` class: The `Pipeline.add_assembler` method will
now instantiate a `Stage` instead of an `Assembler`. The tree
that the pipeline built is converted to an Input (while loading
the manifest description in `format/v1.py`) and all existing
assemblers are converted to use that input as the tree input.

The assembler run test is removed as the Assembler class itself
is not used (i.e. run) anymore.
2021-01-18 17:44:46 +01:00
Christian Kellner
ff7696a92e pipeline: return objects from add methods
Return the Assembler and Stage that got added from their respective
methods.
2021-01-18 17:44:46 +01:00
Christian Kellner
1a3c8e85c6 stage: temp dirs within store's tmp dir
Instead of creating the temporary directory for the BuildRoot and
the sources output directly at the store root, create them inside
the store's temporary directory.
2021-01-18 17:44:46 +01:00
Christian Kellner
0bb3121273 stage: provide loop-server to stages
This makes it possible for stages to create loop devices and further
aligns Stages and Assemblers.
2021-01-18 17:44:46 +01:00
Christian Kellner
1297922a57 stage: add support for inputs
Add support for inputs. Before the stage is executed, all inputs of
the stage are run. The returned path is mapped inside the sandbox and
passed, along with the returned data, as part of the arguments to the
stage via a new "inputs" dictionary. The keys represent the input
keys as given in the manifest.
2021-01-18 17:44:46 +01:00
Christian Kellner
93010c7e16 stage: use exist stack in the run method
Simplify context management in the `Stage.run` method by using an
`ExitStack` instead of a multiline `with` statement.
2021-01-18 17:44:46 +01:00
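As a generic illustration of the pattern (not the actual `Stage.run` code), `contextlib.ExitStack` lets several context managers be entered one by one and guarantees they are all exited, in reverse order, when the block ends. The two temporary directories below are placeholders for whatever resources the real method manages:

```python
import contextlib
import os
import tempfile


def run_stage_sketch():
    with contextlib.ExitStack() as stack:
        # Each resource is registered on the stack as it is created,
        # replacing one long multiline `with a as x, b as y:` statement.
        build_root = stack.enter_context(tempfile.TemporaryDirectory())
        inputs_dir = stack.enter_context(tempfile.TemporaryDirectory())
        existed = os.path.isdir(build_root) and os.path.isdir(inputs_dir)
    # Leaving the block cleaned up both directories.
    return existed, build_root, inputs_dir
```

A practical benefit is that resources can be added conditionally or in a loop, which a fixed `with` statement cannot express.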
Christian Kellner
d028ea5b16 stage: pass store instead of cache & var
Instead of passing separate `cache` and `var` variables, which are
both determined by the store, just pass the store.
2021-01-18 17:44:46 +01:00
Christian Kellner
de021b468a stage: use path meta info to run the stage
Now that the `Stage` contains the `ModuleInfo`, which contains the
path to the executable, this can directly be used to execute the
stage. To do so, the path to the executable is bind-mounted to a
well known path inside the sandbox (`/run/osbuild/bin/$id`) and
this is then supplied to the build root as executable to run.
2021-01-18 17:44:46 +01:00
Christian Kellner
7a6c2df910 stage: add module information about itself
Add a new `info` property that holds the `meta.ModuleInfo` info
for the stage. This gives each instance of a stage access to
meta (or class) information about it, i.e. its schema, docs but,
more importantly, also its name and path to the executable.
Therefore the `name` property is converted into a transient property
which accesses the `name` member of `info`.
Change the `formats/v1` load mechanism to carry a new `index`
argument which is used to load the `ModuleInfo` for each stage.
Adapt all tests to load the info as well when creating stages.
2021-01-18 17:44:46 +01:00
Christian Kellner
698635171c pipeline: refactor args for add_stage
All tests and invocations of `add_stage` actually pass a valid
options dictionary. Therefore move the `options` args before
the `sources` arg and remove the default value (`None`).
2021-01-18 17:44:46 +01:00
Christian Kellner
262877091f osbuild: flatten the pipeline
Instead of having build pipelines nested within the pipeline it is
the build pipeline for, the nested structure is transferred into a
flat list of pipelines. As a result the recursion is gone and all
the pipelines and trees are built one after the other. This is now
possible since floating objects are kept alive by the store itself
and all trees that are being built are transparently via them.
The immediate result dictionary changed accordingly. To keep the
JSON output of osbuild the same, the result is now routed through
a format specific converter.
Additionally, the v1 format module gained a function to retrieve
the global tree_id and output_id. With the new models those global
ids will go away eventually and thus need to go through the format
specific code.
2021-01-15 13:20:31 +01:00
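The transformation from a nested description to a flat build order can be sketched as below. The dictionary shape (a `name` plus an optional nested `build` entry) is a simplified assumption and not the actual manifest format:

```python
def flatten(description):
    """Turn nested build pipelines into a flat list in build order."""
    chain = []
    node = description
    while node is not None:
        chain.append(node["name"])
        node = node.get("build")  # the pipeline this one is built with
    # The innermost build pipeline has to be built first.
    chain.reverse()
    return chain


nested = {
    "name": "tree",
    "build": {"name": "build", "build": {"name": "build-build"}},
}
```

Building then becomes a plain loop over the returned list instead of a recursive descent.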
Christian Kellner
54761e8a13 pipeline: introduce generic pipeline id
This is a step towards generic pipelines, i.e. replacing assemblers
with pipelines, thus creating an acyclic graph of pipelines. There
the pipeline id will be what is now the tree_id. For now though the
generic id is either the output_id or the tree_id.
2021-01-15 13:20:31 +01:00
Christian Kellner
e24dfbd23f pipeline: don't use a non-existing base for trees
The current pipeline code used to set a base for a tree object
that might or might not exist. Depending on that, it would either
use that object or reset its base. Avoid doing that because it
prohibits us from properly interpreting the `id` of an object
if the latter is also set when `base_id` is assigned, since
that base might not exist and thus the `id` would not actually
describe the contents of the tree associated with the object.
Therefore we use `ObjectStore.get` and return the result if it
is not None or a fresh Object otherwise.
2021-01-15 13:20:31 +01:00
Christian Kellner
b039761544 pipeline: identify tree objects during build
Every time a stage has been successfully built, the contents of
the tree now corresponds to the stage and can thus be identified
via the id of the stage.
When the tree is being written to, i.e. on consecutive attempts
of stage builds, the `id` of the tree object will automatically
be reset.
2021-01-15 13:20:31 +01:00
Christian Kellner
a8783761a1 pipeline: don't eagerly clean up the final object
The object in question will be cleaned when the store goes out of
context, which happens soon after the manual cleanup anyway and
the eager cleanup does not gain us much.
More importantly, it removes the special case for the assembler
output object, since trees built by the stages are not cleaned
up manually already.
2021-01-15 13:20:31 +01:00
Christian Kellner
f38c48086e pipeline: run method takes store object not dir
Instead of passing the store directory to Pipeline.run, pass an
already initialized ObjectStore object. This binds the lifetime
of the store and its (temporary) objects to the run of osbuild
not the run of the pipeline.
This prepares for re-using the store with multiple (non-nested)
pipelines.
2021-01-15 13:20:31 +01:00
Christian Kellner
8d2c7f8160 osbuild: move mark_checkpoints to manifest
Make the checkpoint marking logic a method of the Manifest class.
2021-01-09 18:09:47 +01:00
Christian Kellner
945914b195 osbuild: introduce Manifest class
The 'Manifest' class represents what to build and the necessary
sources to do so. For now it is thus just a combination of the
pipeline and the source options.
2021-01-09 18:09:47 +01:00
Christian Kellner
4ab52c3764 formats: move pipeline description here
The description of a pipeline is format dependent and thus needs
to be located at the specific format module.
Temporarily remove two tests; they should be added back to a
format-specific test suite.
2021-01-09 18:09:47 +01:00
Christian Kellner
aaf61ce9fc formats: extract manifest loading into module
Extract the code that loads a pipeline from a pipeline description,
i.e. a manifest, into a new module inside a new 'formats' package.
The idea is to have different descriptions, i.e. different formats,
for the same internal representation. This allows changing the
internal representation, i.e. data structures, but still having the
same external description.
Later a new description might be added that better matches the new
internal representation.
2021-01-09 18:09:47 +01:00
Christian Kellner
27d4450352 pipeline: don't create "/run/osbuild" eagerly
The "/run/osbuild" path is used as the default runpath by the
BuildRoot, which creates it on demand. The only other user is
the API (`BaseAPI`), which creates its socket directories in it,
but that is now also done on demand. Additionally, the
APIs are only run after the build root has been set up, so that
directory would already exist.
2020-12-04 12:28:30 +01:00
Christian Kellner
e919f66609 pipeline: use osrelease.DEFAULT_PATHS
Use the newly defined constant that contains the well known paths
where to look for the `os-release` file.
2020-10-21 11:13:28 +02:00
Christian Kellner
807090f4c8 pipeline: introduce detect_host_runner helper
Extract the existing code that creates the runner for the host
build container into a small helper method, so it can be re-used
in other places, like the tests.
2020-10-21 11:13:28 +02:00
Christian Kellner
f5d00dd043 api: use more generic error member for exceptions
Rename the `API.exception` member to `API.error`, to make it more
generic, so it can also be used for other sort of errors in the
future. Also add a layer of additional structure with `type` and
`data` members so different types of errors can be told apart. Currently only
`exception` is used.
Adapt the tests in test/mod/test_api.py to check for the new
structure and its content.
2020-10-09 10:47:44 +02:00
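A sketch of what such a two-level structure might look like. Only the outer `type` and `data` members and the `exception` type come from the description above; the helper name and the fields inside `data` are assumptions made for illustration:

```python
import traceback


def error_from_exception(exc):
    """Wrap an exception in a generic {"type": ..., "data": ...} record."""
    return {
        "type": "exception",  # currently the only error type in use
        "data": {
            # Hypothetical payload; the real fields may differ.
            "type": type(exc).__name__,
            "value": str(exc),
            "backtrace": traceback.format_exception(
                type(exc), exc, exc.__traceback__
            ),
        },
    }
```

A consumer can then dispatch on the outer `type` without needing to know the layout of `data` for error kinds it does not handle.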
Chloe Kaubisch
5dc5ddcf29 api: add exception endpoint
Create a new api endpoint called exception, that communicates
exception backtraces separately back to osbuild, as opposed to
dumping them into the normal log. Additionally, add a corresponding
test to check that a call to api.exception correctly sets
API.exception.
2020-10-02 17:49:45 +02:00
chloenayon
b1229de56e pipeline: unify object exporting
Remove output.export and associated logic in pipeline.assemble.
Instead, return output or None, and export only once in pipeline.run.
2020-09-02 17:54:11 +02:00
Christian Kellner
499ae1654e osbuild: replace api.setup_stdio with BuildRoot
Now that the BuildRoot is capable of capturing the output of the
runner and modules (stages, assemblers), there is no need for
using `api.setup_stdio`. Therefore, drop it from all runners and
replace `api.output` with `BuildRoot.output`, which will contain
the output if `api.setup_stdio` is not called from the runners.
2020-08-31 15:06:36 +02:00
Christian Kellner
96a5499ed9 buildroot: log bubblewrap's output
In case bubblewrap fails, e.g. because it fails to execute
the runner, it will print an error message to stderr. Currently,
this output is not captured and thus not logged. To fix that, the
`BuildRoot.run` method now takes a monitor object and will stream
stdout/stderr to the log via the monitor.
2020-08-27 08:07:14 +02:00
chloenayon
3bf5d26c7a pipeline: replace objectstore logic with get call
In pipeline.run, replace calls to objectstore.contains
and objectstore.new with a call to objectstore.get, which
has the same functionality.
2020-08-26 15:10:12 +02:00
David Rheinsberg
803433fb62 api: prevent early output retrieval
Change the API endpoint to prevent retrieving monitor-output from a
running instance. Instead, we require the caller to exit the API context
before querying the monitor-output. This guarantees that the api-thread
was synchronously taken down and scheduled any outstanding events.

This fixes an issue where a side-channel notifies us of a buildroot
exit, but the api-thread has not yet returned from epoll, and thus might
not have dispatched pending I/O events, yet. If we instead wait for the
thread to exit, we have a synchronous shutdown and know that all
*ordered* kernel events must have been handled.

In particular, imagine a build-root program running (like `echo` in the
test_monitor unittest) which writes data to the stdout-pipe and then
immediately exits. The syscall-order guarantees that the data is written
to the pipe before the SIGCHLD is sent (or wait(2) returns). However, we
retrieve the SIGCHLD from our main-thread usually (p.join() in our test,
and BuildRoot() in our main code), while the pipe-reading is done from
an API thread. Therefore, we might end up handling the SIGCHLD first
(just imagine a single-threaded CPU that schedules the main task before
the thread). To avoid this race, we can simply synchronize with the
api-thread. Since we already have this synchronization as part of the
api-thread takedown, it is as simple as stopping the api-thread before
continuing with operations.

Lastly, if a write operation to a pipe was issued, we are guaranteed
that a SIGCHLD synchronization across processes is ordered correctly.
Furthermore, the python event-loop also guarantees that stopping an
event-loop will necessarily dispatch all outstanding events. A read is
guaranteed to be outstanding in our race-scenario, so the read will be
dispatched. The only possible problem is `_output_ready()` only
dispatching a maximum of 4096 bytes. This might need to be fixed
separately. A comment is left in place.
2020-08-13 14:02:27 +02:00
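The ordering argument can be demonstrated with a deliberately simplified stand-in: a plain reader thread instead of the API thread and its event loop, and a hypothetical `run_and_capture` helper. The point it illustrates is the same: after waiting for the child, the output is only read once the reader thread has been joined.

```python
import os
import subprocess
import sys
import threading


def run_and_capture(argv):
    read_fd, write_fd = os.pipe()
    chunks = []

    def reader():
        # Read until EOF, i.e. until every write end of the pipe is closed.
        with os.fdopen(read_fd, "rb") as stream:
            for chunk in iter(lambda: stream.read(4096), b""):
                chunks.append(chunk)

    thread = threading.Thread(target=reader)
    thread.start()
    proc = subprocess.Popen(argv, stdout=write_fd)
    os.close(write_fd)  # drop the parent's copy of the write end
    proc.wait()
    # The child has exited, but the reader thread may not have consumed
    # the data yet. Joining it first makes reading `chunks` safe.
    thread.join()
    return b"".join(chunks)
```

Reading `chunks` right after `proc.wait()` without the `join()` would be the racy variant: the data is already in the pipe, but nothing guarantees the reader thread has run.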
Christian Kellner
42b20638c0 pipeline: add metadata to the build result
Include metadata, optionally set by modules, in the build result.
2020-08-13 10:50:34 +02:00
chloenayon
fdaa2e1a66 osbuild: require output_directory
Make the output_directory argument in Pipeline.assemble
and Assembler.run required. The qemu assembler assumes
it is passed in args and will crash without it. Making
it mandatory prevents this.
2020-08-07 20:39:14 +02:00