When calculating the checksum of a stage, the mount options were
not included. This may have been deliberate, because if the mounts
of a stage change, it is very likely that previous stages change
too. But the introduction of non-device mounts, like
ostree.deployment, has changed the situation, since the content of
the tree will differ depending on whether that mount is applied or
not. And even for device-based mounts the tree will change if,
e.g., a device is mounted at a different path but is otherwise
formatted with the very same options. In the worst case we miss a
few cache hits due to changes in the mount setup that don't lead
to tree changes, but that will rarely happen in practice.
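A minimal sketch of the idea; `calc_id` and the attribute names are
hypothetical stand-ins for the actual osbuild code:

```python
import hashlib
import json

def calc_id(stage):
    # Hash the full stage description; `mounts` is now part of the data.
    data = {
        "name": stage.name,
        "base": stage.base,
        "options": stage.options,
        "mounts": [mount.describe() for mount in stage.mounts],
    }
    blob = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()
```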
We need to initialize `schema` to `None`, otherwise looking up
invalid schemata accesses an uninitialized variable:
[...]
File "[...]/osbuild/meta.py", line 583, in get_schema
schema = Schema(schema, name or klass)
UnboundLocalError: local variable 'schema' referenced before assignment
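A minimal sketch of the fix; the lookup logic and the `SCHEMATA`
table are hypothetical stand-ins for the real code in `meta.py`:

```python
from typing import Dict, Optional

from osbuild.meta import Schema

SCHEMATA: Dict[str, Dict] = {"org.osbuild.rpm": {"type": "object"}}

def get_schema(klass: str, name: Optional[str] = None) -> Schema:
    schema = None  # ensure `schema` is bound even for unknown schemata
    if klass in SCHEMATA:
        schema = SCHEMATA[klass]
    return Schema(schema, name or klass)
```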
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
The `schema` argument of `Schema.__init__` is a Python-native
representation of a JSON object, so it can be any kind of
dictionary. Furthermore, it is optional.
Fix the type annotation to be `Optional[Dict]`.
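A sketch of the resulting signature; the body and the default for
`name` are illustrative:

```python
from typing import Dict, Optional

class Schema:
    def __init__(self, schema: Optional[Dict] = None, name: str = ""):
        self.data = schema
        self.name = name
```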
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Make sure all --help output is consistent. In this particular case,
each line should consistently start with a lower-case character and
avoid a leading `the`.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
This adds an `osbuild --version` option that prints the osbuild
version in use. It allows users to confirm that their osbuild is
recent enough to use newer features.
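A minimal sketch of how this could be wired up with argparse,
assuming the package exposes a `__version__` attribute:

```python
import argparse

import osbuild  # assumption: the package exposes __version__

parser = argparse.ArgumentParser(prog="osbuild")
parser.add_argument("--version", action="version",
                    version="%(prog)s " + osbuild.__version__)
```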
Introduce a new class to manage inputs, `InputManager`, and move
the code to map inputs from `Input` into it. The main insight
behind placing the logic here is that certain information is
needed to map inputs independently of the specific type: the path
to the input directory (`root`), the store API (`storeapi`), and
the service manager instance to start the actual service. Instead
of passing all this information again and again to the `Input`
class, we now have a specialized (service) manager class for
inputs that has all the needed information at all times.
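A rough sketch of the shape this takes; the signatures and the
`map` body are assumptions, not the exact osbuild API:

```python
import os

class InputManager:
    def __init__(self, service_manager, storeapi, root):
        self.service_manager = service_manager  # starts the input services
        self.storeapi = storeapi                # store API shared by all inputs
        self.root = root                        # path to the input directory

    def map(self, ip):
        # Map a single input; this logic previously lived on `Input` and
        # needed root, storeapi and the service manager passed in each time.
        target = os.path.join(self.root, ip.name)
        service = self.service_manager.start(f"input/{ip.name}", ip.info.path)
        return service.map(self.storeapi, ip.origin, ip.refs, target, ip.options)
```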
Check for an existing checkpoint in `Pipeline.build_stages` by trying to
get the object, instead of just checking for its existence. Later, if
no checkpoints were found, i.e. `tree` is `None`, create a new object.
This avoids mixing new object creation with object access.
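As a hypothetical sketch of the pattern (`get` and `new` stand in
for the object-store calls):

```python
# Try to fetch a checkpointed object first ...
tree = object_store.get(pipeline.id)

if tree is None:
    # ... and only create a new object if no checkpoint was found.
    tree = object_store.new(pipeline.id)
```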
Instead of iterating over the stages via indices, iterate over the
stages directly. To be able to do so, collect the stages that need
to be built in a deque and then drain it from the other end.
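A sketch of the pattern with `collections.deque`; `stages`,
`store.contains` and `build` are stand-ins for the real code:

```python
from collections import deque

todo = deque()
for stage in reversed(stages):
    if store.contains(stage.id):
        break           # this stage and everything before it is cached
    todo.append(stage)  # collected newest-first

while todo:
    build(todo.pop())   # drained from the other end: oldest-first
```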
Also invoke `monitor.finish` when the pipeline failed to build.
There is no reason not to invoke it in that case, and doing so
will also allow us to print some information in the monitor.
Since neither a build tree nor the actual tree is returned from
`build_stages`, the short-circuit code that checks whether the
tree is already present in the store can be moved before the
build-tree retrieval. As a result, the short-circuit check in
`Pipeline.run` is now redundant. It was there to make sure that
if we have the tree associated with a pipeline, its build
pipeline would also not be needed. With the short-circuit now
happening before the access of the build pipeline in
`build_stages`, this is ensured.
In the previous data model the build pipelines were nested inside
the pipeline and thus we would recurse in `build_stages`. The
tree that was built was returned and potentially became the build
tree for the pipeline that invoked `build_stages`. In the new
model of a directed acyclic graph of pipelines, the build tree can
be any previously built pipeline and we just get it via the store,
which now keeps track of all previously built pipelines even if
they are not committed to it. Thus there is no need to return
the trees from `build_stages` anymore.
Adjust the code that does the short-circuit check to use
`ObjectStore.contains` instead of `ObjectStore.get`, since we
do not need the object anymore.
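Schematically (a sketch, not the exact call sites):

```python
# before: fetch the object only to test for its existence
if store.get(self.id):
    return

# after: a pure existence check, no object retrieval
if store.contains(self.id):
    return
```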
The pipeline data model used to have an assembler optionally
associated with the pipeline; therefore we had to return the
build tree used to build the stages, since the same build
tree also needed to be used from the assembler. In the "new"
model (first introduced in version 27), the assembler got
replaced by another "normal" pipeline. Since then, there is
no need to return the build tree anymore. Remove it.
Instead of serializing the `BuildResult` to a dict in
`build_stages`, we keep the object and only serialize it in the
corresponding formatting code. This doubles down on the separation
between the internal data structures and their external
representation. This was already partially done in the v2 format,
which hand-picked which elements of the BuildResult to return for
each stage.
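A sketch of what such a formatting boundary looks like; the field
names are illustrative:

```python
def describe_stage(result):
    # `result` is a BuildResult, kept as an object internally;
    # serialize only here, picking the publicly exposed fields.
    return {
        "id": result.id,
        "type": result.name,
        "success": result.success,
        "output": result.output,
    }
```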
Remove the stage options from the `BuildResult` object. They were
only serialized in the case of version 1 and not actually used by
Composer for anything. Use of v1 manifests should be very limited
now anyway.
Show the stage name (if one is set) when failing the stage in the
validator. This closes #1007; example output:
```
$ python3 -m osbuild supakeen-os.json
supakeen-os.json has errors:
pipelines[0].stages[0]
could not find schema information for 'org.osbuild.rpmb'
.pipelines[0].stages[0].inputs.packages:
could not find schema information for 'org.osbuild.filesz'
```
Before, the download method was defined in the inherited class of
each program, with the same kind of workflow redefined every time.
This contribution aims at making the workflow clearer and at
generalizing what can live in the SourceService class.
The download workflow is as follows:
Setup -> Filter -> Prepare -> Download
The setup step mainly sets up the caches where the downloaded data
will be stored in the end.
The filter step is used to discard some of the items to download
based on some criterion. By default, it checks whether an item is
already in the cache using the item's checksum.
The prepare step iterates over the elements and gives the
overriding class the ability to alter each item before downloading
it. This is mainly used by the curl source, which for RHEL must
generate the subscriptions.
The download step then calls `fetch_one` for each item. Here the
download can be performed sequentially or in parallel, depending
on the number of workers selected.
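A condensed sketch of the generalized base class; the method names
follow the steps above, but the exact signatures and the helpers
(`exists`, `workers`) are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

class SourceService:
    content_type = None  # set by subclasses, see below

    def process(self, items):
        cache = self.setup()                      # Setup: prepare the cache
        todo = [i for i in items
                if not self.exists(i, cache)]     # Filter: skip cached items
        todo = [self.prepare(i) for i in todo]    # Prepare: per-item tweaks
        with ThreadPoolExecutor(self.workers) as pool:
            list(pool.map(self.fetch_one, todo))  # Download: 1..n workers
```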
Introduce a new class member `content_type` that specifies what type of
items the source will store in the cache. Use that to generalize the
setup step, which is shared across all sources.
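For example, a hypothetical subclass would then only declare where
its items live in the cache (the value shown is illustrative):

```python
class CurlSource(SourceService):
    # Items fetched by this source are cached under this content type.
    content_type = "org.osbuild.files"
```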
Silence pylint warning W0201 (attribute-defined-outside-init) in
`set_status`; it sets dynamic attributes on the LoopInfo class
which pylint does not recognize.
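The suppression looks roughly like this; the signature and the
attribute name are illustrative:

```python
def set_status(self, status):
    # pylint: disable=attribute-defined-outside-init
    # The attributes on LoopInfo are created dynamically, which
    # pylint cannot see, hence the local W0201 suppression.
    self.status = status
```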
Add a new class `SubIdsDB` as a database of subordinate IDs, like
the ones in `/etc/subuid` and `/etc/subgid`. Methods to read data
from and write data to these two files are provided.
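A minimal sketch of the read side; entries in `/etc/subuid` have
the form `name:start:count`, one per line (the method names here
are illustrative):

```python
class SubIdsDB:
    def __init__(self):
        self.db = {}

    def read(self, path: str) -> None:
        with open(path, encoding="utf8") as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                name, start, count = line.split(":")
                self.db[name] = (int(start), int(count))
```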
Add corresponding unit tests.
Drop `CAP_MAC_ADMIN`, which is needed to write and read(!) unknown
SELinux labels, from the default capabilities. Adjust the stages
that need to read or write SELinux labels accordingly.
Drop CAP_{NET_ADMIN,SYS_PTRACE} from the default capabilities;
they are only needed to run bwrap from inside a stage, which is
done by the `ostree.commit` and `ostree.preptree` stages, so
retain them directly there.
Add new stage metadata `CAPABILITIES` where stages can request
additional capabilities that are not in the default set.
Currently this is not used by any stage since the default set
contains the sum of all needed capabilities.
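A hypothetical example of what such a request could look like in a
stage module:

```python
# Stage metadata: extra capabilities this stage needs on top of the
# default set (illustrative; no stage currently uses this).
CAPABILITIES = ["CAP_MAC_ADMIN"]
```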
Drop all capabilities that are not required by any of the stages.
N.B. at least one stage (`ostree.preptree`) executes bwrap itself,
which in turn needs `CAP_SYS_PTRACE` and `CAP_NET_ADMIN`.
Add a new member variable `caps` that, if not `None`, indicates
the capabilities to retain; all capabilities not specified will
be dropped via `bubblewrap` (`--cap-drop`).
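A sketch of the resulting bubblewrap arguments (the argv assembly
is assumed):

```python
if self.caps is not None:
    argv += ["--cap-drop", "ALL"]   # drop everything first ...
    for cap in sorted(self.caps):
        argv += ["--cap-add", cap]  # ... then retain only these
```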
Add corresponding tests.
This extends the possible ways of passing references to inputs.
The ways currently possible are:
1) "plain references", an array of strings:
["ref1", "ref2", ...]
2) "object references", a mapping of keys to objects:
{"ref1": { <options> }, "ref2": { <options> }, ...}
This patch adds a new way:
3) "array of object references":
[{"id": "ref1", "options": { ... }}, {"id": ... }, ]
While osbuild promises to preserve the order of "object
references", not all JSON serialization libraries preserve it,
since the JSON specification leaves this up to the implementation.
The new "array of object references" thus allows specifying the
references together with reference-specific options, in a
guaranteed order.
Additionally this paves the way for specifying the same input twice,
e.g. in the case of the `org.osbuild.files` input where a pipeline
could then be specified twice with different files. This needs core
rework though, since internally we use dictionaries right now.
This is a left-over from the time when `systemd-nspawn` was used,
which retained only a limited set of capabilities that did not
include `CAP_MAC_ADMIN`[1]. Bubblewrap, on the other hand, retains
all current capabilities if the process is run as root[2].
[1] see e.g. src/nspawn/nspawn.c#L147 of commit c52950c
[2] see commit abc56644566a6095bb72a5bf70fcee7dd90e9447