Commit graph

240 commits

Author SHA1 Message Date
Christian Kellner
72e00f3f2b pipeline: pass meta data to stages & assemblers
Pass a new `meta` object to the stages and assemblers that for now
only contains the `id` of the corresponding stage or assembler.
2020-06-10 15:08:49 +02:00
Christian Kellner
a419ee9038 buildroot: grant CAP_MAC_ADMIN for labeling
When applying labels inside the container that are unknown to the
host, the process needs to have the CAP_MAC_ADMIN capability in order
to do so, otherwise the kernel will prevent setting those unknown
labels. See the previous commit for more details.
2020-06-10 01:35:05 +02:00
Christian Kellner
696219dab9 util/ostree: accept typing.List for List[str]
In python 3.6 the value of `__origin__` for typing.List[str] is
typing.List. This then changed to the actual `list` type in later
versions. Accept both versions.
2020-06-09 13:42:35 +02:00
Christian Kellner
52f33d56b7 ostree: add 'initramfs-args' option to Treefile
Add the initramfs-args Treefile option that can be used to pass
arguments to drauct via rpm-ostree. NB: the ostree module will
always be automatically be included by rpm-ostree.
2020-06-04 10:25:39 +02:00
Christian Kellner
5891beab4e meta: also validate the schema for sources
When validating the manifest, now also validate the schema for
the supplied sources.
2020-06-02 09:50:14 +02:00
Christian Kellner
bdae02a6b5 meta: ModuleInfo support for Sources
Add support for querying information about sources: add the mapping
from name to directory and accept "Source" as a module name. Adapt
the ModuleInfo schema property to handle the different styles for
stage-like schemata as well as sources now.
2020-06-02 09:50:14 +02:00
David Rheinsberg
fe6e58aa12 pipeline: drop redundant default arg value
Drop the default argument value for `output_directory`, but use
type-annotations to make clear it can be optional.
2020-05-29 11:07:29 +02:00
David Rheinsberg
13c0dec8ee util/jsoncomm: simplify condition
This reduces `if fds && len(fds) > 0:` to `if fds:`. In python, empty
collections are considered false, so the additional check is not needed.
2020-05-29 11:07:29 +02:00
Christian Kellner
2a9cdde5ec osbuild: refactor stage information
For all currently supported modules, i.e. stages and assemblers,
convert the STAGE_DESC and STAGE_INFO into a proper doc-string.
Rename the STAGE_OPTS into SCHEMA.
Refactor meta.ModuleInfo loading accordingly.

The script to be used for the conversion is:

  --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---

import os
import sys

import osbuild
import osbuild.meta

from osbuild.meta import ModuleInfo

def find_line(lines, start):
    for i, l in enumerate(lines):
        if l.startswith(start):
            return i
    return None

def del_block(lines, prefix):
    start = find_line(lines, prefix)
    end = find_line(lines[start:], '"""')
    print(start, end)
    del lines[start:start+end+1]

def main():
    index = osbuild.meta.Index(os.curdir)

    modules = []
    for klass in ("Stage", "Assembler"):
        mods = index.list_modules_for_class(klass)
        modules += [(klass, module) for module in mods]

    for m in modules:
        print(m)
        klass, name = m
        info = ModuleInfo.load(os.curdir, klass, name)

        module_path = ModuleInfo.module_class_to_directory(klass)
        path = os.path.join(os.curdir, module_path, name)
        with open(path, "r") as f:
            data = list(f.readlines())

            i = find_line(data, "STAGE_DESC")
            print(i)
            del data[i]

            del_block(data, "STAGE_INFO")

            i = find_line(data, "STAGE_OPTS")
            data[i] = 'SCHEMA = """\n'

        docstr = '"""\n' + info.desc + "\n" + info.info + '"""\n'
        doclst = docstr.split("\n")
        doclst = [l + "\n" for l in doclst]
        data = [data[0]] + doclst + data[1:]

        with open(path, "w") as f:
            f.writelines(data)

if __name__ == "__main__":
    main()
2020-05-29 08:37:47 +02:00
Christian Kellner
dd00c4f478 meta: add method to list modules of a given class
New Index.list_modules_for_class method that will list the names
of all the modules of a certain class, like 'Stage' or 'Assembler'.
2020-05-29 08:37:47 +02:00
Christian Kellner
2d5ec8edad meta: extract module class to dir mapping
Make the mapping of module class to the corresponding directory
a method of the ModuleInfo class. This is so it can be re-used
by others in the future.
2020-05-29 08:37:47 +02:00
Christian Kellner
80858a492b meta: rename StageInfo → ModuleInfo
The are converging on a nomenclature where the sum of Stages,
Assemblers, Sources (and future entities like those) together
are called 'Modules'.
Thus rename StageInfo to ModuleInfo and the corresponding
variables and methods.
2020-05-29 08:37:47 +02:00
David Rheinsberg
867adc1596 pipeline: checkpoint assemblers just like stages
Change the assembler-commit to be conditional on checkpoints, just like
we already do for stages. This means, assembler output is not
automatically committed, but only if you requested so via a checkpoint.

With this in place we can start sharing caches in osbuild-composer. The
only thing in the cache will be sources as well as checkpointed stages.
We can start checkpointing known pipelines and thus make use of the
cache. Furthermore, we can cache sources, as long as we do not fetch an
unbound set of RPMs. However, our RPM set is currently static, so this
should not be an issue. Nevertheless, it is up to Composer to decide
when to enable the cache.
2020-05-28 14:55:00 +02:00
David Rheinsberg
9c982dc147 pipeline: fix pylint-warning triggered by rebase
Fix osbuild/pipeline.py unused import. We now trigger pylint warnings
alongside pylint errors, and a PR rebase did not consider this.
2020-05-28 12:29:53 +02:00
David Rheinsberg
4c0b169881 pipeline: drop tree_id from osbuild results
We no longer need the `tree_id` in the osbuild output. All callers have
been converted to use other means. Drop the ID from the output and
avoid exposing our internals.
2020-05-28 11:16:15 +02:00
David Rheinsberg
43ddcf895d pipeline: drop output_id and pull in output-directory
Now that no caller requires the "output_id" anymore, drop it from our
results-dictionary. Instead, pass the output-directory through and copy
outputs where we produce / fetch them.

This still uses `objectstore.resolve_ref()`, since we do not have the
outputs pinned at the places where we want to copy. This needs a little
bit more rework, but we might just delay that until we have the cache
rework landed.

This already simplifies the output-directory path and drops the slight
hack which checked very late for produced outputs.

Note that we must be careful not to copy things too early, because we
do not want remnants in the output-directory if we return failure.
Hence, keep the copy-operation close to the commit-operation on the
store.
2020-05-28 11:16:15 +02:00
David Rheinsberg
18b16acd3f pipeline: drop redundant shortcut
All callsites of `Pipeline.assemble()` already check early whether the
output-object exists in the store and then return it. Checking again in
`assemble()` will never catch anything (unless another stage would
happen to produce the same ID as the assembler as a side-effect).

It does seem useful to keep the shortcuts in `assemble()`, so other
callers would get the shortcut as well. However, this does not really
work well right now, since you want to skip the stage-compilation as
well, and `assemble()` is really just the last step of the job. Hence,
it really is the job of the pipeline-executor to check early.

With that in mind, lets drop this fast-path which has no effect in the
current setup.
2020-05-28 11:16:15 +02:00
David Rheinsberg
46526cf205 osbuild: avoid [] as default value
Using `[]` as default value for arguments makes `pylint` complain. The
reason is that it creates an array statically at the time the function
is parsed, rather than dynamically on invocation of the function. This
means, when you append to this array, you change the global instance and
every further invocation of that function works on this modified array.

While our use-cases are safe, this is indeed a common pitfall. Lets
avoid using this and resort to `None` instead.

This silences a lot of warnings from pylint about "dangerous use of []".
2020-05-28 11:06:05 +02:00
David Rheinsberg
14ada360bd meta: avoid static assertion
Avoid raising a static assertion, but use `raise AssertionError()`
instead. This silences a complaint from pylint about static parameters
to `assert`.
2020-05-28 11:06:05 +02:00
David Rheinsberg
dfec7aca9d buildroot: convert unused format-string
Convert the `f""` into a `""`, since no format identifier is used. This
makes pylint happier!
2020-05-28 11:06:05 +02:00
Christian Kellner
869973dc68 pipeline: drop {tree, output}_id from --inspect js
We want to get rid of `tree_id` and `output_id` because the they
are now considered internals of the store and clients should not
use them directly. NB: they are still there indirectly as the id
of the last stage and the assembler.
Also, the `output_id` was never correct here, because it was the
`tree_id` as well. Ups.
2020-05-20 18:54:56 +02:00
David Rheinsberg
9dfa0e8a61 pipeline: only copy output if there is any
Make sure to verify that the pipeline actually produced any output
before attempting to copy it out. This fixes osbuild running with
`--output-directory` but without assembler.
2020-05-20 14:44:43 +02:00
Christian Kellner
1896047bae sources: pass the library dir to the sources
The idea is that source can themselves spawn other modules, esp.
new secrets modules. For this they need to know the library dir,
aka 'libdir' throughout the osbuild source. Therefore change the
SourceServer to directly get the library directory instead of
just the sub-directory to the sources. Then pass the library
directory to via the JSON API to the source.
Adjust all usage of the SourceServer, including the tests.
2020-05-20 14:43:33 +02:00
David Rheinsberg
d4f40362ec buildroot: drop kwargs from buildroot.run()
Drop the `kwargs` forwarding from buildroot.run() to subprocess.run().
We do not use it other than for `stdin=subprocess.DEVNULL`. Set that
option directly instead.

Doing the kwargs forwarding mixes the argument namespaces and is very
hard to read. It is not clear from the call-site which argument goes to
buildroot.run() and which to subprocess.run().

Lastly, it requires us to manually fetch `check` just to make pylint
happy. Lets just drop this dance and make the API explicit.
2020-05-13 14:17:30 +02:00
Christian Kellner
016d520dda meta: use draft 4 of jsonschema to validate
We currently don't seem to use anything that requires us to use
the draft 7 of the specification. The minimum version that we
need is draft 4, which is also supported by the python-jsonschema
version in RHEL 8.2 (which is 2.6.0).
2020-05-12 22:00:38 +02:00
David Rheinsberg
bc437520cd tmpfs: drop unused module
The osbuild/tmpfs.py module is unused. Drop it.
2020-05-12 11:14:16 +02:00
David Rheinsberg
8a195d7502 util/ctx: extract suppress_oserror()
Extract the `suppress_oserror()` function from the ObjectManager and
make it available as utility for other code as well.

This also adds a bunch of tests that verify it works as expected.
2020-05-11 18:05:12 +02:00
David Rheinsberg
19c74c3e8d cli: drop --build-env argument
Drop the --build-env command-line argument. It is not used by anything.
Furthermore, our manifests now allow embedding build-environments, so
there is little reason to continue supporting this.
2020-05-07 19:52:33 +02:00
Christian Kellner
9fce523f76 main_cli: pass proper libdir to meta.Index
In case `--libdir` is not specified on the command line, and thus
`args.libdir` is `None`, pass the standard `/usr/lib/osbuild` path
to the meta.Index constructor. Otherwise no schema information can
be found.
2020-05-06 20:18:15 +02:00
Christian Kellner
1fa3b88ab1 meta: truth value of Schema includes schema check
The truthiness of the `Schema` object itself now contains the
schema validation as well, i.e. schema is only valid if schema
information is present and said information passes validation.
2020-05-06 15:42:23 +02:00
Christian Kellner
9d08f4faf2 meta: add Schema.check method to check the schema
The _validator member of `Schema` is used as an indicator whether
the provided schema is valid. The `check` method will, in case
that _validator is not set attempt to validate the schema data,
if present and set the _validator member if schema data is set and
validation has passed. On failure, i.e. missing schema information
or invalid schema data, the ValidationResult will contain the
respective error.
2020-05-06 15:42:23 +02:00
Christian Kellner
e3fa1e8e73 osbuild: add --inspect command line option
This option will print the manifest in JSON, including all the ids,
to stdout. It will not build the pipeline, but the input manifest
will be validated and if that fails the validation result will be
return in JSON.
2020-05-06 15:42:23 +02:00
Christian Kellner
4057bfe896 pipeline: description() can optionally include ids
Add a option to all description methods to include the respective
ids in the description. Defaults to False to preserve the original
output which is used in the tests.
2020-05-06 15:42:23 +02:00
Christian Kellner
35a20922f0 osbuild: validate pipeline options
Validate the options of stages and assembler of the pipeline
before running it. A validation failure will abort the run.
Errors are printed in human readable unless `--json` is passed;
For each error a human readable message together with a path
to the object with the error is given. The syntax of the path
is such it can be used via the `jq` command to select the item.
2020-05-06 15:42:23 +02:00
Christian Kellner
e77d95f4b7 osbuild: add meta module for metadata information
This new module contains utilities that help to introspect parts
that constitute the inner parts of osbuild, i.e. its stages
and assembler (which is also considered a type of stage in
this context). It contains the `StageInfo` class that can that
contains meta-information about the individual stage, such as
a short information (`info`), a longer description (`desc`) and
its JSON schema. A new Schema class represents schema data and
has a `validation` method that can be used to validate that json
data conforms to said schema.
A `Index` class can be used to obtain `StageInfo` and `Schema`
for entities identified via `klass` and `name`.
A top level `validate` method is introduced that can validate
manifest data.
Internally it uses the `jsonschema` package so add that as a
requirement and Install this dependency in the CI.
2020-05-06 15:42:23 +02:00
Christian Kellner
b1a8c465ce main_cli: Add missing newlines
PEP-8 demands it.
2020-05-06 15:42:23 +02:00
David Rheinsberg
41a4afd3ba Revert "buildroot: explicitly invoke runners via python3"
This reverts commit 33844711cd.

There are systems were our runners have no standard python3 location
available. They will fix the environment before invoking any further
utilities. Therefore, we cannot rely on `python3 foo.py` to work in our
ad-hoc containers.

This simply reverts the behavior back to using the shebang.
2020-05-05 16:57:50 +02:00
David Rheinsberg
3c56a3b7ac buildroot: insert osbuild-library into PYTHONPATH
Prepare the PYTHONPATH of the build-root container to include the path
to the osbuild library. This way, we no longer need any symlinks or
bind-mounts for the individual modules.
2020-05-04 12:32:25 +02:00
David Rheinsberg
33844711cd buildroot: explicitly invoke runners via python3
Use the python-interpreter explicitly to invoke the runners. This works
around inconsistencies between scripts imported from the host, and the
interpreter taken from a build-root.
2020-05-04 12:32:25 +02:00
David Rheinsberg
975ef7ff54 buildroot: drop unneeded DeviceAllow=
With `--keep-unit` we now run with the privileges and resources of the
caller. We no longer require external services to extend our privileges.
This also means we no longer have to configure our unit sandbox
manually, but simply rely on kernel sandboxing to do the right thing.
2020-04-29 20:31:32 +02:00
David Rheinsberg
d7adf2b3a5 osbuild: split main apart a bit
Split off the argument parser as well as the manifest parser into
helper functions. Drop the pylint hints from the main function now that
it is considerably smaller.
2020-04-28 15:39:54 +02:00
David Rheinsberg
421414ef0b osbuild: extract CLI to prepare for additional entrypoints
This extracts the CLI entrypoint into `main_cli.py` and prepares the
codebase for the introduction of additional entrypoints. This should
not contain any functional changes.

The idea behind this is to add `main_api.py` (and maybe more in the
future), which will be similar to `main_cli.py` but contain the
`osbuild-api` entrypoint. This will make all entrypoints nicely symetric
and the only difference will be `setup.py` selecting the right
entrypoint for each executable, as well as `__main__.py` selecting the
entrypoint for the module itself (which we will keep to the CLI for
compatibility).
2020-04-28 15:39:54 +02:00
David Rheinsberg
4493d057a2 osbuild: move osrelease parser into ./util
Reduce the clutter in our pipeline implementation and move the
os-release parser into ./osbuild/util/.
2020-04-28 15:39:00 +02:00
Christian Kellner
11ffe72aff buildroot: micro cleanups for some strings
Remove an unnecessary format string modifier and unify the usage
of quotation marks.
2020-04-24 15:49:03 +02:00
David Rheinsberg
2bdb287731 sources: make cleanup explicit
This changes the sources module to explicitly cleanup event-loops.
Additionally, the implementation is protected against re-entrency which
we do not support (and do not need).

We did occasionally get the following exception when running
source-servers:

        /usr/lib/python3.8/asyncio/base_events.py:654: ResourceWarning: unclosed event loop <_UnixSelectorEventLoop running=False closed=False debug=False>
          _warn(f"unclosed event loop {self!r}", ResourceWarning, source=self)
        ResourceWarning: Enable tracemalloc to get the object allocation traceback

        Exception ignored in: <function BaseEventLoop.__del__ at 0x7f92589d14c0>
        Traceback (most recent call last):
          File "/usr/lib/python3.8/asyncio/base_events.py", line 656, in __del__
            self.close()
          File "/usr/lib/python3.8/asyncio/unix_events.py", line 58, in close
            super().close()
          File "/usr/lib/python3.8/asyncio/selector_events.py", line 92, in close
            self._close_self_pipe()
          File "/usr/lib/python3.8/asyncio/selector_events.py", line 99, in _close_self_pipe
            self._remove_reader(self._ssock.fileno())
          File "/usr/lib/python3.8/asyncio/selector_events.py", line 274, in _remove_reader
            key = self._selector.get_key(fd)
          File "/usr/lib/python3.8/selectors.py", line 190, in get_key
            return mapping[fileobj]
          File "/usr/lib/python3.8/selectors.py", line 71, in __getitem__
            fd = self._selector._fileobj_lookup(fileobj)
          File "/usr/lib/python3.8/selectors.py", line 225, in _fileobj_lookup
            return _fileobj_to_fd(fileobj)
          File "/usr/lib/python3.8/selectors.py", line 42, in _fileobj_to_fd
            raise ValueError("Invalid file descriptor: {}".format(fd))
        ValueError: Invalid file descriptor: -1

This is triggered when an event-loop is not closed explicitly via
`event_loop.close()`. It then tries to cleanup explicitly. The problem
here is that python has no knowledge of in which order it should
collect GC'ed objects. This might end up more or less random. Therefore,
file-descriptors might be closed in arbitrary order, leading to the
event-loop being unable to unregister its internal objects.

I am not entirely sure whether this is the case here. However, the error
definitely triggers on the internal event-loop socketpair, which there
is no other external access to. Furthermore, this socketpair is only set
to -1 in its own __del__ function. So unless we have a memory
corruption, I see nothing else that could trigger this.

With this fix in place, I can run `test_sources.py` in a loop without
triggering the bug.

It is quite likely that our other `*Server` classes need the same fix. I
did not verify, yet.
2020-04-24 11:48:16 +02:00
David Rheinsberg
f12c57c1fd buildroot: reduce nspawn requirements further
This adds one more flags to `systemd-nspawn`:

    --keep-unit
        This prevents nspawn from creating its own scope unit and
        instead uses the scope of the caller. Since we want nspawn to
        run with the privileges of the caller, this is fitting for our
        use case.
        Furthermore, this makes nspawn work without a running system
        bus, since it no longer needs to talk to systemd pid1.

        (introduced with systemd-v209)

With this in place, osbuild can be run from within docker containers (or
other containers without systemd as pid1). This still requires some
extra setup, but this can all be done in the container manager.
2020-04-22 18:27:10 +02:00
David Rheinsberg
e2aa7d8128 util: mark as module
Make sure ./osbuild/util/ is considered a python module. Add
__init__.py to declare it as such and include it in the module-list of
setup.py.
2020-04-21 17:00:04 +02:00
David Rheinsberg
2624be92dc osbuild: cleanup contextlib usage
Two cleanups for the context-managers we use:

  * Use `contextlib.AbstractContextManager` if possible. This class
    simply provides a default `__enter__` implementation which just
    returns `self`. So use it where applicable.
    Additionally, it provides an abstract `__exit__` method and thus
    allows static checks for an existance of `__exit__` in the dependent
    class. We might use that everywhere, but this is a separate
    decision, so not included here.

  * Explicitly return `None` from `__exit__`. The python docs state:

        If an exception is supplied, and the method wishes to suppress
        the exception (i.e., prevent it from being propagated), it
        should return a true value. Otherwise, the exception will be
        processed normally upon exit from this method.

    That is, unless we want exceptions to be suppressed, we should
    never return a `truthy` value. The python contextlib suggest using
    `None` as a default return value, so lets just do that.

    In particular, the explicit `return exc_type is None` that we use
    has no effect at all, since it only returns `True` if no exception
    was raised.

    This commit cleans this up and just follows what the `contextlib`
    module does and returns None everywhere (well, it returns nothing
    which apparently is the same as returning `None` in python). It is
    unlikely that we ever want to suppress any exceptions, anyway.
2020-04-21 16:02:20 +02:00
David Rheinsberg
1cdf2be0ac util/rmrf: use immutable helpers
Make use of the new immutable-flag ioctl helpers. While at it, move the
`chmod` to `fchmod` and re-use the open file-descriptor. Document the
behavior and move the `fchmod` into its own try-block for the same
reasons as the `ioctl` call: We rely on the following unlink() to catch
any errors. Errors in the fixperms() step are non-consequential.
2020-04-21 14:46:02 +02:00
David Rheinsberg
e5ff287f5a util/linux: add explicit FS_IMMUTABLE_FL helpers
The FS_IOC_{GET,SET}FLAGS ioctl numbers are not stable across different
architectures. Most of them use the asm-generic versions, but ALPHA and
SPARC in particular use completely different IOC number setups (see the
definition of _IOC, _IOR, _IOW, etc. in the kernel).

This commit moves the helpers for `FS_IMMUTABLE_FL` into
`osbuild/util/` and adds explicit tests. This will make sure that we
catch any ioctl mismatches as soon as possible when we run the osbuild
test-suite on other architectures. Until then, we will have to live with
this mismatch.
2020-04-21 14:46:02 +02:00