This reverts commit bc04bfc366.
The `remove_tmpfiles()` helper is nice but it is also problematic
because it creates extra output after the command was run and
created output. E.g. a test failure on centos stream9 [0]
```
r = root.run(["stat", "--format=%a", "/var/tmp"], monitor)
assert r.returncode == 0
> assert r.stdout.strip().split("\n")[-1] == "1777"
E AssertionError: assert '/usr/lib/tmp... such process' == '1777'
E
E - 1777
E + /usr/lib/tmpfiles.d/rpcbind.conf:2: Failed to resolve user 'rpc': No such process
```
Here the output from "stat" is not the last output because the
rempve_tmpfiles runs `systemd-tmpfiles --clean --remove` which
produces some noisy output after stat was run.
This was found by @thozza (thanks!) and discussed in osbuild PR#1785.
There are various ways to fix this, the one is to use the
`--graceful` option of systemd-tmpfiles. However that only got added in
systemd v256 and centos-stream9 has v252 so that is sadly not an option.
Plus even when avaialble it will produce some informational output like
```
All rules containing unresolvable specifiers will be skipped.
```
Another way would be to sent the output from systemd-tmpfiles cleanup
to /dev/null. Not really great as we will not know about real problems
or warnings that we should care about.
None of the option above is good. So I started looking at the tmpfiles.d
rules and the cleanup and why we are doing it. It was added relatively
recently in https://github.com/osbuild/osbuild/pull/1458 and after
some medidiation not having it seems to do no harm (details below). The
tl;dr is that the buildroot is created inside bubblewrap and the
dirs that `--clean` and `--remove` touch are already tmpdirs created
just for the buildroot so the cleanup in the runner is redundant
(and because the cleanup is now run for each buidlroot.run() command
there *might* be unintended conequences but the current rules seem
to not have any).
In detail, the tmpfiles_cleanup() does two things:
1. `--clean`
It will remove files that are older then the given age
in tmpfiles.d. The tmpfiles in centos9 give me the following ages:
```
$ systemd-tmpfiles --cat-config|grep -E '[0-9]+d$'
d /var/lib/systemd/pstore 0755 root root 14d
d /var/lib/systemd/coredump 0755 root root 3d
q /tmp 1777 root root 10d
q /var/tmp 1777 root root 30d
D! /tmp/.X11-unix 1777 root root 10d
D! /tmp/.ICE-unix 1777 root root 10d
D! /tmp/.XIM-unix 1777 root root 10d
D! /tmp/.font-unix 1777 root root 10d
```
Given that we run our commands inside a bubblewrap environment and
give it a fresh /run, /tmp, /var [1] there really should be no long
lived things and even if there are they are cleaned up from the
buildroot itself
2. `--remove`
It will remove files marked for removal in tmpdfiles.d. Running
it on a centos9 env it yields for me:
```
$ systemd-tmpfiles --cat-config|grep -E '^[rRD]'
R /var/tmp/dnf*/locks/*
r /var/cache/dnf/download_lock.pid
r /var/cache/dnf/metadata_lock.pid
r /var/lib/dnf/rpmdb_lock.pid
r /var/log/log_lock.pid
r! /forcefsck
r! /fastboot
r! /forcequotacheck
D! /var/lib/containers/storage/tmp 0700 root root
D! /run/podman 0700 root root
D! /var/lib/cni/networks
R! /var/tmp/container_images*
D /run/rpcbind 0700 rpc rpc - -
D /run/sudo/ts 0700 root root
R! /tmp/systemd-private-*
R! /var/tmp/systemd-private-*
r! /var/lib/systemd/coredump/.#*
D! /tmp/.X11-unix 1777 root root 10d
D! /tmp/.ICE-unix 1777 root root 10d
D! /tmp/.XIM-unix 1777 root root 10d
D! /tmp/.font-unix 1777 root root 10d
r! /tmp/.X[0-9]*-lock
```
which is also covered by the bwrap cleanup.
[0] https://artifacts.dev.testing-farm.io/2d07b8f3-5f52-4e61-b1fa-5328a0ff1058/#artifacts-/plans/unit-tests
[1] https://github.com/osbuild/osbuild/blob/main/osbuild/buildroot.py#L218
autosd is a CentOS Stream 9 derivate. User reported:
"ValueError: No suitable runner for org.osbuild.autosd"
in Automotive SIG community Matrix. We are going through some name
changes at the moment.
This fixes an issue where Fedora-38 hosts can not build CentOS-Stream-9
images due to an incompatible gpg key with the new default settings for
rpm.
On Fedora-38, rpm has changed to use a new backend for key verification
and by default does not support SHA1 anymore, although the support for
SHA1 can be re-enabled via a config file. The (current) CentOS-Stream-9
keys however still require SHA1 support in order to be importable. So
they are now unusable on Fedora-38 unless SHA1 support is re-enabled.
In OSBuild, the initial chroot does not contain the config files and so
SHA1 support is disabled when rpmkeys from the host is called. It does
not matter if the crypto-policies on the host machine is configured with
the exception to support SHA1 because the chroot filters that out. This
means it may not be possible to assemble CentOS-Stream-9 based images
without disabling the key check.
This patch adds an explicit conditional case for Fedora-38 to inject the
needed configuration file into /etc/crypto-policies/back-ends to enable
SHA1 support for rpm by default. It does this by copying the default
policies from /usr/share/crypto-policies. The result is OSBuild behaving
similar to the previous behaviour seen on Fedora-37 and earlier.
Now that we can automatically detect the best available runner for
a requested one, we don't need to maintain the link farm with the
explicit mapping anymore.
Create a runner for RHEL7. The one thing to note is that RHEL 7
makes use of ld.so.confd snippets and one important for us is
to include `/usr/lib64/iscsi` needed by qemu-img. Otherwise this
is a fairly simple and straight forward runner.
Actually, rename the rhel90 runner to the centos9 runner, and
make the former a link to the latter, since in RHEL 9, CentOS
is the upstream and RHEL the downstream.
Work around a bug on aarch64[1] where `qemu-img` would hang
about a third of the time when converting images. To be able
to activate the work-around based on the environment, i.e.
only on certain distributions, introduce an environment
variable, `OSBUILD_QEMU_IMG_COROUTINES`, that is set in the
runner and then picked up in the assembler.
[1] https://bugs.launchpad.net/qemu/+bug/1805256
This runner is used by both CentOS 8 and CentOS Stream. CentOS is kinda weird
because it specifies only number 8 in VERSION_ID in /etc/os-release unlike
RHEL. Also the ID is the same for CentOS 8 and CentOS Stream.
This should work fine for now though:
CentOS 8 is currently based on RHEL 8.3 and CentOS Stream on devel version of
RHEL 8.4. For both RHEL 8.3 and 8.4 we use the RHEL 8.2 runner so it should be
safe to assume that it's OK to base the CentOS 8 runner also on the RHEL 8.2
one.
We might need to tweak this at some point but I suggest dealing with it when
that time comes.
Wrap all calls to the various setup functions in the new exception
handler provided by `osbuild.api`. This will make sure that any
exception is properly printed to stderr, as well as communicated
to osbuild in a structured and machine readable way.
Now that the BuildRoot is capable of capturing the output of the
runner and modules (stages, assemblers), there is no need for
using `api.setup_stdio`. Therefore, drop it from all runners and
replace `api.output` with `BuildRoot.output`, which will contain
the output if `api.setup_stdio` is not called from the runners.
Runner are invoked to prepare the execution of stages and assemblers
inside the container. The setup tasks are specific to the distribution
and maybe the version of it, therefore specific runners are used for
each distribution+version combination.
The build the first (most nested) build root, `/usr` is taken from the
host to bootstrap the container. On RHEL, the python interpreter to be
used for software that belongs to the platform is platform-python, as
it provides a stable API. Therefore the RHEL runners should use that
instead of relying on the presence of /usr/bin/python3.6, which might
not be installed and is indeed not installed by default.
Drop the `osbuild -> ../osbuild` symlink from all module directories.
We now properly initialize the PYTHONPATH to provide the imported
osbuild module from the host environment. Therefore, these links are no
longer needed.
The sources run from the host environment, so they should just pick them
up from the environment the same way osbuild itself does.
We want to run stages and other scripts inside of the nspawn containers
we use to build pipelines. Since our pipelines are meant to be
self-contained, this should imply that the build-root must have osbuild
installed. However, this has not been the case so far for several
reasons including:
1. OSBuild is not packaged for all the build-roots we want to support
and thus we have the chicken-and-egg problem.
2. During testing and development, we want to support using a local
`libdir`.
3. We already provide an API to the container. Importing scripts from
the outside just makes this API bigger, but does not change the
fact that build-roots are not self-contained. Same is true for the
running kernel, and probably much more..
With all this in mind, our strategy probably still is to eventually
package osbuild for the build-root. This would significantly reduce our
API exposure, points-of-failure, and host-reliance. However, this switch
might still be some weeks out.
With this in mind, though, we can expect the ideal setup to have a full
osbuild available in the build-root. Hence, any script we import so far
should be able to access the entire `libdir`. This commit unifies the
libdir handling by installing the symlinks into `libdir` and providing
a single bind-mount of the module-path into `libdir`.
We can always decide to scratch that in the future when we scratch the
libdir-import from the host-root. Until then, I believe this commit
nicely unifies the way we import the module both in a local checkout as
well as in the container.
Now that stages no longer access the network, drop CA certificate
setup.
In the future, we may want to restrict all network access to the
container, but that requires more work.
Signed-off-by: Tom Gundersen <teg@jklm.no>
This adds a new runner for Arch Linux. For now this simply links to the
blank linux runner, which works perfectly fine to bootstrap more
complex build pipelines.
Note that if we ever end up with more complex pipelines native to Arch
Linux, we might have to update this runner as well, since even on Arch
/etc must be pre-populated. Regardless, the blank linux runner serves
as a nice base and allows us to easily bootstrap osbuild on foreign
distros.
Now with `os-release` using `linux` as default ID+VERSION string, we
have a proper fallback name for our blank runner. Rename the blank
runner to `org.osbuild.linux`. It now serves as default fallback for
anything not further specified.
This adds a new runner called `org.osbuild.blank`, which assumes /usr
is pre-populated and ready to go. It does not perform any OS setup. It
only initializes the environment and executes the stage.
This runner allows easy bootstrapping of new systems. It assumes our
ideal setup where `/usr` describes a host system in its entirety,
without any local policy applied. Thus, this runner is also what we
ultimately want to work towards as a default. This might not happen
anytime soon, though, given how `passwd`, `ldconfig`, `nss`, etc. still
depend on prepopulated caches in `/etc`.