Similar to the `stage_module` fixture for stage tests, this adds
a fixture to test the sources modules of osbuild.
The code from `stage_module` and `sources_module` is similar and
could be combined, but pytest makes it hard to do this without
having a shared root dir. Given that it's just four lines, it
seems easier to just live with the tiny bit of code duplication.
This source checks for the existence of a local container in the host's
containers-storage. The source first reads the host's
`/etc/containers/storage.conf` file for the storage config and then
checks if the user has imported the desired container into the local
store.
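For illustration, checking whether an image ID is present in the host's
local storage could look roughly like this (a sketch, not the actual
source code; the helper name is made up):

import subprocess

def local_image_exists(image_id: str) -> bool:
    # hypothetical helper: let skopeo consult the host's containers-storage
    # (configured via /etc/containers/storage.conf) and report whether the
    # image ID resolves there
    res = subprocess.run(
        ["skopeo", "inspect", "--raw", f"containers-storage:{image_id}"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        check=False,
    )
    return res.returncode == 0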
Unlike the org.osbuild.containers source, the
org.osbuild.containers-storage source doesn't need any extra data other
than the image ID. The ID is all that is used to retrieve the
container. The location and other information regarding the storage are
read from the host configuration and are not encoded in the manifest.
There's no need to use the name to resolve it like we do in other
sources because containers in the local storage can be directly
referenced by their image id (config digest).
Other data such as the name of the container will only be relevant in
the stage that will use the container as input.
The source items are objects instead of simple strings of checksums
because we might, in the future, want to add specific options for each
source.
The content_type for this source is `containers-storage`, which defines
the location in the store where the source will bind mount the host's
container storage for stages to read. We make this different from the
`containers` content type because it will be handled differently enough
to warrant a separate input type.
Co-authored-by: Gianluca Zuccarelli <gzuccare@redhat.com>
Co-Authored-By: Michael Vogt <michael.vogt@gmail.com>
For local images copied from an image store other than the default, we
need to be able to specify the `storage-location`. This commit enables
this functionality.
Jira: https://issues.redhat.com/browse/HMS-3235
Update the skopeo source to check other container transports [1]
for downloading container images. This commit adds a transport for
`containers-storage` in addition to the existing `docker://`
transport.
Jira: https://issues.redhat.com/browse/HMS-3235
[1] CONTAINERS-TRANSPORTS(5)
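As a rough sketch of the idea (not the actual implementation), the
reference handed to skopeo differs only in its transport prefix:

def source_reference(image: str, transport: str = "docker") -> str:
    # illustrative only: pick the transport prefix for the reference
    # that skopeo will be asked to copy from
    if transport == "containers-storage":
        return f"containers-storage:{image}"
    return f"docker://{image}"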
A new source module that can download a multi-image manifest list from a
container registry. This module is very similar to the skopeo source,
but instead downloads a manifest list with `--multi-arch=index-only`.
The checksum of the source object must be the digest of the manifest
list that will be stored, and the downloaded manifest must be a
manifest list.
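Conceptually the fetch boils down to something like the following
(the image reference and destination path are placeholders):

import subprocess

# a sketch: copy only the manifest list (index) into a dir: destination
subprocess.run(
    ["skopeo", "copy", "--multi-arch=index-only",
     "docker://registry.example.org/some/image@sha256:<digest>",
     "dir:/tmp/manifest-list"],
    check=True,
)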
Change the local storage format for containers to the `dir` format.
The `dir` format will be used to retain signatures and manifests.
The remove-signatures option is removed since the storage format now
supports them.
The final move (os.rename()) at the end of the fetch_one() method now
creates the checksum directory if it doesn't exist and moves the child
archive into it, adding to any existing archives that might exist in
other formats (from a previous version downloading a `docker-archive`).
Dropped the .tar suffix from the symlink in the skopeo stage since it's
not necessary and the target of the link might be a directory now.
The parent class exists() method checks if there is a *file* in the
sources cache that matches the checksum. For containers, this used to
be a file called container-image.tar under a directory that matches the
checksum, so for containers it always returned False. Added an override
for the skopeo source that checks for the new directory archive.
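The override is conceptually along these lines (a sketch; the real
method and its signature may differ):

import os

def container_archive_exists(cache: str, checksum: str) -> bool:
    # hypothetical helper: the container is now stored as a `dir` archive
    # in a directory named after the checksum, so check for a directory,
    # not a file
    return os.path.isdir(os.path.join(cache, checksum))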
The consumer certs are used to uniquely identify a system against
candlepin. These consumer certs can be used to identify the system when
pulling from RH controlled ostree repositories.
Instead of downloading the image directly to the temporary directory
and then moving that temporary directory into the cache, use one more
intermediate directory and move that into the cache. The reason is
that on Python 3.6, moving the temporary directory itself away will
make Python crash like this:
Python 3.6.8 (default, Sep 9 2021, 07:49:02)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> with tempfile.TemporaryDirectory(prefix="tmp-download-") as tmpdir:
... import os
... os.rename(tmpdir, "/tmp/foo")
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "/usr/lib64/python3.6/tempfile.py", line 809, in __exit__
self.cleanup()
File "/usr/lib64/python3.6/tempfile.py", line 813, in cleanup
_shutil.rmtree(self.name)
File "/usr/lib64/python3.6/shutil.py", line 477, in rmtree
onerror(os.lstat, path, sys.exc_info())
File "/usr/lib64/python3.6/shutil.py", line 475, in rmtree
orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp-download-adl86mwa'
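The workaround therefore looks roughly like this (a minimal sketch with
a made-up destination path):

import os
import tempfile

with tempfile.TemporaryDirectory(prefix="tmp-download-") as tmpdir:
    # download into a child directory and move only that child into the
    # cache, so cleanup() still finds the temporary directory it created
    staging = os.path.join(tmpdir, "image")
    os.makedirs(staging)
    os.rename(staging, "/tmp/foo")  # hypothetical destination in the cache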
Use `subprocess.check_output` instead of `run(..., capture_output=True)`
since the latter only got added in Python 3.7 and our codebase needs to
be compatible with 3.6 due to RHEL 8.x.
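For example (hypothetical command, shown only to illustrate the
compatibility issue):

import subprocess

# Python >= 3.7 only:
#   out = subprocess.run(["ostree", "--version"], capture_output=True).stdout
# Python 3.6 compatible equivalent:
out = subprocess.check_output(["ostree", "--version"])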
Add support for the `--insecure` curl flag, which makes curl skip the
verification step when making secure connections (e.g., https://).
This allows osbuild to download files from servers configured with
SSL/TLS but whose certificate cannot be validated.
This is supported for configuring repository sources in
osbuild-composer.
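A sketch of how the flag might be passed through (the helper and the
way the option is expressed on the item are assumptions here):

def curl_command(url: str, dest: str, insecure: bool = False) -> list:
    # hypothetical helper: build the curl argument list and skip
    # certificate verification only when explicitly requested
    args = ["curl", "--silent", "--show-error", "--output", dest, url]
    if insecure:
        args.append("--insecure")
    return args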
The generic way of checking if an object is in the cache does not apply
to ostree, as the internal structure of a repo is quite specific. Thus
we need to use the ostree executable to ask it to explore its repo for
us.
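In practice that boils down to something like the following sketch (the
exact ostree command used by the source may differ):

import subprocess

def commit_in_repo(repo: str, commit_id: str) -> bool:
    # ask ostree itself whether the commit exists in the repo, instead of
    # guessing based on the repo's internal layout
    res = subprocess.run(
        ["ostree", "show", f"--repo={repo}", commit_id],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        check=False,
    )
    return res.returncode == 0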
Before, the download method was defined in the inheriting class of each
program, with the same kind of workflow redefined every time. This
contribution aims at making the workflow clearer and at generalizing
what can be generalized in the SourceService class.
The download workflow is as follows:
Setup -> Filter -> Prepare -> Download
The setup step mainly sets up the caches where the downloaded data will
be stored in the end.
The filter step is used to discard some of the items to download based
on some criterion. By default, it checks whether an item is already in
the cache using the item's checksum.
The prepare step goes over each element and gives the overriding class
the ability to alter each item before downloading it. This is mainly
used for the curl source, which for RHEL must generate the
subscriptions.
Then the download step calls fetch_one for each item. Here the download
can be performed sequentially or in parallel depending on the number of
workers selected.
Introduce a new class member `content_type` that specifies what type of
items the source will store in the cache. Use that to generalize the
setup step, which is shared across all sources.
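A simplified sketch of the generalized workflow and the `content_type`
member described above (method names and details are illustrative, not
the exact osbuild API):

import os

class SourceService:
    content_type = "org.osbuild.files"  # hypothetical value, set per source

    def __init__(self, cache_root):
        self.cache_root = cache_root
        self.cache = None

    def setup(self):
        # Setup: create the per-content-type cache directory
        self.cache = os.path.join(self.cache_root, self.content_type)
        os.makedirs(self.cache, exist_ok=True)

    def exists(self, checksum, _desc):
        # Filter default: an item is cached if a file named after its
        # checksum is already present
        return os.path.isfile(os.path.join(self.cache, checksum))

    def prepare(self, items):
        # Prepare: subclasses may alter items before the download,
        # e.g. attach subscription secrets for the curl source
        return items

    def fetch_one(self, checksum, desc):
        raise NotImplementedError

    def download(self, items):
        self.setup()
        todo = {c: d for c, d in items.items() if not self.exists(c, d)}
        for checksum, desc in self.prepare(todo).items():
            self.fetch_one(checksum, desc)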
Some RPMs might be very large, and limiting the total download time
might lead to failed builds even in cases where the download is making
progress. Instead, set a minimum download speed (1kbps). If the
minimum is not surpassed for 30 seconds in a row, the download fails
and is retried. This follows the logic employed by DNF.
Adjust the number of retries to 10 and the connection timeout to 30,
in order to match what DNF does. One difference is that DNF does 10
retries across all downloads, whereas we do it per download; this
could be changed in a follow-up.
Old:
- a download taking more than 5 minutes is unconditionally aborted
New:
- slow but working downloads will never be aborted
- downloads will be stalled for at most five minutes
in total before being aborted
- time spent making progress does not count towards
the five minutes
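For illustration, the resulting curl arguments are roughly as follows
(the flags are real curl options; the exact values are assumptions
based on the description above):

# illustrative only; values are assumptions derived from the text above
curl_args = [
    "curl",
    "--speed-limit", "1000",    # minimum speed (~1 kB/s, DNF-like minrate)
    "--speed-time", "30",       # abort if below the minimum for 30 seconds
    "--retry", "10",            # up to 10 retries, per download
    "--connect-timeout", "30",  # connection timeout
]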
Signed-off-by: Tom Gundersen <teg@jklm.no>
This adds a stage called org.osbuild.skopeo that installs docker and
oci archive files into the container storage of the tree being
constructed.
The source can either be a file from another pipeline, for example one
created with the existing org.osbuild.oci-archive stage, or it can
use the new org.osbuild.skopeo source and org.osbuild.containers
input, which will download an image from a registry and install that.
There is an option in the install stage that lets you
configure a custom storage location, which allows the use of the
additionalimagestores option in the containers storage.conf
to use read-only image stores (instead of /var/lib/containers).
Note: skopeo fails to start if /etc/containers/policy.json is
not available, so we bind mount it from the build tree to the
buildroot if available.
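Roughly, the copy into the (optionally custom) storage looks like this
(a sketch under assumptions; the helper and option names are made up,
the transport syntax follows containers-transports(5)):

import subprocess

def install_container(archive: str, name: str, storage_location: str = None):
    # hypothetical helper: copy an oci-archive into containers-storage,
    # optionally into a custom storage location
    dest = "containers-storage:"
    if storage_location:
        dest += f"[{storage_location}]"
    dest += name
    subprocess.run(["skopeo", "copy", f"oci-archive:{archive}", dest],
                   check=True)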
Port sources to also use the host services infrastructure that is
used by inputs, devices and mounts. Sources are a bit different
from the other services in that they don't run for the duration of
the stage but are run before anything is built. By using the same
infrastructure we re-use the process management and inter-process
communication. Additionally, this will forward all messages from
sources to the existing monitoring framework.
Adapt all existing sources and tests.
This moves the check for already downloaded files earlier so
that if all files are already downloaded we don't need to
load the secrets.
This is faster, but it also allows running a manifest with a pre-seeded
object store on a system (like a VM) that isn't subscribed.
The previous version covered too few use cases, more specifically only a
single subscription. That is of course not the case for many hosts, so
osbuild needs to understand multiple subscriptions.
When running the org.osbuild.curl source, read the
/etc/yum.repos.d/redhat.repo file and load the system subscriptions from
there. While processing each url, guess which subscription is tied to
the url and use the CA certificate, client certificate, and client key
associated with this subscription. It must be done this way because the
depsolving and fetching of RPMs may be performed on different hosts and
the subscription credentials are different in such cases.
More detailed description of why this approach was chosen is available
in osbuild-composer git: https://github.com/osbuild/osbuild-composer/pull/1405
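Conceptually, the matching works along these lines (a sketch; the real
implementation and its field handling differ in detail):

import configparser

def find_subscription(url, repo_file="/etc/yum.repos.d/redhat.repo"):
    # pick the entry whose baseurl prefixes the requested url and return
    # its CA certificate, client certificate and client key
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(repo_file)
    for section in parser.sections():
        baseurl = parser.get(section, "baseurl", fallback="")
        if baseurl and url.startswith(baseurl.split("$")[0]):
            return {
                "ca": parser.get(section, "sslcacert", fallback=None),
                "cert": parser.get(section, "sslclientcert", fallback=None),
                "key": parser.get(section, "sslclientkey", fallback=None),
            }
    return None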
Add a new source for transporting binary data within the source
entry itself. The data is ASCII-encoded in the `data` property
of the inline source item, with the encoding used being
specified in the `encoding` property.
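For example, constructing such an item might look like this (the item
layout is paraphrased from the description above; the digest key format
is an assumption):

import base64
import hashlib

payload = b"Hello, osbuild!\n"
item = {
    "encoding": "base64",
    "data": base64.b64encode(payload).decode("ascii"),
}
checksum = "sha256:" + hashlib.sha256(payload).hexdigest()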
Now that there is a common utility function to verify the checksum
of a file, use that.
Also fix the json schema entry for the property to have the correct
minimum and maximum digest length, given the supported algorithms,
which are 32 (md5) and 128 (sha512) characters.
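Those bounds follow directly from the hex digest lengths of the
supported algorithms:

import hashlib

assert len(hashlib.md5(b"").hexdigest()) == 32      # shortest supported digest
assert len(hashlib.sha512(b"").hexdigest()) == 128  # longest supported digest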
Since the `sources.SourcesServer` has been removed, nothing is
using the export functionality anymore. Inputs are now used to
make content in the store available to stages. Remove all the
export logic from org.osbuild.ostree.
Since the `sources.SourcesServer` has been removed, nothing is
using the export functionality anymore. Inputs are now used to
make content in the store available to stages. Remove all the
export logic from org.osbuild.curl.
Instead of using stderr for the ostree subprocess command,
capture its output so that in the case of an error we can
properly return the error output. With the old behavior,
all the `ostree` command output would land in the journal
of the worker.
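A minimal sketch of the change (the ostree invocation shown here is just
an example):

import subprocess

try:
    subprocess.run(
        ["ostree", "pull", "--repo=repo", "origin", "some-ref"],  # example args
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        encoding="utf-8", check=True,
    )
except subprocess.CalledProcessError as e:
    error_output = e.stdout  # can now be returned via the json error message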
Sources, for compatibility reasons, have two modes: download only,
and download and export. The difference is in the arguments that
are passed to the source: for download only, the `output` param
is empty. In this case `checksums` *can* also be empty and if so
it means everything, i.e. all the commits, should be fetched. The
latter was not properly handled so far. Adjust the logic, which
now closely mimics that of the `org.osbuild.curl` source, to fix
this case.
Also catch exceptions invoking `ostree` and properly return them
via the json error messaging.
The `org.osbuild.files` source provides files, but might in the
future not be the only one that does. Therefore rename it to
match the internal tool that is being used to fetch the files.
This is done for most other osbuild modules that target tools.
The format v1 loader is adapted to make this change transparent
for users of the v1 format, so we are backwards compatible.
Change the MPP depsolve preprocessor so that for format v2 based
manifests the `org.osbuild.curl` source is used. Also rename the
corresponding source test. Adapt the format v2 mod test to use
the curl source.
Instead of supplying the full cache dir, i.e. the directory in
the store where the source will place the fetched resources, to
the source, only supply the root folder of the cache and let
the source itself create the desired sub-directory. This allows
the source to determine what type of resource it provides. This
makes the final directory independent of the name of the source:
an `org.osbuild.curl` source can place file-like resources in the
`org.osbuild.files` sub-directory. Then the `org.osbuild.files`
input can be used to get those from the cache directory.
In format version 2, the source specific keys for the sources,
here "urls", is replaced by a generic `items` key, common to
all sources. Express that in the schema.
In format version 2, the source specific keys for the sources,
here "urls", is replaced by a generic `items` key, common to
all sources. Express that in the schema.
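For example, a v2 source entry then looks roughly like this (values are
placeholders, shown only to illustrate the generic `items` key):

source = {
    "org.osbuild.curl": {
        "items": {
            "sha256:<digest>": {"url": "https://example.com/some.rpm"},
        },
    },
}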