Commit graph

89 commits

Author SHA1 Message Date
Michael Vogt
352bf5cd52 curl: rename "transform" to "amend_secrets"
The curl source is the only source left that uses "transform". And
here the name is very generic but in fact we only do a single thing:
we add secrets for subscriptions for for mtls to the download.

So rename to make it clear what this is all about.
2024-03-19 14:21:57 +01:00
Michael Vogt
1fc7ead2f4 sources: transform() is only used in the curl sources, remove from ABC 2024-03-19 14:21:57 +01:00
Michael Vogt
fd0167f130 test: return container_id in make_container
The current `make_container()` helper is a bit silly (which is
entirely my fault). It requires a container tag as input but all
tests end up creating a random number for this input. So instead
just remove the input and return the container_id from the podman
build in the contextmanager and use that.
2024-03-18 20:36:19 +01:00
Michael Vogt
f214c69a98 osbuild: add workaround to integrate sources into progress reporting
This commit is somewhat poor, sorry for that. It mostly adds
workaround so that the osbuild sources can emit some progress
reporting as well. Without that the user experience is rather poor
and there is a long delay before any sort of progress can be
reported (even before the normal stages run).

With it the user experience is still not good but slightly better,
i.e. the progress monitor will report that the sources have
started downloading and curl will generated some log output. No
real progress unfortunately (sources subprogress will jump from
zero to 100%).
2024-03-12 16:44:12 +01:00
Sanne Raymaekers
29159189f1 sources/curl: add org.osbuild.mtls secrets support
If `org.osbuild.mtls` is passed as a secret name, look for the mtls data
in the environment.
2024-03-11 11:09:37 +01:00
Michael Vogt
82f2414637 sources: tweak ContainersStorageSources.exists to return False
When an images does not exist just return `False` instead of
raising a RuntimeError. If anything else goes wrong (unknown
output or hash mismatch) keep the RuntimeError as this is an
unexpected exception.
2024-02-27 15:07:42 +01:00
Michael Vogt
5ab0b41456 sources: add test for non-existing id in containers-storage 2024-02-27 15:07:42 +01:00
Michael Vogt
0ac05ecb55 sources: tweak docstring for containers-storage 2024-02-27 15:07:42 +01:00
Tomáš Hozza
2b868fbe75 Sources/containers-storage: make the code Python 3.6 compliant
The source implementation used `subprocess.run()` argument
`capture_output`, which was added in Python 3.7. Since the minimum
supported Python version for osbuild on RHEL-8 is 3.6, the stage fails
with TypeError.

Example failure: https://artifacts.dev.testing-farm.io/c147b608-c40e-46ed-bf11-6b15ecf718dc/

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-02-25 09:27:23 +01:00
Achilleas Koutsou
591593ea00 testutil: make_container context manager
Make make_container a context manager so we can reliably clean up
containers that were created in tests.
2024-02-21 17:55:37 +01:00
Michael Vogt
d6cd4b93ba test: add minimal test for ContainersStorageSource.from_args() 2024-02-21 17:55:37 +01:00
Michael Vogt
119172e8dd test: add sources_module fixture for sources unit tests
Similar to the `stage_module` fixture for stage tests this adds
a fixture to test sources modules of osbuild.

The code from `stage_module` and `sources_module` is similar and
could be combined but pytest makes it hard to do this without
having a shared root dir. Given that it's just four lines it
seems easier to just life with the tiny bit of code duplication.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
6572b1b8e7 util: remove storage_conf arg from get_host_storage()
Let the caller decide if a reload of the storage configuration is needed
and simplify the storage configuration reader.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
ac45c292e4 sources/containers-storage: call exists() when fetch()ing
Implement fetch_all() and fetch_one() as calls to exists() to make sure
we check that the containers are available every time they are needed.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
45510aeb64 sources: new source: containers-storage
This source checks for the existence of a local container in the host's
containers-storage. The source first reads the host's
`/etc/containers/storage.conf` file for the storage config and then
checks if the user has imported the desired container into the local
store.

Unlike the org.osbuild.containers stource, the
org.osbuild.containers-storage source doesn't need any extra data other
than the image ID.  The ID is all that is used to retrieve the
container.  The location and other information regarding the storage are
read from the host configuration and are not encoded in the manifest
There's no need to use the name to resolve it like we do in other
sources because containers in the local storage can be directly
referenced by their image id (config digest).

Other data such as the name of the container will only be relevant in
the stage that will use the container as input.

The source items are objects instead of simple strings of checksums
because we might, in the future, want to add specific options for each
source.

The content_type for this source is `containers-storage`, which defines
the location in the store where the source will bind mount the host's
container storage for stages to read.  We make this different from the
containers content because it will be treated differently enough to need
a separate input type.

Co-authored-by: Gianluca Zuccarelli <gzuccare@redhat.com>
Co-Authored-By: Michael Vogt <michael.vogt@gmail.com>
2024-02-21 17:55:37 +01:00
Simon de Vlieger
f9b55ff6a0 sources: rename download -> fetch_all
Not all sources download things and `fetch_all` is consistent with
`fetch_one`.
2024-01-26 09:58:48 +01:00
Simon de Vlieger
382ac8e960 sources(inline): remove threading
After some small benchmarks threading adds more overhead than
performance improvement for this source.
2024-01-26 09:58:48 +01:00
Simon de Vlieger
2c42c46c48 sources: move parallelisation into source
This moves the parallelisation decisions into the sources themselves,
making the `download` method abstract inside `osbuild` itself.
2024-01-26 09:58:48 +01:00
Gianluca Zuccarelli
a2b2a02add sources/skopeo: add storage location
For local images copied from an image store other than the default, we
need to be able to specify the `storage-location`. This commit enables
this functionality.

Jira: https://issues.redhat.com/browse/HMS-3235
2024-01-05 16:42:51 +01:00
Gianluca Zuccarelli
2ec88bdb32 sources/skopeo: check containers-storage
Update the skopeo sources stage to check other container-transports [1]
for downloading container images. This commit adds a transport for
`containers-storage` in addition to the existing `docker://`
transport.

Jira: https://issues.redhat.com/browse/HMS-3235

[1] CONTAINERS-TRANSPORTS(5)
2023-12-12 14:09:10 -08:00
Dusty Mabe
8844bc260e osbuild/util/ostree: create setup_remote function
This moves the setup_remote function from the ostree source into
util/ostree. This is prep for sharing this function with an mpp
helper in the future.
2023-10-16 20:26:10 +02:00
Dusty Mabe
f4ab2f43e2 sources/ostree: leverage util/ostree library code
Similar to the cleanups in 4e99e80, let's start using the library
code for the calls to ostree here.
2023-10-16 20:26:10 +02:00
Achilleas Koutsou
3a717e170a sources: add org.osbuild.skopeo-index source
A new source module that can download a multi-image manifest list from a
container registry.  This module is very similar to the skopeo source,
but instead downloads a manifest list with `--multi-arch=index-only`.
The checksum of the source object must be the digest of the manifest
list that will be stored and the manifest that is downloaded must be a
manifest-list.
2023-03-31 14:57:26 +02:00
Achilleas Koutsou
ce29a4af73 sources/skopeo: change local container format
Change the local storage format for containers to the `dir` format.
The `dir` format will be used to retain signatures and manifests.

The remove-signatures option is removed since the storage format now
supports them.

The final move (os.rename()) at the end of the fetch_one() method now
creates the checksum directory if it doesn't exist and moves the child
archive into it, adding to any existing archives that might exist in
other formats (from a previous version downloading a `docker-archive`).

Dropped the .tar suffix from the symlink in the skopeo stage since it's
not necessary and the target of the link might be a directory now.

The parent class exists() method checks if there is a *file* in the
sources cache that matches the checksum.  For containers, this used to
be a file called container-image.tar under a directory that matches the
checksum, so for containers it always returned False.  Added an override
for the skopeo source that checks for the new directory archive.
2023-03-31 14:57:26 +02:00
Achilleas Koutsou
be76d6f355 sources/skopeo: fix comment typo 2023-03-31 14:57:26 +02:00
Sanne Raymaekers
ae563ff896 sources/ostree: fix quotation marks in mTLS remote options
Example of broken repo config:
```
...
"tls-client-key-path=/etc/pki/consumer/key.pem"
"tls-client-cert-path=/etc/pki/consumer/cert.pem"
```
2023-01-13 11:35:43 +01:00
Sanne Raymaekers
925ca9b41e sources/ostree: set contenturl when pulling from remote
If a contenturl is specified, the url is used only for metadata. This is
useful when the actual content is hosted separately.
2022-10-14 12:04:54 +02:00
Sanne Raymaekers
cc9d05c201 sources/ostree: fix mTLS certs remote option
These options take the format --set "KEY_VALUE". Also fix the string
formatting.
2022-10-14 12:04:54 +02:00
Sanne Raymaekers
fcaad0462a sources/ostree: pull from remote using rhsm mTLS certs
The consumer certs are used to uniquely identify a system against
candlepin. These consumer certs can be used to identify the system when
pulling from RH controlled ostree repositories.
2022-10-11 16:49:45 +02:00
Simon de Vlieger
ea6085fae6 osbuild: run isort on all files 2022-09-12 13:32:51 +02:00
Achilleas Koutsou
f699720dbd sources/curl: quote URL paths before downloading
Some package versions [1] can contain carets and other characters that
curl doesn't like.  These need to be URL encoded.

Interestingly, the documented way of replacing components in a parsed
URL from urllib in Python is by calling the (seemingly private)
`_replace()` method [2].

[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Versioning/#_snapshots
[2] https://docs.python.org/3/library/urllib.parse.html#url-parsing
2022-08-31 22:28:54 +01:00
Christian Kellner
51315a985a stages/skopeo: use extra intermediate download dir
Instead of downloading the image directly to the temporary directory
and then moving that temporary directory into the cache use one more
intermediate directory and move that into the cache. The reason is
that on Python 3.6 removing the temporary directory itself will make
Python crash like this:

Python 3.6.8 (default, Sep  9 2021, 07:49:02)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> with tempfile.TemporaryDirectory(prefix="tmp-download-") as tmpdir:
...     import os
...     os.rename(tmpdir, "/tmp/foo")

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/usr/lib64/python3.6/tempfile.py", line 809, in __exit__
    self.cleanup()
  File "/usr/lib64/python3.6/tempfile.py", line 813, in cleanup
    _shutil.rmtree(self.name)
  File "/usr/lib64/python3.6/shutil.py", line 477, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib64/python3.6/shutil.py", line 475, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp-download-adl86mwa'
2022-07-19 19:52:25 +02:00
Christian Kellner
4647140808 source/skopeo: use subprocess.check_output
Use `subprocess.check_output` instead of `run(..., capture_output=True)`
since the latter only got added in Python 3.7 and our codebase needs to
be compatible with 3.6 due to RHEL 8.x.
2022-07-13 20:06:42 +02:00
Simon de Vlieger
3fd864e5a9 osbuild: fix optional-types
Optional types were provided in places but were not always correct. Add
mypy checking and fix those that fail(ed).
2022-07-13 17:31:37 +02:00
Achilleas Koutsou
c8073b5836 sources: support calling curl with --insecure
Add support for the `--insecure` curl flag, which makes curl skip the
verification step when making secure connections (e.g., https://).
This allows osbuild to download files from servers configured with
SSL/TLS but whose certificate cannot be validated.

This is supported for configuring repository sources in
osbuild-composer.
2022-06-14 22:13:39 +02:00
Simon de Vlieger
7b0e1fe5fd sources: curl max_workers 2 * num_cpus
This changes the curl source to use the number of cpus times two
for its thread count. A conservative number but a commonly used
default.
2022-05-24 19:45:23 +02:00
Thomas Lavocat
ac2a194cd4 sources: check if ostree object exists in cache
The generic ways of checking if an object is in the cache does not apply
for ostree as the internal structure of a repo is quite specific. Thus
we need to use the ostree executable to ask it to explore its repo for
us.
2022-05-11 04:32:42 -05:00
Thomas Lavocat
1de74ce2c9 sources: generalizing download method
Before, the download method was defined in the inherited class of each
program. With the same kind of workflow redefined every time. This
contribution aims at making the workflow more clear and to generalize
what can be in the SourceService class.

The download worklow is as follow:
Setup -> Filter -> Prepare -> Download

The setup mainly step sets up caches. Where the download data will be
stored in the end.

The filter step is used to discard some of the items to download based
on some criterion. By default, it is used to verify if an item is
already in the cache using the item's checksum.

The Prepare step goes from each element and let the overloading step the
ability to alter each item before downloading it. This is used mainly
for the curl command which for rhel must generate the subscriptions.

Then the download step will call fetch_one for each item. Here the
download can be performed sequentially or in parallel depending on the
number of workers selected.
2022-05-11 04:32:42 -05:00
Thomas Lavocat
0953cf64e0 sources: provide an unverified tmpdir
Some downloading program need a global unverified tmpdir to work within
before storing the definitive data. Provide this in the workflow
directly.
2022-05-11 04:32:42 -05:00
Thomas Lavocat
128845da3c sources: tidy the download method
Only the "items to download" need to be passed as parameters. The rest
is unpacked as attributes during the Setup step of the workflow.
2022-05-11 04:32:42 -05:00
Thomas Lavocat
92fe237f24 sources: introduce per-source content_type
Introduce a new class member `content_type` that specifies what type of
items the source will store in the cache. Use that to generalize the
setup step, which is shared across all sources.
2022-05-11 04:32:42 -05:00
Thomas Lavocat
34cd9ef9f0 sources: generalize cache generation
Introduce a `setup` step in the workflow that is responsible of
generating the cache folder. This is then used in each download method.
2022-05-11 04:32:42 -05:00
Tom Gundersen
e175529f7c sources/curl: don't limit total download time
Some RPMs might be very large, and limiting the total download time
might lead to failed build even in cases where downloading is making
progress. Instead, set a minimum download speed (1kbps). If the
minimum is not surpassed for 30 seconds in a row, the download fails
and is retried. This follows the logic employed by DNF.

Adjust the number of retries to 10 and the connection timeout to 30,
in order to match what DNF does. One difference is that DNF does 10
retries across all downloads, whereas we do it per download, this
could be changed in a follow-up.

Old:
 - a download taking more than 5 minutes is unconditionally aborted

New:
 - slow but working downloads will never be aborted
 - downloads will be stalled for at most five minutes
   in total before being aborted
 - time spent making progress does not count towards
   the five minutes

Signed-off-by: Tom Gundersen <teg@jklm.no>
2022-03-16 14:48:03 +01:00
Alexander Larsson
46a228df38 Add support for installing containers in images
This adds a stage called org.osbuild.skopeo that installs docker and
oci archive files into the container storage of the tree being
constructed.

The source can either be a file from another pipeline, for example one
created with the existing org.osbuild.oci-archive stage, or it can
be using the new org.osbuild.skopeo source and org.osbuild.containers
input, which will download an image from a registry and install that.

There is an optional option in the install stage that lets you
configure a custom storage location, which allows the use of the
additionalimagestores option in the container storage.conf
to use a read-only image stores (instead of /var/lib/container).

Note: skopeo fails to start if /etc/containers/policy.json is
not available, so we bind mount it from the build tree to the
buildroot if available.
2022-02-10 14:43:17 +01:00
Christian Kellner
c902a7a754 sources: port to host services
Port sources to also use the host services infrastructure that is
used by inputs, devices and mounts. Sources are a bit different
from the other services that they don't run for the duration of
the stage but are run before anything is built. By using the same
infrastructure we re-use the process management and inter process
communcation. Additionally, this will forward all messages from
sources to the existing monitoring framework.
Adapt all existing sources and tests.
2021-09-22 00:00:20 +02:00
Alexander Larsson
072b75d78e org.osbuild.curl: Don't load secrets if not needed
This moves the check for already downloaded files earlier so
that if all files are already downloaded we don't need to
load the secrets.

This is faster, but also it allows a pre-seeded object store
to run the manifest on a system (like a VM) that isn't subscribed.
2021-09-22 00:00:20 +02:00
Christian Kellner
7576191c2d sources/inline: fix schema
The top-level node "items" was not defined and the required property
"encoding" was wrongly called "method".
2021-06-30 12:06:30 +02:00
Martin Sehnoutka
ee3760e1ba sources/curl: Implement new way of getting RHSM secrets
The previous version covered too few use cases, more specifically a
single subscription. That is of course not the case for many hosts, so
osbuild needs to understand subscriptions.

When running org.osbuild.curl source, read the
/etc/yum.repos.d/redhat.repo file and load the system subscriptions from
there. While processing each url, guess which subscription is tied to
the url and use the CA certificate, client certificate, and client key
associated with this subscription. It must be done this way because the
depsolving and fetching of RPMs may be performed on different hosts and
the subscription credentials are different in such case.

More detailed description of why this approach was chosen is available
in osbuild-composer git: https://github.com/osbuild/osbuild-composer/pull/1405
2021-06-04 18:23:05 +01:00
Christian Kellner
2025184325 sources: introduce org.osbuild.inline
Add a new source for transporting binary data within the source
entry itself. The data is ascii encoded in the `data` property
of the inline source item, with the encoding that is used being
specified in the `encoding` property.
2021-05-12 14:26:16 +02:00
Christian Kellner
3ebfc6f657 sources/curl: use util.checksum.verify_file
Now that there is a common utility function to verify the checksum
of a file, use that.
Also fix the json schema entry for the property to have to correct
minium and maximum digest length, given the supported algorithm,
which is 32 (md5) and 128 (sha512) characters.
2021-05-12 14:26:16 +02:00