Using a metalink or mirrorlist along with the package paths and
checksums allows them to be reliably downloaded even when mirrors are
not all in sync. It will retry with a new mirror until it succeeds, or
has tried all of the mirrors.
Not all distros ship `/var/empty` so just create an empty dir
on demand as needed.
This also tweaks `test_ostree_source_exists()` into calling
`make_repo()` instead of duplicating that code.
This commit keeps track of individual errors as curl will only report
the last download operation success/failure via it's exit code.
There is "--fail-early" but the downside of that is that abort all in
progress downloads too.
curl keeps a global parser state. This means that if there are
multiple "cacert =" values they are just overriden and the last
one wins. This is why the `test_curl_download_many_mixed_certs`
test did not work - the second `cacert = ` overwrites the previous
one.
To fix this we need to use `--next` when we need to change options
on a per url (like `cacert`) basis. With `--next` curl starts a
new parser state for the next url (but keeps the options for the
previous ones set). This commit does that in a slightly naive
way by just repeating our options for each url. Technically
we could sort the sources so that we have less repetition but
other then slightly smaller auto-generated files it has no
advantage.
With this commit the `test_curl_download_many_mixed_certs` test
works.
When investigating https://github.com/osbuild/osbuild-composer/pull/4247
we found that it would fail when a download required two sets of
`--cacert` keys. This commits adds a test for this that fails on
the centos9 7.76.1 version.
storage_conf was never not None, so the loading was called every time.
This never crashed because conf was always being set, but it wasn't
working properly regardless.
Add a new util module called host which is used for functions that are
meant for interactions with the host. These functions should not be
used in stages.
The containers.get_host_storage() function is renamed to
host.get_container_storage() for clarity, since it is no longer
namespaced under containers.
When using `--write-out` we are not using %{json} because older curl
(7.76) will write {"http_connect":000} which python cannot parse.
So we had a custom `--write-out` with `\1xc` as "record" separators
between the fields. This is a bit old-school and not very extensible
so Achilleas had the idea to still use json but "define" our own
subset via the variables that curl provides. This commit does that.
As part of the investigation of the CI failure in
https://github.com/osbuild/osbuild-composer/pull/4247
we noticed that curl can return a return_code of `0` even
when it did not downloaded all the urls in a `--config` provided
file. This seems to be curl version dependent, I had a hard
time writing a test-case with the real curl (8.6.0) that
reproduces this so I went with mocking it. We definietly saw
this failure with the centos 9 version (7.76).
Our current code is buggy and assumes that the exit status
of curl is always non-zero if any download fails but that is
only the case when `--fail-early` is used.
The extra paranoia will not hurt even when relying on the
exit code of curl is fixed.
Disable `curl --parallel` by default until the failure in
https://github.com/osbuild/osbuild-composer/pull/4247
is fully understood. It can be enabled via the environment:
```
OSBUILD_SOURCES_CURL_USE_PARALLEL=1
```
in the osbuild-composer test.
When using a modern curl we can download download multiple urls
in parallel which avoids connection setup overhead and is generally
more efficient. Use when it's detected.
TODO: ensure both old and new curl are tested automatically via
the testsuite.
Modern curl (7.68+) has a --parallel option that will download
multiple sources in parallel. This commit adds detection for this
feature as it is only available after RHEL 8.
In addition we need some more feature to properly support --parallel,
i.e. `--write-out` with json and exitcode options. This bumps the
requirements to 7.75+ which is still fine, centos9/RHEL9 have
7.76.
Instead of passing the url and options on the commandline this
commit moves it into a config file. This is not useful just yet
but it will be once we download multiple urls per curl instance.
Setting the user-agent using `--header` is broken in combination with
`--location`, `--proxy`, and an https endpoint which redirects. The
user-agent sent to the proxy changes after the client is redirected,
tripping up proxies.
For more information see https://issues.redhat.com/browse/RHEL-45364
Currently, osbuild downloads are identified as coming from `curl`. This
is unfortunate because some RPM mirrors block requests from curl. Let's
"fix" that by introducing our own user-agent. While this can certainly
be seen as "circumventing" a policy, I think that this change is
actually helpful: Now, the mirror maintainers can actually distinguish
osbuild requests from regular curl calls. If they want to block osbuild,
they certainly can, we have no power there, but at least this allows
more fine-grained filtering. Also, our new user-agent contains our
domain name, so if there's a problem, they can contact us.
Only py3.7+ has ThreadingHTTPServer and SimpleHTTPRequestHandler
that can take a directory argument. We could reimplement this
on py36 (easy for threading, harder for missing directory) but
instead this commit just skips tests that try to use a
ThreadingHTTPServer.
Remove once we no longer support py3.6.
Tiny tweak to remove some boilerplate related to tmpfile handling.
The pytest `tmp_path` fixture gives us the tmpdir without having
to worry about cleanup etc (and in a slightly more concise way).
The `amend_secrets()` does not work with real files so there is
no need to mock cachedirs or create fake input files. This commit
just removes those.
It also changes the checksum to `"1"*64` to make it very clear
that the checksum has no significance in this test.
There was a regression with the secrets adding of rhsm for the
curl source. This was my mistake (sorry!). Here is a regression
test that would have prevented this (if we have had it earlier).
Similar to the previous commit to include a `inputs_service` fixture
this does the same for `source.SourcesService` imports.
Note that we cannot easily share the helpers so we have to life with
a bit of very similar but duplicated code. To fix this we would have
to have a shared confftest.py that pytest can find. Which would mean
that we need to put the tests under a common dir that is reachable
via __init__.py files (which we currently not have because stages,
inputs etc do not have a __init__.py so python does not considers
them modules).
The curl source is the only source left that uses "transform". And
here the name is very generic but in fact we only do a single thing:
we add secrets for subscriptions for for mtls to the download.
So rename to make it clear what this is all about.
The current `make_container()` helper is a bit silly (which is
entirely my fault). It requires a container tag as input but all
tests end up creating a random number for this input. So instead
just remove the input and return the container_id from the podman
build in the contextmanager and use that.
This commit is somewhat poor, sorry for that. It mostly adds
workaround so that the osbuild sources can emit some progress
reporting as well. Without that the user experience is rather poor
and there is a long delay before any sort of progress can be
reported (even before the normal stages run).
With it the user experience is still not good but slightly better,
i.e. the progress monitor will report that the sources have
started downloading and curl will generated some log output. No
real progress unfortunately (sources subprogress will jump from
zero to 100%).
When an images does not exist just return `False` instead of
raising a RuntimeError. If anything else goes wrong (unknown
output or hash mismatch) keep the RuntimeError as this is an
unexpected exception.
The source implementation used `subprocess.run()` argument
`capture_output`, which was added in Python 3.7. Since the minimum
supported Python version for osbuild on RHEL-8 is 3.6, the stage fails
with TypeError.
Example failure: https://artifacts.dev.testing-farm.io/c147b608-c40e-46ed-bf11-6b15ecf718dc/
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Similar to the `stage_module` fixture for stage tests this adds
a fixture to test sources modules of osbuild.
The code from `stage_module` and `sources_module` is similar and
could be combined but pytest makes it hard to do this without
having a shared root dir. Given that it's just four lines it
seems easier to just life with the tiny bit of code duplication.
This source checks for the existence of a local container in the host's
containers-storage. The source first reads the host's
`/etc/containers/storage.conf` file for the storage config and then
checks if the user has imported the desired container into the local
store.
Unlike the org.osbuild.containers stource, the
org.osbuild.containers-storage source doesn't need any extra data other
than the image ID. The ID is all that is used to retrieve the
container. The location and other information regarding the storage are
read from the host configuration and are not encoded in the manifest
There's no need to use the name to resolve it like we do in other
sources because containers in the local storage can be directly
referenced by their image id (config digest).
Other data such as the name of the container will only be relevant in
the stage that will use the container as input.
The source items are objects instead of simple strings of checksums
because we might, in the future, want to add specific options for each
source.
The content_type for this source is `containers-storage`, which defines
the location in the store where the source will bind mount the host's
container storage for stages to read. We make this different from the
containers content because it will be treated differently enough to need
a separate input type.
Co-authored-by: Gianluca Zuccarelli <gzuccare@redhat.com>
Co-Authored-By: Michael Vogt <michael.vogt@gmail.com>