Commit graph

124 commits

Author SHA1 Message Date
Brian C. Lane
28e74f6c9b Add support for using librepo to download packages
Using a metalink or mirrorlist along with the package paths and
checksums allows them to be reliably downloaded even when mirrors are
not all in sync. It will retry with a new mirror until it succeeds, or
has tried all of the mirrors.
2025-01-14 08:19:16 +01:00
Michael Vogt
add78e7f47 sources: skip ostree tests if no ostree binary is found
This commit skips the ostree tests if no ostree binary is available.
2024-11-28 20:06:51 +01:00
Michael Vogt
2f892b20e7 sources: fix ostree_sources test to work without /var/empty
Not all distros ship `/var/empty` so just create an empty dir
on demand as needed.

This also tweaks `test_ostree_source_exists()` into calling
`make_repo()` instead of duplicating that code.
2024-11-26 10:26:52 +01:00
Lukas Zapletal
32b1b91597 test: regenerate X509 test certs 2024-11-22 10:15:50 +01:00
Lukas Zapletal
ef24311f77 sources: MTLS and proxy support for ostree 2024-11-04 16:35:53 +01:00
Lukas Zapletal
f9873e493e sources: MTLS and proxy support for ostree 2024-10-22 22:16:35 +02:00
Michael Vogt
b4b865fddf Revert "sources(curl): disable curl --parallel by default"
This reverts commit 9bef57d5a6.
2024-08-28 07:46:37 +02:00
Michael Vogt
a229d46b1e sources(curl): manually keep track of failed URLs
This commit keeps track of individual errors as curl will only report
the last download operation's success/failure via its exit code.

There is "--fail-early" but the downside of that is that it aborts all
in-progress downloads too.
2024-08-28 07:46:37 +02:00
Michael Vogt
a50dbb14c2 sources(curl): use --next for each url in curl config
curl keeps a global parser state. This means that if there are
multiple "cacert =" values they are just overridden and the last
one wins. This is why the `test_curl_download_many_mixed_certs`
test did not work - the second `cacert = ` overwrites the previous
one.

To fix this we need to use `--next` when we need to change options
on a per url (like `cacert`) basis. With `--next` curl starts a
new parser state for the next url (but keeps the options for the
previous ones set). This commit does that in a slightly naive
way by just repeating our options for each url. Technically
we could sort the sources so that we have less repetition but
other than slightly smaller auto-generated files it has no
advantage.

With this commit the `test_curl_download_many_mixed_certs` test
works.
2024-08-28 07:46:37 +02:00
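The per-URL repetition described in the commit above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the helper name `gen_config()` and its arguments are hypothetical, and it assumes that curl config files accept long options without leading dashes (so `next` stands in for `--next`).

```python
def gen_config(urls_with_cacert, common_options):
    """Return curl config text with one option block per URL.

    Hypothetical helper: each block repeats the common options, adds the
    per-URL `cacert` (if any) and is separated from the next block by a
    `next` line so that curl starts a fresh parser state.
    """
    blocks = []
    for url, cacert in urls_with_cacert:
        lines = list(common_options)  # repeated for every URL (the "naive" way)
        if cacert:
            lines.append(f'cacert = "{cacert}"')
        lines.append(f'url = "{url}"')
        blocks.append("\n".join(lines))
    # `next` only goes between blocks, never after the last one
    return "\nnext\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(gen_config(
        [("https://a.example/x", "/tmp/ca1.pem"),
         ("https://b.example/y", "/tmp/ca2.pem")],
        ["silent", "location"],
    ))
```

Repeating the options costs a few extra lines in the generated file, but each URL block can then be read on its own.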
Michael Vogt
6ccd5d5cfe test: add test that downloads mixed https content
When investigating https://github.com/osbuild/osbuild-composer/pull/4247
we found that it would fail when a download required two sets of
`--cacert` keys. This commit adds a test for this that fails on
the centos9 7.76.1 version.
2024-08-28 07:46:37 +02:00
Achilleas Koutsou
6ba0561187 sources/containers-storage: fix load caching
storage_conf was always None, so the loading was called every time.
This never crashed because conf was always being set, but it wasn't
working properly regardless.
2024-08-22 19:47:28 +02:00
Achilleas Koutsou
07a597481b util: move get_host_storage() to a separate module
Add a new util module called host which is used for functions that are
meant for interactions with the host.  These functions should not be
used in stages.

The containers.get_host_storage() function is renamed to
host.get_container_storage() for clarity, since it is no longer
namespaced under containers.
2024-08-21 19:26:31 +02:00
Michael Vogt
46db834dee sources(curl): use json like output inside of custom record
When using `--write-out` we are not using %{json} because older curl
(7.76) will write {"http_connect":000} which python cannot parse.

So we had a custom `--write-out` with `\1xc` as "record" separators
between the fields. This is a bit old-school and not very extensible
so Achilleas had the idea to still use json but "define" our own
subset via the variables that curl provides. This commit does that.
2024-07-30 11:12:03 +02:00
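The idea in the commit above (a hand-defined JSON subset instead of `%{json}`) might look roughly like this. It is a sketch under assumptions: the chosen fields and the names `WRITE_OUT` and `parse_records()` are illustrative, and the format relies on the curl write-out variables `%{url}`, `%{filename_effective}`, `%{exitcode}` and `%{errormsg}` being available. Values that contain double quotes would break this simple format.

```python
import json

# Format string for `curl --write-out`. curl expands the %{...} variables
# and turns the literal "\n" into a newline, giving one record per line.
WRITE_OUT = (
    '{"url": "%{url}", '
    '"filename_effective": "%{filename_effective}", '
    '"exitcode": %{exitcode}, '
    '"errormsg": "%{errormsg}"}\\n'
)


def parse_records(output):
    """Parse one JSON object per non-empty line of curl's write-out output."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


if __name__ == "__main__":
    # Hand-written sample of what curl could emit for two transfers
    sample = (
        '{"url": "https://example.com/a", "filename_effective": "a", '
        '"exitcode": 0, "errormsg": ""}\n'
        '{"url": "https://example.com/b", "filename_effective": "b", '
        '"exitcode": 22, "errormsg": "The requested URL returned error: 404"}\n'
    )
    for record in parse_records(sample):
        print(record["url"], record["exitcode"])
```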
Michael Vogt
16667ef260 sources(curl): error if curl exits 0 but there are downloads left
As part of the investigation of the CI failure in
https://github.com/osbuild/osbuild-composer/pull/4247
we noticed that curl can return a return_code of `0` even
when it did not download all the urls in a `--config` provided
file. This seems to be curl version dependent, I had a hard
time writing a test-case with the real curl (8.6.0) that
reproduces this so I went with mocking it. We definitely saw
this failure with the centos 9 version (7.76).

Our current code is buggy and assumes that the exit status
of curl is always non-zero if any download fails but that is
only the case when `--fail-early` is used.

The extra paranoia will not hurt even when relying on the
exit code of curl is fixed.
2024-07-17 11:39:35 +02:00
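The extra check described in the commit above could be sketched as below. The function name `check_all_downloaded()` and its arguments are hypothetical; the point is only that a zero exit code is not trusted unless every requested URL was also reported as completed.

```python
def check_all_downloaded(requested, completed, returncode):
    """Raise if curl failed or if any requested URL was never completed.

    Hypothetical helper: `requested` and `completed` are iterables of URLs,
    `returncode` is the exit status of the curl process.
    """
    missing = sorted(set(requested) - set(completed))
    if returncode != 0:
        raise RuntimeError(f"curl failed with exit code {returncode}")
    if missing:
        # curl exited 0 but did not process everything in the config file
        raise RuntimeError(f"curl exited 0 but did not download: {missing}")


if __name__ == "__main__":
    check_all_downloaded(["u1", "u2"], ["u1", "u2"], 0)  # passes silently
    try:
        check_all_downloaded(["u1", "u2"], ["u1"], 0)
    except RuntimeError as err:
        print(err)
```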
Michael Vogt
9bef57d5a6 sources(curl): disable curl --parallel by default
Disable `curl --parallel` by default until the failure in
https://github.com/osbuild/osbuild-composer/pull/4247

is fully understood. It can be enabled via the environment:
```
OSBUILD_SOURCES_CURL_USE_PARALLEL=1
```
in the osbuild-composer test.
2024-07-08 18:00:59 +02:00
Michael Vogt
4697a3fb84 sources: do not use %{json} when generating curl output
We cannot use `curl --write-out %{json}` because older curl
(7.76 from RHEL9/Centos9) will write `{"http_connect":000}`
which python cannot parse.
2024-07-04 11:53:40 +02:00
Michael Vogt
018c15aae8 sources: run all tests for curl with both old and new curl
To ensure there are no regressions with the old curl make
sure to run all tests that call fetch_all() with both old and
new curl.
2024-07-04 11:53:40 +02:00
Michael Vogt
0d3a153c78 sources: add new _fetch_all_new_curl() helper
When using a modern curl we can download multiple urls
in parallel which avoids connection setup overhead and is generally
more efficient. Use it when it's detected.

TODO: ensure both old and new curl are tested automatically via
the testsuite.
2024-07-04 11:53:40 +02:00
Michael Vogt
974c8adff9 source: add helper to detect if curl parallel download is available
Modern curl (7.68+) has a --parallel option that will download
multiple sources in parallel. This commit adds detection for this
feature as it is only available after RHEL 8.

In addition we need some more features to properly support --parallel,
i.e. `--write-out` with json and exitcode options. This bumps the
requirements to 7.75+ which is still fine, centos9/RHEL9 have
7.76.
2024-07-04 11:53:40 +02:00
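One way to do the kind of detection described in the commit above is to parse the first line of `curl --version` and compare it against the 7.75 minimum. The function names below are hypothetical and the commit itself may detect the feature differently; this is a minimal sketch only.

```python
import re


def parse_curl_version(version_output):
    """Return (major, minor, patch) from `curl --version` output, or None."""
    match = re.match(r"curl (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def supports_parallel(version_output, minimum=(7, 75, 0)):
    """True if the reported curl version is at least `minimum`."""
    version = parse_curl_version(version_output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Sample first lines in the style curl prints; not captured from real hosts
    print(supports_parallel("curl 7.76.1 (x86_64-redhat-linux-gnu) libcurl/7.76.1"))
    print(supports_parallel("curl 7.61.1 (x86_64-redhat-linux-gnu) libcurl/7.61.1"))
```

In real use the input would come from running `curl --version`, for example via `subprocess.check_output(["curl", "--version"], universal_newlines=True)`.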
Michael Vogt
d20713d7af curl: add gen_curl_download_config() and use in download
Instead of passing the url and options on the commandline this
commit moves it into a config file. This is not useful just yet
but it will be once we download multiple urls per curl instance.
2024-07-04 11:53:40 +02:00
Sanne Raymaekers
2e5a9335c9 sources/curl: use --user-agent option to set the user-agent
Setting the user-agent using `--header` is broken in combination with
`--location`, `--proxy`, and an https endpoint which redirects. The
user-agent sent to the proxy changes after the client is redirected,
tripping up proxies.

For more information see https://issues.redhat.com/browse/RHEL-45364
2024-07-02 16:15:56 +02:00
Ondřej Budai
af0e849081 sources/curl: Use our own User-Agent
Currently, osbuild downloads are identified as coming from `curl`. This
is unfortunate because some RPM mirrors block requests from curl. Let's
"fix" that by introducing our own user-agent. While this can certainly
be seen as "circumventing" a policy, I think that this change is
actually helpful: Now, the mirror maintainers can actually distinguish
osbuild requests from regular curl calls. If they want to block osbuild,
they certainly can, we have no power there, but at least this allows
more fine-grained filtering. Also, our new user-agent contains our
domain name, so if there's a problem, they can contact us.
2024-04-30 03:10:44 +02:00
Michael Vogt
2586a748fd testutil: skip tests for missing ThreadingHTTPServer in py36
Only py3.7+ has ThreadingHTTPServer and SimpleHTTPRequestHandler
that can take a directory argument. We could reimplement this
on py36 (easy for threading, harder for missing directory) but
instead this commit just skips tests that try to use a
ThreadingHTTPServer.

Remove once we no longer support py3.6.
2024-04-16 15:16:49 +02:00
Michael Vogt
dbe7039674 sources(curl): tweak tests to use monkeypatch.setenv()
Using pytest's support for changing setenv() in tests makes things
a little bit more concise.
2024-04-10 16:13:12 -07:00
Michael Vogt
34ad069757 sources(curl): tweak tests to use monkeypatch.setenv()
Using pytest's support for changing setenv() in tests makes things
a little bit more concise.
2024-04-09 03:03:38 +02:00
Michael Vogt
b9b296a7e5 testutil: add AtomicCounter() as a threadsafe counter
The existing code in the reqs counting is not really thread safe,
this commit fixes that.
2024-04-09 03:02:45 +02:00
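A thread-safe counter like the `AtomicCounter()` named in the commit above can be built on a lock. The class name comes from the commit title, but the implementation below is an assumption, not the code from the commit; the method and property names are illustrative.

```python
import threading


class AtomicCounter:
    """Counter whose increments are serialized by a lock (illustrative)."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def inc(self):
        # The read-modify-write must happen under the lock to avoid lost updates
        with self._lock:
            self._value += 1

    @property
    def count(self):
        with self._lock:
            return self._value


if __name__ == "__main__":
    counter = AtomicCounter()

    def worker():
        for _ in range(1000):
            counter.inc()

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print(counter.count)
```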
Sanne Raymaekers
b90a5027dc sources(curl): set HTTP proxy through the environment 2024-04-08 11:56:05 +02:00
Michael Vogt
98f5904181 source: add curl test in preparation for #1573
When moving to parallel downloads in curl we will need more
comprehensive tests. This commit starts with building some
infrastructure for this.
2024-04-05 16:42:07 +02:00
Michael Vogt
79d788ac23 tests: use tmp_path fixture in test_curl_source.py
Tiny tweak to remove some boilerplate related to tmpfile handling.
The pytest `tmp_path` fixture gives us the tmpdir without having
to worry about cleanup etc (and in a slightly more concise way).
2024-04-03 15:06:07 +02:00
Michael Vogt
fb701d6db5 sources: simplify test_curl_source_amend_* tests a little bit
The `amend_secrets()` does not work with real files so there is
no need to mock cachedirs or create fake input files. This commit
just removes those.

It also changes the checksum to `"1"*64` to make it very clear
that the checksum has no significance in this test.
2024-04-03 15:06:07 +02:00
Michael Vogt
fe05b3084b sources: add regression test for issue #1693
There was a regression with the secrets adding of rhsm for the
curl source. This was my mistake (sorry!). Here is a regression
test that would have prevented this (if we had had it earlier).
2024-04-03 13:55:00 +02:00
Michael Vogt
1d4f2dc53b testutil: extract find_one_subclass_in_module() helper
A small refactor to avoid shipping this duplicated code (this
one is easy to extract/reuse).
2024-04-03 11:36:01 +02:00
Michael Vogt
79360b529a sources: add new sources_service fixture
Similar to the previous commit to include a `inputs_service` fixture
this does the same for `source.SourcesService` imports.

Note that we cannot easily share the helpers so we have to live with
a bit of very similar but duplicated code. To fix this we would have
to have a shared conftest.py that pytest can find. Which would mean
that we need to put the tests under a common dir that is reachable
via __init__.py files (which we currently do not have because stages,
inputs etc do not have a __init__.py so python does not consider
them modules).
2024-04-03 11:36:01 +02:00
Michael Vogt
5f31ccf9f2 test: add/use new testutil.make_fake_service_fd()
All inputs/sources tests need a fake service fd to instantiate
their services. Consolidate the creation in a single helper.
2024-04-03 11:36:01 +02:00
Andre Marianiello
7e0e30fd8f curl: fix RHSM url retrieval 2024-03-29 13:02:11 +01:00
Michael Vogt
352bf5cd52 curl: rename "transform" to "amend_secrets"
The curl source is the only source left that uses "transform". And
here the name is very generic but in fact we only do a single thing:
we add secrets for subscriptions or for mtls to the download.

So rename to make it clear what this is all about.
2024-03-19 14:21:57 +01:00
Michael Vogt
1fc7ead2f4 sources: transform() is only used in the curl sources, remove from ABC 2024-03-19 14:21:57 +01:00
Michael Vogt
fd0167f130 test: return container_id in make_container
The current `make_container()` helper is a bit silly (which is
entirely my fault). It requires a container tag as input but all
tests end up creating a random number for this input. So instead
just remove the input and return the container_id from the podman
build in the contextmanager and use that.
2024-03-18 20:36:19 +01:00
Michael Vogt
f214c69a98 osbuild: add workaround to integrate sources into progress reporting
This commit is somewhat poor, sorry for that. It mostly adds a
workaround so that the osbuild sources can emit some progress
reporting as well. Without that the user experience is rather poor
and there is a long delay before any sort of progress can be
reported (even before the normal stages run).

With it the user experience is still not good but slightly better,
i.e. the progress monitor will report that the sources have
started downloading and curl will generate some log output. No
real progress unfortunately (sources subprogress will jump from
zero to 100%).
2024-03-12 16:44:12 +01:00
Sanne Raymaekers
29159189f1 sources/curl: add org.osbuild.mtls secrets support
If `org.osbuild.mtls` is passed as a secret name, look for the mtls data
in the environment.
2024-03-11 11:09:37 +01:00
Michael Vogt
82f2414637 sources: tweak ContainersStorageSources.exists to return False
When an image does not exist just return `False` instead of
raising a RuntimeError. If anything else goes wrong (unknown
output or hash mismatch) keep the RuntimeError as this is an
unexpected exception.
2024-02-27 15:07:42 +01:00
Michael Vogt
5ab0b41456 sources: add test for non-existing id in containers-storage 2024-02-27 15:07:42 +01:00
Michael Vogt
0ac05ecb55 sources: tweak docstring for containers-storage 2024-02-27 15:07:42 +01:00
Tomáš Hozza
2b868fbe75 Sources/containers-storage: make the code Python 3.6 compliant
The source implementation used `subprocess.run()` argument
`capture_output`, which was added in Python 3.7. Since the minimum
supported Python version for osbuild on RHEL-8 is 3.6, the stage fails
with TypeError.

Example failure: https://artifacts.dev.testing-farm.io/c147b608-c40e-46ed-bf11-6b15ecf718dc/

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-02-25 09:27:23 +01:00
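For reference, a Python 3.6 compatible way to capture output without `capture_output` is to pass `stdout` and `stderr` pipes explicitly. The helper name `run_captured()` is hypothetical and the actual fix in the commit above may differ in detail.

```python
import subprocess
import sys


def run_captured(cmd):
    """Run `cmd` and capture stdout/stderr as text, without capture_output.

    `capture_output=True` (Python 3.7+) is shorthand for passing both pipes;
    `universal_newlines=True` is the 3.6-compatible spelling of `text=True`.
    """
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )


if __name__ == "__main__":
    result = run_captured([sys.executable, "-c", "print('hello')"])
    print(result.returncode, result.stdout.strip())
```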
Achilleas Koutsou
591593ea00 testutil: make_container context manager
Make make_container a context manager so we can reliably clean up
containers that were created in tests.
2024-02-21 17:55:37 +01:00
Michael Vogt
d6cd4b93ba test: add minimal test for ContainersStorageSource.from_args() 2024-02-21 17:55:37 +01:00
Michael Vogt
119172e8dd test: add sources_module fixture for sources unit tests
Similar to the `stage_module` fixture for stage tests this adds
a fixture to test sources modules of osbuild.

The code from `stage_module` and `sources_module` is similar and
could be combined but pytest makes it hard to do this without
having a shared root dir. Given that it's just four lines it
seems easier to just live with the tiny bit of code duplication.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
6572b1b8e7 util: remove storage_conf arg from get_host_storage()
Let the caller decide if a reload of the storage configuration is needed
and simplify the storage configuration reader.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
ac45c292e4 sources/containers-storage: call exists() when fetch()ing
Implement fetch_all() and fetch_one() as calls to exists() to make sure
we check that the containers are available every time they are needed.
2024-02-21 17:55:37 +01:00
Achilleas Koutsou
45510aeb64 sources: new source: containers-storage
This source checks for the existence of a local container in the host's
containers-storage. The source first reads the host's
`/etc/containers/storage.conf` file for the storage config and then
checks if the user has imported the desired container into the local
store.

Unlike the org.osbuild.containers source, the
org.osbuild.containers-storage source doesn't need any extra data other
than the image ID.  The ID is all that is used to retrieve the
container.  The location and other information regarding the storage are
read from the host configuration and are not encoded in the manifest.
There's no need to use the name to resolve it like we do in other
sources because containers in the local storage can be directly
referenced by their image id (config digest).

Other data such as the name of the container will only be relevant in
the stage that will use the container as input.

The source items are objects instead of simple strings of checksums
because we might, in the future, want to add specific options for each
source.

The content_type for this source is `containers-storage`, which defines
the location in the store where the source will bind mount the host's
container storage for stages to read.  We make this different from the
containers content because it will be treated differently enough to need
a separate input type.

Co-authored-by: Gianluca Zuccarelli <gzuccare@redhat.com>
Co-Authored-By: Michael Vogt <michael.vogt@gmail.com>
2024-02-21 17:55:37 +01:00