Commit graph

13 commits

Author SHA1 Message Date
Tom Gundersen
82f4d1cc96 sources/files: reduce the concurrent curl processes
We appear to be throttled by some mirrors if we are too eager. Back off.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-10 14:42:10 +02:00
Tom Gundersen
cf8216aea9 sources/files: don't spam stderr with error messages
Silence the errors, but include instead the error code in the returned
error message.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-07 22:08:34 +02:00
Tom Gundersen
d8e0469516 sources/files: don't retry curl on the same URL
The retry logic was meant to work around issues where a round-robin
redirect of mirrors gave us random mirrors of varying quality. This was
not used in practice, rather fixed mirrors were always used (either
hard-coded as basurl, or resolved from metalink).

The retry logic meant that when we did hit very slow mirrors we would
time-out and retry, potentially failing altogether, even though the data
was coming. Each retry would not help, as the mirror was anyway the
same. As a result our CI gave us avoidable false negative test results
some of the time.

The proper solution to this is to gain support for librepo and metalinks
to adopt the same retry logic that dnf uses.

For now, improve on the retry logic by retrying until a max total time,
rather than an increasing timeout on each try. Up the given timeouts to
be one minute to connect and five minutes to complete the download. This
avoids hanging forever if the mirror is truly broken, but still gives
more time to finish the download than each iteration in the old code
did.

There are no new tests for this, as before this change the tests mostly
passed, and after it they will hopefully still mostly pass (but more
often).

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-07 22:08:34 +02:00
Christian Kellner
42ef470740 sources/files: add documentation and schema
Add a brief documentation text and its JSON schema so that osbuild
can verify options org.osbuild.files source entries.
2020-06-02 09:50:14 +02:00
David Rheinsberg
707ff8c988 sources: keep try-except block small
We used to have a try-except block to catch URL requests that are not in
`urls`. This block has since then grown way bigger than it should be. We
may now accidentally catch KeyError exceptions from lots of other
places.

This commit extracts the accessor of `urls[checksum]` and saves the
result in a local variable and makes the remainder use that variable.
2020-05-28 11:06:05 +02:00
David Rheinsberg
c337af6795 sources: fix indentation
Fix indentation to make pylint happy.
2020-05-28 11:06:05 +02:00
David Rheinsberg
84dcadc7d2 sources: convert f-string to normal string
Convert an f-string to a normal string, since we do not use any format
specifier in it.
2020-05-28 11:06:05 +02:00
Jacob Kozol
9cbedc0496 sources: fix break when secrets is None
When the urls' secrets field is not set, an error is thrown when trying
to get the name of the secrets. The secrets now have a default value of
{} when they are checked for the name.
2020-05-24 11:08:05 +02:00
Jacob Kozol
372b1174f2 sources: add rhsm secret support to files
When osbuild is given a manifest, the sources' urls can contain fields
for both a url path and a secret for that url. If the secret is
org.osbuild.rhsm the system's rhsm certificates are retrieved. These
certs are included when the files are curled.
2020-05-20 18:52:35 +02:00
Jacob Kozol
2309b54eb3 sources: reduce whitespace in files cp command 2020-05-20 18:52:35 +02:00
Tom Gundersen
c8465ce06f sources/files: time-out curl
Add a 10s connection timeout for each file transfer. Also add an
increasing max timeout for a given file transfer (30s to 180s).

Also increase the retries to 10 and the concurrent threads to 15.

Hopefully this should make things a bit more stable in the face of
bad mirrors. We were encountering mirrors that would hang either
on connect or download at such slow speeds that they might as well
have stalled (~1kB in 45s).

Follow-up patches will provide a more long-term solution, by
allowing the same mirror selection as dnf currently uses.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-15 17:07:01 +01:00
Tom Gundersen
be0ff68411 fixup: files 2020-02-06 19:01:12 +01:00
Tom Gundersen
7817ae5e8b sources: add org.osbuild.files source
This source adds support for downloaded files. The files are
indexed by their content hash, and the only option is their URL.

The main usecase for this will be downloading rpms. Allowing depsolving
to be done outside of osbuild, network access to be restricted and
downloaded rpms to be reused between runs.

Each source is now passed two additional arguments, a cache directory
and an output directory. Both are in the source's namespace, and
the source is responsible for managing them. Each directory may
contain contents from previous runs, but neither is ever guaranteed
to do so.

Downloaded contents may be saved to the cache and resued between
runs, and the requested content should be written to the output dir.
If secrets are used, the source must only ever write contents to
the output that corresponds to the available secrets (rather than
contents from the cache from previous runs).

Each stage is passed an additional argument, a sources directory.
The directory is read-only, and contains a subdirectory named after
each used source, which will contain the requseted contents when
the `Get()` call returns (if the source uses this functionality).

Based on a patch by Lars Karlitski.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-06 19:01:12 +01:00