Commit graph

45 commits

Author SHA1 Message Date
Christian Kellner
c902a7a754 sources: port to host services
Port sources to also use the host services infrastructure that is
used by inputs, devices and mounts. Sources are a bit different
from the other services in that they don't run for the duration of
the stage but are run before anything is built. By using the same
infrastructure we re-use the process management and inter-process
communication. Additionally, this will forward all messages from
sources to the existing monitoring framework.
Adapt all existing sources and tests.
2021-09-22 00:00:20 +02:00
Alexander Larsson
072b75d78e org.osbuild.curl: Don't load secrets if not needed
This moves the check for already downloaded files earlier so
that if all files are already downloaded we don't need to
load the secrets.

This is faster, but also it allows a pre-seeded object store
to run the manifest on a system (like a VM) that isn't subscribed.
2021-09-22 00:00:20 +02:00
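
The reordering described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual org.osbuild.curl code: the function names and the `load_secrets` and `download` callbacks are hypothetical, and cached files are assumed to be named after their checksum.

```python
import os
import tempfile


def pending_items(items, cache_dir):
    """Return only the items whose checksum is not yet present in cache_dir."""
    return {
        checksum: desc
        for checksum, desc in items.items()
        if not os.path.isfile(os.path.join(cache_dir, checksum))
    }


def fetch_all(items, cache_dir, load_secrets, download):
    """Download missing items; load secrets only if something is missing."""
    todo = pending_items(items, cache_dir)
    if not todo:
        # Everything is already cached, so the secrets are never touched.
        return 0
    secrets = load_secrets()
    for checksum, desc in todo.items():
        download(checksum, desc, secrets, cache_dir)
    return len(todo)
```

With a fully pre-seeded cache, `load_secrets` is never called, which is what lets an unsubscribed system run the manifest.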
Christian Kellner
7576191c2d sources/inline: fix schema
The top-level node "items" was not defined and the required property
"encoding" was wrongly called "method".
2021-06-30 12:06:30 +02:00
Martin Sehnoutka
ee3760e1ba sources/curl: Implement new way of getting RHSM secrets
The previous version covered too few use cases, more specifically a
single subscription. That is of course not the case for many hosts, so
osbuild needs to understand subscriptions.

When running org.osbuild.curl source, read the
/etc/yum.repos.d/redhat.repo file and load the system subscriptions from
there. While processing each url, guess which subscription is tied to
the url and use the CA certificate, client certificate, and client key
associated with this subscription. It must be done this way because the
depsolving and fetching of RPMs may be performed on different hosts and
the subscription credentials are different in such case.

A more detailed description of why this approach was chosen is available
in osbuild-composer git: https://github.com/osbuild/osbuild-composer/pull/1405
2021-06-04 18:23:05 +01:00
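
One way the "guess which subscription is tied to the url" step could look is sketched below. It is only an assumption about the matching rule: the repo file is parsed with `configparser`, each `baseurl` is cut at its first `$` variable, and the longest matching prefix wins. The helper names and the example URLs are hypothetical.

```python
import configparser


def load_subscriptions(repo_text):
    """Collect TLS credentials per repository from redhat.repo-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    subscriptions = []
    for section in parser.sections():
        repo = parser[section]
        if "baseurl" not in repo:
            continue
        subscriptions.append({
            # Assumption: everything before the first $variable is a stable prefix.
            "prefix": repo["baseurl"].split("$", 1)[0],
            "ssl_ca_cert": repo.get("sslcacert"),
            "ssl_client_cert": repo.get("sslclientcert"),
            "ssl_client_key": repo.get("sslclientkey"),
        })
    return subscriptions


def secrets_for_url(subscriptions, url):
    """Pick the subscription whose baseurl prefix matches the url best."""
    matches = [s for s in subscriptions if url.startswith(s["prefix"])]
    if not matches:
        raise RuntimeError(f"no subscription matches {url}")
    return max(matches, key=lambda s: len(s["prefix"]))
```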
Christian Kellner
2025184325 sources: introduce org.osbuild.inline
Add a new source for transporting binary data within the source
entry itself. The data is ASCII-encoded in the `data` property
of the inline source item, with the encoding that is used being
specified in the `encoding` property.
2021-05-12 14:26:16 +02:00
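
As an illustration of what such an inline item might look like, here is a small sketch. The commit only names the `data` and `encoding` properties; the use of base64 as the encoding, the `sha256:` style identifier, and both helper functions are assumptions made for the example.

```python
import base64
import hashlib


def make_inline_item(payload: bytes) -> tuple:
    """Build a hypothetical (id, item) pair for an inline source."""
    digest = hashlib.sha256(payload).hexdigest()
    item = {
        "encoding": "base64",  # assumption: base64 as the ASCII encoding
        "data": base64.b64encode(payload).decode("ascii"),
    }
    return "sha256:" + digest, item


def decode_inline_item(item: dict) -> bytes:
    """Recover the binary payload from an inline item."""
    if item["encoding"] != "base64":
        raise ValueError(f"unsupported encoding: {item['encoding']}")
    return base64.b64decode(item["data"])
```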
Christian Kellner
3ebfc6f657 sources/curl: use util.checksum.verify_file
Now that there is a common utility function to verify the checksum
of a file, use that.
Also fix the json schema entry for the property to have the correct
minimum and maximum digest length, given the supported algorithms,
which are 32 (md5) and 128 (sha512) characters.
2021-05-12 14:26:16 +02:00
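
The two bounds mentioned above can be checked with `hashlib`. The list of algorithms below is an assumption for illustration; the commit itself only names md5 and sha512 as the extremes.

```python
import hashlib

# Assumed set of supported algorithms; only md5 and sha512 are named in the commit.
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512"]

# Length of the hexadecimal digest for each algorithm.
lengths = {name: len(hashlib.new(name, b"").hexdigest()) for name in ALGORITHMS}

shortest = min(lengths.values())  # md5
longest = max(lengths.values())   # sha512
```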
Christian Kellner
a05a8aaed6 sources/ostree: remove export functionality
Since the `sources.SourcesServer` has been removed, nothing is
using the export functionality anymore. Inputs are now used to
make content in the store available to stages. Remove all the
export logic from org.osbuild.ostree.
2021-04-29 12:58:01 +02:00
Christian Kellner
518940cfe0 sources/curls: refactor downloading code
Now that the `export` functionality is gone, the download code
can be simplified, since we are not downloading a subset of the
urls, but all of them.
2021-04-29 12:58:01 +02:00
Christian Kellner
5c19360cbe sources/curl: remove export functionality
Since the `sources.SourcesServer` has been removed, nothing is
using the export functionality anymore. Inputs are now used to
make content in the store available to stages. Remove all the
export logic from org.osbuild.curl.
2021-04-29 12:58:01 +02:00
Christian Kellner
2dcc1d9cee sources/ostree: capture ostree output
Instead of using stderr for the ostree subprocess command,
capture its output so that in the case of an error we can
properly return the error output. With the old behavior
all the `ostree` command output would land in the journal
of the worker.
2021-03-12 18:49:41 +01:00
Christian Kellner
b609bb81dd source/ostree: fix download only case
Sources, for compatibility reasons, have two modes: download only
and download and export. The difference is the arguments that
are passed to the source: For download only, the `output` param
is empty. In this case also `checksums` *can* be empty and if so
it means everything, i.e. the commits, should be fetched. The
latter was not properly handled so far. Adjust the logic, which
now closely mimics that of the `org.osbuild.curl` source to fix
this case.
Also catch exceptions invoking `ostree` and properly return them
via the json error messaging.
2021-03-12 18:49:41 +01:00
Christian Kellner
81c8374d3e sources: rename org.osbuild.{files -> curl}
The `org.osbuild.files` source provides files, but might in the
future not be the only one that does. Therefore rename it to
match the internal tool that is being used to fetch the files.
This is done for most other osbuild modules that target tools.

The format v1 loader is adapted to make this change transparent
for users of the v1 format, so we are backwards compatible.

Change the MPP depsolve preprocessor so that for format v2 based
manifests the `org.osbuild.curl` source is used. Also rename the
corresponding source test. Adapt the format v2 mod test to use
the curl source.
2021-02-12 19:27:08 +01:00
Christian Kellner
fa9c288988 sources: source itself controls cache sub-dir
Instead of supplying the full cache dir, i.e. the directory in
the store where the source will place the fetched resources, to
the source, only supply the root folder of the cache and let
the source itself create the desired sub-directory. This allows
the source to determine what type of resource it provides. This
makes the final directory independent of the name of the source:
an `org.osbuild.curl` source can place file-like resources in the
`org.osbuild.files` sub-directory. Then the `org.osbuild.files`
input can be used to get those from the cache directory.
2021-02-12 19:27:08 +01:00
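
A minimal sketch of the idea, assuming the source is handed only the cache root and derives the sub-directory from the type of content it provides. The function name is hypothetical.

```python
import os
import tempfile


def item_cache_dir(cache_root: str, content_type: str) -> str:
    """Create (if needed) and return the sub-directory for a content type."""
    path = os.path.join(cache_root, content_type)
    os.makedirs(path, exist_ok=True)
    return path


# A curl-based source stores file-like content under "org.osbuild.files",
# so the directory name follows the content type, not the source name.
with tempfile.TemporaryDirectory() as root:
    files_dir = item_cache_dir(root, "org.osbuild.files")
    created = os.path.isdir(files_dir)
    name = os.path.basename(files_dir)
```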
Christian Kellner
1d5d1fd44a sources/ostree: support format version 2
In format version 2, the source-specific key for the sources,
here "urls", is replaced by a generic `items` key, common to
all sources. Express that in the schema.
2021-02-12 15:55:43 +01:00
Christian Kellner
429f833434 sources/files: add format version 2 support
In format version 2, the source-specific key for the sources,
here "urls", is replaced by a generic `items` key, common to
all sources. Express that in the schema.
2021-02-12 15:55:43 +01:00
Christian Kellner
931eac23c3 sources: introduce source items
All sources fetch various types of `items`, the specific nature
of which is dependent on the source type, but they are all
identifiable by an opaque identifier. In order for osbuild to
check that all the inputs that a stage needs are indeed
contained in the manifest description, osbuild must learn what
ids are fetched by what source. This is done by standardizing
the common "items" part, i.e. the "id" -> "options for that id"
mapping that is common to all sources.
For the version 1 of the format, extract the item information
for files and ostree from the respective options.
Adapt the sources (files, ostree) so that they use the new items
information, but also fall back to the old style; the latter is
needed since the sources tests still use the SourceServer.
2021-02-10 15:44:24 +01:00
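
The kind of check that a common "items" mapping makes possible can be sketched as below. The dictionary layout and the function name are hypothetical and only mirror the "id" -> "options for that id" description above.

```python
def missing_items(sources: dict, needed_ids) -> list:
    """Return the ids a stage needs that no source declares in its items."""
    available = set()
    for source in sources.values():
        # Every source exposes the same "items" mapping: id -> options.
        available.update(source.get("items", {}))
    return sorted(set(needed_ids) - available)


# Hypothetical manifest fragment with two sources.
sources = {
    "org.osbuild.files": {"items": {"sha256:aaa": {"url": "https://example.com/a"}}},
    "org.osbuild.ostree": {"items": {"sha256:bbb": {"remote": {"url": "https://example.com/repo"}}}},
}
unresolved = missing_items(sources, ["sha256:aaa", "sha256:ccc"])
```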
Christian Kellner
ee9df25a02 sources/ostree: ability to only pull commits
Split the internal logic into two parts: 1) fetching the commit
into the internal cache repo and then 2) exporting that commit,
i.e. a local pull from the cache repo to the output directory.
If no `output` directory was specified, only fetch the commit,
do not attempt to export it.
NB: this commit changes at what point the gpg verification is
done. Previously the check was on export. Now, we are checking
the signature on import only. The export step will be replaced
by an ostree `Input` that will have the ability to verify
commits a second time.
2021-02-06 12:04:30 +01:00
Christian Kellner
127be09ba8 sources/files: ability to only download files
Split the internal logic of the stage in two parts: 1) downloading
files to the internal cache and 2) exporting the downloaded files
from said cache to the output directory. Additionally, if no such
`output` directory was specified, i.e. it is empty or `None`, only
download files but do not attempt to export them.
2021-02-06 12:04:30 +01:00
Christian Kellner
ee1d860755 sources: drop dnf stage
This source has been declared obsolete some time ago and is not
supported anymore. We won't support it in the upcoming new manifest
format, therefore drop it now.
2021-01-22 17:17:54 +01:00
Christian Kellner
cbcb335b3e osbuild: fix spelling mistakes found by codespell
Run codespell on the source ('codespell -f -L msdos -S coverity
-S rpmbuild -S samples') and fix all uncovered mistakes.
2020-10-06 14:41:00 +02:00
Ondřej Budai
7b0db90c76 sources/files: do not pass floats to --max-time
curl uses strtod from the C standard library to convert the --max-time's value
from string to double. However, this is what strtod expects:

nonempty sequence of decimal digits optionally containing decimal-point
character (as determined by the current C locale)

Yeah, unfortunately, the decimal-point character is determined by the current
C locale. For example, the Czech and German locales use a comma as the
decimal-point character.

For reasons I don't fully understand, Python thinks it's running on en_US
locale, even though LC_NUMERIC is set to cs_CZ, so it uses a full stop as the
decimal-point character when converting float to string. However, as written
before, curl fails to parse this because it expects comma.

The fix I chose is simple: Use math.ceil, so only an integer can be passed to
curl. Why ceil? Because --max-time == 0 sounds fishy. math.ceil should return
an integer (and it does in Python 3.8) but the documentation is not 100% clear
on this topic, so let's be paranoid and also convert it to int after the
ceiling.
2020-06-25 21:25:17 +02:00
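
The fix described above amounts to something like the following sketch. The helper name is hypothetical; the point is only that the value handed to curl contains no decimal-point character at all.

```python
import math


def curl_max_time(seconds: float) -> str:
    """Format a timeout for curl's --max-time as a whole number of seconds."""
    # Round up so that a small positive timeout never becomes 0, and
    # convert to int explicitly so no decimal point can appear.
    return str(int(math.ceil(seconds)))


arguments = ["curl", "--max-time", curl_max_time(29.5)]
```

Since the result is a plain integer string, it parses the same way regardless of which decimal-point character the C locale expects.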
Tom Gundersen
82f4d1cc96 sources/files: reduce the concurrent curl processes
We appear to be throttled by some mirrors if we are too eager. Back off.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-10 14:42:10 +02:00
Tom Gundersen
cf8216aea9 sources/files: don't spam stderr with error messages
Silence the errors, but include instead the error code in the returned
error message.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-07 22:08:34 +02:00
Tom Gundersen
d8e0469516 sources/files: don't retry curl on the same URL
The retry logic was meant to work around issues where a round-robin
redirect of mirrors gave us random mirrors of varying quality. This was
not used in practice, rather fixed mirrors were always used (either
hard-coded as baseurl, or resolved from metalink).

The retry logic meant that when we did hit very slow mirrors we would
time-out and retry, potentially failing altogether, even though the data
was coming. Each retry would not help, as the mirror was anyway the
same. As a result our CI gave us avoidable false negative test results
some of the time.

The proper solution to this is to gain support for librepo and metalinks
to adopt the same retry logic that dnf uses.

For now, improve on the retry logic by retrying until a max total time,
rather than an increasing timeout on each try. Up the given timeouts to
be one minute to connect and five minutes to complete the download. This
avoids hanging forever if the mirror is truly broken, but still gives
more time to finish the download than each iteration in the old code
did.

There are no new tests for this, as before this change the tests mostly
passed, and after it they will hopefully still mostly pass (but more
often).

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-06-07 22:08:34 +02:00
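
Retrying "until a max total time" rather than with a growing per-try timeout can be sketched generically as below. This is not the code from the commit: the function, its parameters, and the injectable `clock` and `sleep` are assumptions chosen so the example is easy to test.

```python
import time


def fetch_with_deadline(attempt, max_total=300.0, pause=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Call attempt(remaining_seconds) until it succeeds or time runs out."""
    deadline = clock() + max_total
    last_error = None
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"giving up after {max_total}s: {last_error}")
        try:
            # Each try gets whatever is left of the total budget.
            return attempt(remaining)
        except OSError as error:
            last_error = error
            # Never sleep past the overall deadline.
            sleep(min(pause, max(0.0, deadline - clock())))
```

The total budget bounds the worst case, so a truly broken mirror cannot hang the download forever, while a slow but working mirror still gets the full remaining time on each try.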
Christian Kellner
f967bf7164 sources/dnf: add documentation and schema
Since the dnf stage is not used anymore only a placeholder schema
and documentation is added.
2020-06-02 09:50:14 +02:00
Christian Kellner
42ef470740 sources/files: add documentation and schema
Add a brief documentation text and its JSON schema so that osbuild
can verify the options of org.osbuild.files source entries.
2020-06-02 09:50:14 +02:00
Christian Kellner
66d1dc1206 sources/ostree: add documentation and schema
Add a brief documentation text and its JSON schema so that osbuild
can verify the options of org.osbuild.ostree source entries.
2020-06-02 09:50:14 +02:00
David Rheinsberg
faaa6c1a6b modules: fix format-strings without interpolation
Fix all occurrences of format-strings without any interpolation. pylint
warns about those (and for some reason did not do so for our modules).
A followup will fix the pylint tests, so make sure all the warnings are
resolved.
2020-05-29 11:07:44 +02:00
David Rheinsberg
707ff8c988 sources: keep try-except block small
We used to have a try-except block to catch URL requests that are not in
`urls`. This block has since then grown way bigger than it should be. We
may now accidentally catch KeyError exceptions from lots of other
places.

This commit extracts the accessor of `urls[checksum]` and saves the
result in a local variable and makes the remainder use that variable.
2020-05-28 11:06:05 +02:00
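
The pattern this commit describes looks roughly like the sketch below. The `fetch` function, its return values, and the `download` callback are hypothetical; only the `urls[checksum]` lookup comes from the commit message.

```python
def fetch(urls: dict, checksum: str, download):
    """Look up the URL for a checksum, then download it."""
    try:
        # Only the lookup is guarded; nothing else can raise here.
        url = urls[checksum]
    except KeyError:
        return {"error": f"unknown checksum: {checksum}"}

    # A KeyError raised inside download() is a real bug and now
    # propagates instead of being reported as an unknown checksum.
    return {"path": download(url)}
```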
David Rheinsberg
c337af6795 sources: fix indentation
Fix indentation to make pylint happy.
2020-05-28 11:06:05 +02:00
David Rheinsberg
84dcadc7d2 sources: convert f-string to normal string
Convert an f-string to a normal string, since we do not use any format
specifier in it.
2020-05-28 11:06:05 +02:00
Jacob Kozol
9cbedc0496 sources: fix break when secrets is None
When the urls' secrets field is not set, an error is thrown when trying
to get the name of the secrets. The secrets now have a default value of
{} when they are checked for the name.
2020-05-24 11:08:05 +02:00
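
A minimal sketch of the defaulting described above, assuming each url entry is a dictionary with an optional `secrets` field. The helper name is hypothetical.

```python
def secret_name(url_desc: dict):
    """Return the name of the secret for a url entry, or None if unset."""
    # Fall back to an empty dict so a missing "secrets" field
    # does not break the lookup of "name".
    return url_desc.get("secrets", {}).get("name")
```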
Jacob Kozol
372b1174f2 sources: add rhsm secret support to files
When osbuild is given a manifest, the sources' urls can contain fields
for both a url path and a secret for that url. If the secret is
org.osbuild.rhsm the system's rhsm certificates are retrieved. These
certs are included when the files are curled.
2020-05-20 18:52:35 +02:00
Jacob Kozol
2309b54eb3 sources: reduce whitespace in files cp command
2020-05-20 18:52:35 +02:00
David Rheinsberg
4d2f15fb46 modules: drop osbuild symlink
Drop the `osbuild -> ../osbuild` symlink from all module directories.
We now properly initialize the PYTHONPATH to provide the imported
osbuild module from the host environment. Therefore, these links are no
longer needed.

The sources run from the host environment, so they should just pick them
up from the environment the same way osbuild itself does.
2020-05-04 12:32:25 +02:00
David Rheinsberg
2e039a778c sources/ostree: enable locked-access explicitly
Make sure access to the shared ostree metadata is locked properly. This
is the default since 2018.5, but let's be explicit here. This also makes
sure that the option exists and the local version supports locked and
protected access.

It is unclear whether `ostree init` honors that as well. It really
should, and if it doesn't we can always report it upstream.
2020-04-27 16:53:43 +02:00
David Rheinsberg
58d368df0d osbuild: unify libdir handling
We want to run stages and other scripts inside of the nspawn containers
we use to build pipelines. Since our pipelines are meant to be
self-contained, this should imply that the build-root must have osbuild
installed. However, this has not been the case so far for several
reasons including:

  1. OSBuild is not packaged for all the build-roots we want to support
     and thus we have the chicken-and-egg problem.

  2. During testing and development, we want to support using a local
     `libdir`.

  3. We already provide an API to the container. Importing scripts from
     the outside just makes this API bigger, but does not change the
     fact that build-roots are not self-contained. Same is true for the
     running kernel, and probably much more..

With all this in mind, our strategy probably still is to eventually
package osbuild for the build-root. This would significantly reduce our
API exposure, points-of-failure, and host-reliance. However, this switch
might still be some weeks out.

With this in mind, though, we can expect the ideal setup to have a full
osbuild available in the build-root. Hence, any script we import so far
should be able to access the entire `libdir`. This commit unifies the
libdir handling by installing the symlinks into `libdir` and providing
a single bind-mount of the module-path into `libdir`.

We can always decide to scratch that in the future when we scratch the
libdir-import from the host-root. Until then, I believe this commit
nicely unifies the way we import the module both in a local checkout as
well as in the container.
2020-04-21 13:44:43 +02:00
Christian Kellner
b8b6619d39 sources/ostree: verify signature on local pull
Instead of verifying the gpg signature when pulling from the actual
remote source into the local cache, verify the commit when it is
being pulled from the local cache into the output directory. This
ensures that the signatures are checked against the provided keys
even when the commit was already in the cache and at that time
the key might have been different.
NB: ostree expects the signature to be present on the remote at
the *target* repository, i.e. in our case the output repository.
The keys are therefore attached to a temporary remote that is
created at the output repository with the same name/id that is
used for the actual remote.
2020-04-15 15:39:45 +02:00
Christian Kellner
e1b2803ae0 sources/ostree: support gpg verification
Add a new `gpgkeys` option that, if set, must contain a list of
public keys. These keys will then be used by ostree to verify
signed commits when pulling from the remote. If the `gpgkeys`
option is missing, no verification will be attempted.
2020-04-15 15:39:45 +02:00
Christian Kellner
d5cce89fd8 sources: add org.ostree.ostree source
This source can be used to fetch ostree commits. The commits are
accessed via their commit id. The only option currently is `url`,
given for each commit, that will be used as the location of the
remote. A cache repository, that will be created if necessary,
acts as an intermediary, so remotes will be added with `name` as
the identifier to it and commits are pulled into that. In the
output directory another repository will be created as 'repo' and
the requested commit pulled into that from the cache repository via
a local pull.
2020-04-15 15:39:45 +02:00
Tom Gundersen
c8465ce06f sources/files: time-out curl
Add a 10s connection timeout for each file transfer. Also add an
increasing max timeout for a given file transfer (30s to 180s).

Also increase the retries to 10 and the concurrent threads to 15.

Hopefully this should make things a bit more stable in the face of
bad mirrors. We were encountering mirrors that would hang either
on connect or download at such slow speeds that they might as well
have stalled (~1kB in 45s).

Follow-up patches will provide a more long-term solution, by
allowing the same mirror selection as dnf currently uses.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-15 17:07:01 +01:00
Tom Gundersen
be0ff68411 fixup: files
2020-02-06 19:01:12 +01:00
Tom Gundersen
7817ae5e8b sources: add org.osbuild.files source
This source adds support for downloaded files. The files are
indexed by their content hash, and the only option is their URL.

The main use case for this will be downloading rpms, allowing depsolving
to be done outside of osbuild, network access to be restricted, and
downloaded rpms to be reused between runs.

Each source is now passed two additional arguments, a cache directory
and an output directory. Both are in the source's namespace, and
the source is responsible for managing them. Each directory may
contain contents from previous runs, but neither is ever guaranteed
to do so.

Downloaded contents may be saved to the cache and reused between
runs, and the requested content should be written to the output dir.
If secrets are used, the source must only ever write contents to
the output that corresponds to the available secrets (rather than
contents from the cache from previous runs).

Each stage is passed an additional argument, a sources directory.
The directory is read-only, and contains a subdirectory named after
each used source, which will contain the requested contents when
the `Get()` call returns (if the source uses this functionality).

Based on a patch by Lars Karlitski.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-06 19:01:12 +01:00
Lars Karlitski
59ffebaff0 stages,sources/dnf: allow passing certificate data
Add support for dnf's sslcacert, sslclientcert, and sslclientkey
options. The latter two are passed as secrets (clientcert as well
because it might be a pem file that also includes the private key).

Sources run on the host, so their options may contain paths to the host
file system. Make use of that by accepting only paths in those options,
because it allows using tools to deal with certificate files.

Also make sure that the dnf source only returns options it knows about.
2020-01-09 23:55:43 +01:00
Lars Karlitski
510e2b1e94 osbuild: introduce sources
Pipelines encode which source content they need in the form of
repository metadata checksums (or rpm checksums). In addition, they
encode where they fetch that source content from in the form of URLs.
This is overly specific and doesn't have to be in the pipeline's hash:
the checksum is enough to specify an image.

In practice, this precluded using alternative ways of getting at source
packages, such as local mirrors, which could speed up development.

Introduce a new osbuild API: sources. With it, a stage can query for a
way to fetch source content based on checksums.

The first such source is `org.osbuild.dnf`, which returns repository
configuration for a metadata checksum. Note that the dnf stage continues
to verify that the content it received matches the checksum it expects.

Sources are implemented as programs, living in a `sources` directory.
They are run on the host (i.e., uncontained) right now. Each source gets
passed options, which are taken from a new command line argument to
osbuild, and an array of checksums for which to return content.

This API is only available to stages right now.
2019-12-23 01:12:38 +01:00