Add a safeguard to ensure secure instances without valid
parent instances are terminated, as they are unnecessary to retain.
Typically, the parent does not exist if the secure instance is
older than 2 hours, but this check provides additional validation.
HMS-3632
This is a bit of an RFC commit, I noticed that when we discussed
a crash from the worker we looked at individual message from
syslog/journald for the stacktrace deatils. I was wondering if
having a more direct crash report would be more useful? We can
of course also add more logrus features to flag those with tags
like "crash" or something (I did not do that in this PR, I don't
know much about the operational side, sorry).
osbuild/images added an error type that's returned when the reporegistry
loader doesn't find any repository configurations to load [1]. This
lets callers decide whether to stop or continue execution based on
whether repository configurations are required.
A new top-level configuration option is added for osbuild-composer that
makes it possible to start the service without having static rpm
repositories configured. This is useful in certain (SaaS) modes where
build requests specify their own repository configurations.
Now and then there are leftover secure instances, probably when worker
instances get terminated during builds, this is possible in ASGs. 2
hours as a cutoff should be enough, since the build times out after 60
minutes, and fetching the output archive after 30 minutes, so that
leaves 30 minutes for booting and connection.
The `reporegistry.New()` has been enhanced to return an error, in case
there were no repositories loaded. This was to fix the situation in many
unit tests, which were previously not loading any repositories and
silently not running any tests.
This however broke our SaaS deployment, where we actually do not
configure any repositories on the filesystem. As a result,
osbuild-composer started to fail on it.
Workaround this situation in osbuild-composer by reverting to the old
behavior by loading the repo configs separately and then using the
loaded repos (which could be empty map) to initialize the RepoRegistry.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Extend the Koji target handling in the osbuild job, to also upload SBOM
documents attached to the related depsolve job result.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
If the Koji target result contains information about any uploaded SBOM
documents, import them to Koji as part of the finalize task.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Adjust all paces that call `Solver.Depsolve()`, to cope with the changes
that enabled SBOM support.
Fix loading of testing repositories in the CloudAPI unit tests.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
osbuild/images#751 wrapped the errors in the images/dnfjson package to
provide more details, the depsolve job should take this into account to
map the dnfjson error to the correct worker client error.
This caused user input errors errors to be misclassified as internal
errors, triggering depsolve job failure alerts.
Previously, the worker was determining the GCE image guest OS Features
on its own, based on the OS name. This caused problems, in case the
osbuild-composer was of a newer version than the worker.
Example:
osbuild-composer contained support for c10s GCE image type and its
implementation also contained the proper guest OS Features list for it.
However, when the worker got the osbuild job, it built it and tried to
fetch the guest OS Features for the distro. Since its implementation was
too old, it didn't contain the code that added the actual support for
c10s GCE images and got no guest OS features list (which is the default
for unsupported distros). The image was successfully uploaded and
shared, but it does not boot in GCP, because it does not know that it
should use UEFI to boot it.
This behavior could be considered a bug. The worker should be dumb. It
should not be making decisions about the image features, but instead it
should take them from the upload target options. And composer should be
the authoritative source of truth for this. Because otherwise, we
basically have two components that need to be updated in sync to add
support for GCE images on a new distro.
Move the GCE image guest OS features to the GCP upload target options.
The worker will just take what is specified there and use it when
importing the image to GCP. As a compatibility layer for the case when
the composer would be older than the worker (unlikely, but still),
worker will try to determine the image guest OS features in case the
list in the upload target options is empty.
Extend the GCP functional tests to check that the imported image has at
least some guest OS features set.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
The workerClientErrorFrom() was returning an `*clienterrors.Error` and
an `error` (if something with the conversation goes wrong.
But the calling code was expecting that even if an `error` is returned
the `*clienterrors.Error` is still valid. The caller would then just
log the error. As returning a valid `value` even when there is an
`error` is an unexpected pattern this commit changes the code to
always return a `*clienterrors.Error` and log any issue via the
logger.
The usual convention to create new object is to prefix `New*` so
this commit renames the `WorkerClientError`. Initially I thought
it would be `NewWorkerClientError()` but looking at the package
prefix it seems unneeded, i.e. `clienterrors.New()` already
provides enough context it seems and it's the only error we
construct.
We could consider renaming it to `clienterror` (singular) too
but that could be a followup.
I would also like to make `clienterror.Error` implement the
`error` interface but that should be a followup to make this
(mechanical) rename trivial to review.
Composer can load configuration values defined as map from ENV.
Previously, when loading the configuration from ENV, the whole map would
get overridden, not just values defined in the ENV. This is however not
intended and not consistent with how loading configuration from file
works.
Adjust the configuration loading from ENV and adjust the unit test
accordingly.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>