Commit graph

1002 commits

Author SHA1 Message Date
Florian Schüller
65b7ee65b2 osbuild-service-maintenance: implement removal of launch templates
Launch templates of instances that are terminated should be removed.
HMS-3632
2024-12-10 11:43:51 +01:00
Florian Schüller
a96ea533c0 osbuild-service-maintenance: implement removal of security groups
Security groups of instances that are terminated should be removed.
HMS-3632
2024-12-10 11:43:51 +01:00
Florian Schüller
7ebe266d3c osbuild-service-maintenance: implement removal on invalid parent
Add a safeguard to ensure secure instances without valid
parent instances are terminated, as they are unnecessary to retain.
Typically, the parent does not exist if the secure instance is
older than 2 hours, but this check provides additional validation.
HMS-3632
2024-12-10 11:43:51 +01:00
Florian Schüller
a7119a4d0f osbuild-service-maintenance/aws: support aws credential file
Support running the maintenance locally with a valid
`~/aws/credentials` file. HMS-3632
2024-12-10 11:43:51 +01:00
Tomáš Hozza
580366d1f3 osbuild-dnf-json-tests: don't set OSTree options for non-OSTree images
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-12-09 09:46:54 +01:00
Lukas Zapletal
e7a7cda3bc cmd: extra env logging for osbuild worker 2024-11-29 11:57:11 +01:00
Michael Vogt
bc7b8355bf worker: report cashes directly to logrus
This is a bit of an RFC commit, I noticed that when we discussed
a crash from the worker we looked at individual message from
syslog/journald for the stacktrace deatils. I was wondering if
having a more direct crash report would be more useful? We can
of course also add more logrus features to flag those with tags
like "crash" or something (I did not do that in this PR, I don't
know much about the operational side, sorry).
2024-11-25 12:02:05 +01:00
Sanne Raymaekers
f672610509 cmd/osbuild-worker: specify hyper v gen for azure images 2024-11-21 11:22:20 +01:00
Lukas Zapletal
03e74e77b2 worker: log proxy setting 2024-11-18 19:33:19 +01:00
Lukas Zapletal
86f903339a worker: parse ostree MTLS proxy early 2024-11-15 10:16:26 +01:00
Lukas Zapletal
2a5d25d9c0 worker: check MTLS config for ostree 2024-11-12 12:12:52 +01:00
Sanne Raymaekers
056b3c5ea6 jobqueue: return if a job was requeued or not 2024-11-07 17:18:48 +01:00
Lukas Zapletal
64f479092d osbuild-worker: use the new ostree resolver API 2024-11-07 16:17:56 +01:00
Florian Schüller
00d3f07d08 Makefile: implement make db-tests
enables the option to run the DB tests locally
that are executed in the github actions
2024-11-06 15:16:42 +01:00
Achilleas Koutsou
af48971981 osbuild-composer: fail weldr init when repos are nil
If weldr tries to initialise when there are no repositories set and
ignore_missing_repos is enabled, return with an error.
2024-11-05 08:21:42 +01:00
Achilleas Koutsou
51287ea57e osbuild-composer/config: new option: ignore_missing_repos
osbuild/images added an error type that's returned when the reporegistry
loader doesn't find any repository configurations to load [1].  This
lets callers decide whether to stop or continue execution based on
whether repository configurations are required.

A new top-level configuration option is added for osbuild-composer that
makes it possible to start the service without having static rpm
repositories configured.  This is useful in certain (SaaS) modes where
build requests specify their own repository configurations.
2024-11-05 08:21:42 +01:00
Sanne Raymaekers
c1b67440c4 cmd/osbuild-service-maintenance: respect dry run
Respect dry run when terminating leftover SIs.
2024-10-28 10:59:25 +01:00
Lukas Zapletal
350ad58c31 worker: use the new resolver API 2024-10-24 11:53:04 +02:00
Sanne Raymaekers
661f39cbb9 cmd/osbuild-service-maintenance: add test for filtering SIs 2024-10-23 10:32:57 +02:00
Sanne Raymaekers
04a5ca6965 cmd/osbuild-service-maintenance: clean up secure instances
Now and then there are leftover secure instances, probably when worker
instances get terminated during builds, this is possible in ASGs. 2
hours as a cutoff should be enough, since the build times out after 60
minutes, and fetching the output archive after 30 minutes, so that
leaves 30 minutes for booting and connection.
2024-10-23 10:32:57 +02:00
Tomáš Hozza
7437770352 composer: don't create RepoRegistry using reporegistry.New()
The `reporegistry.New()` has been enhanced to return an error, in case
there were no repositories loaded. This was to fix the situation in many
unit tests, which were previously not loading any repositories and
silently not running any tests.

This however broke our SaaS deployment, where we actually do not
configure any repositories on the filesystem. As a result,
osbuild-composer started to fail on it.

Workaround this situation in osbuild-composer by reverting to the old
behavior by loading the repo configs separately and then using the
loaded repos (which could be empty map) to initialize the RepoRegistry.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-09-23 18:51:39 +02:00
Tomáš Hozza
71a12742d4 Worker/osbuild/koji: upload SBOM documents
Extend the Koji target handling in the osbuild job, to also upload SBOM
documents attached to the related depsolve job result.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-09-20 17:02:09 +02:00
Tomáš Hozza
1c7462b275 Worker/koji-finalize: import uploaded SBOM documents
If the Koji target result contains information about any uploaded SBOM
documents, import them to Koji as part of the finalize task.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-09-20 17:02:09 +02:00
Tomáš Hozza
4779e90e17 Worker/depsolve: add support for SBOM
Add support to the `DepsolveJob` for requesting SBOM documents and
returning the results from the job.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-09-20 17:02:09 +02:00
Tomáš Hozza
7bdd036395 Update osbuild/images to v0.88.0
Adjust all paces that call `Solver.Depsolve()`, to cope with the changes
that enabled SBOM support.

Fix loading of testing repositories in the CloudAPI unit tests.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-09-20 17:02:09 +02:00
Sanne Raymaekers
22a0452ea9 osbuild-worker: handle error wrapping from dnfjson package
osbuild/images#751 wrapped the errors in the images/dnfjson package to
provide more details, the depsolve job should take this into account to
map the dnfjson error to the correct worker client error.

This caused user input errors errors to be misclassified as internal
errors, triggering depsolve job failure alerts.
2024-09-02 14:39:03 +02:00
Tomáš Hozza
d7e59e6eec Worker: move GCE image guest OS features to upload target options
Previously, the worker was determining the GCE image guest OS Features
on its own, based on the OS name. This caused problems, in case the
osbuild-composer was of a newer version than the worker.

Example:
osbuild-composer contained support for c10s GCE image type and its
implementation also contained the proper guest OS Features list for it.
However, when the worker got the osbuild job, it built it and tried to
fetch the guest OS Features for the distro. Since its implementation was
too old, it didn't contain the code that added the actual support for
c10s GCE images and got no guest OS features list (which is the default
for unsupported distros). The image was successfully uploaded and
shared, but it does not boot in GCP, because it does not know that it
should use UEFI to boot it.

This behavior could be considered a bug. The worker should be dumb. It
should not be making decisions about the image features, but instead it
should take them from the upload target options. And composer should be
the authoritative source of truth for this. Because otherwise, we
basically have two components that need to be updated in sync to add
support for GCE images on a new distro.

Move the GCE image guest OS features to the GCP upload target options.
The worker will just take what is specified there and use it when
importing the image to GCP. As a compatibility layer for the case when
the composer would be older than the worker (unlikely, but still),
worker will try to determine the image guest OS features in case the
list in the upload target options is empty.

Extend the GCP functional tests to check that the imported image has at
least some guest OS features set.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-08-29 17:37:48 +02:00
Florian Schüller
8d24dcfbde osbuild-worker: add CHANNEL to worker logs
aka "the deployment channel" like "staging" or "production"
2024-08-28 16:41:07 +02:00
Florian Schüller
a4068b328d splunk_logger: move environment hook to splunk_logger pt2
for reusability also in image-builder
2024-08-28 16:41:07 +02:00
Sanne Raymaekers
54820a88df osbuild-worker: switch to aws sdk v2 for errors in ami copy jobs 2024-08-20 15:32:40 +02:00
Sanne Raymaekers
2624516f1a osbuild-worker: use aws sdk v2 for asg scale-in protection 2024-08-20 15:32:40 +02:00
Sanne Raymaekers
990ed6a9ad osbuild-uploadgeneric-s3: remove aws sdk v1 dependency 2024-08-20 15:32:40 +02:00
Sanne Raymaekers
cda0db91f5 osbuild-upload-aws: remove aws sdk v1 dependency 2024-08-20 15:32:40 +02:00
Sanne Raymaekers
fa3b203178 internal/boot: adapt to aws sdk v2 2024-08-20 15:32:40 +02:00
Sanne Raymaekers
c87cbe0cbc osbuild-service-maintenance: adapt to aws sdk v2 2024-08-20 15:32:40 +02:00
Florian Schüller
09c5f5e374 osbuild-composer: activate deployment-channel reporting for splunk
followup of PR #4285
2024-08-12 15:38:56 +02:00
Florian Schüller
9006836afc logrus: add deployment channel as field to the logs 2024-08-07 12:32:57 +02:00
Michael Vogt
1d0232ffc6 osbuild-worker: rework the workerClientErrorFrom() error
The workerClientErrorFrom() was returning an `*clienterrors.Error` and
an `error` (if something with the conversation goes wrong.

But the calling code was expecting that even if an `error` is returned
the `*clienterrors.Error` is still valid. The caller would then just
log the error. As returning a valid `value` even when there is an
`error` is an unexpected pattern this commit changes the code to
always return a `*clienterrors.Error` and log any issue via the
logger.
2024-08-01 17:25:16 +02:00
Michael Vogt
573b349f16 clienterrors: rename WorkerClientError to clienterrors.New
The usual convention to create new object is to prefix `New*` so
this commit renames the `WorkerClientError`. Initially I thought
it would be `NewWorkerClientError()` but looking at the package
prefix it seems unneeded, i.e. `clienterrors.New()` already
provides enough context it seems and it's the only error we
construct.

We could consider renaming it to `clienterror` (singular) too
but that could be a followup.

I would also like to make `clienterror.Error` implement the
`error` interface but that should be a followup to make this
(mechanical) rename trivial to review.
2024-07-31 17:04:58 +02:00
Tomáš Hozza
286236b698 Config: don't override undefined keys when loading from ENV
Composer can load configuration values defined as map from ENV.
Previously, when loading the configuration from ENV, the whole map would
get overridden, not just values defined in the ENV. This is however not
intended and not consistent with how loading configuration from file
works.

Adjust the configuration loading from ENV and adjust the unit test
accordingly.

Signed-off-by: Tomáš Hozza <thozza@redhat.com>
2024-07-17 11:02:41 +02:00
Michael Vogt
919b423953 osbuild-worker: tweak error to not include a \n for a failed stage
Small followup for
https://github.com/osbuild/osbuild-composer/pull/4113#discussion_r1670063775

Given that the failed stage is a relatively short string the `\n`
seems unneccessary and quotes are enough.
2024-07-11 09:33:40 +02:00
Florian Schüller
7cd5abd17c cmd/osbuild-worker/jobimpl-depsolve: show error.Reason only once
as now the .Reason is properly passed over - it was printed twice
2024-07-09 12:12:36 +02:00
Florian Schüller
b0a737421a osbuild-worker: improve error "reason" in case of stage failures 2024-07-09 12:12:36 +02:00
Lukas Zapletal
5ce8f65a58 cloudapi: propagate operation/external id
Signed-off-by: Lukas Zapletal <lzap+git@redhat.com>
2024-06-25 13:58:53 +02:00
Lukas Zapletal
f3c0daebbf cmd/osbuild-composer: journald support 2024-06-25 13:58:53 +02:00
Sanne Raymaekers
2a621521a8 osbuildexecutor/aws.ec2: set hostname of executor via cloud-init
This way much more of the journal will be captured under the new
hostname.
2024-06-25 10:58:10 +02:00
Sanne Raymaekers
4853bf3ec0 Revert "osbuild-worker-executor: job-id in control.json as hostname"
This reverts commit fc1d1c3b8f.
2024-06-25 10:58:10 +02:00
Florian Schüller
55c5602f91 osbuild-worker-executor/main_test: use random port for tests
this for sure is racy but better than colliding with other tests
with a fixed port for sure
2024-06-24 09:18:44 +02:00
Michael Vogt
aa3d70a429 osbuildexecutor: tweak RunOSBuild() signature and use opts
Introduce a new OsbuildOpts struct to make the API a bit easier
to extend and use in the packages.

Also add a new `JobID` field in the `OsbuildOpts`.
2024-06-14 15:02:08 +02:00
Michael Vogt
fc1d1c3b8f osbuild-worker-executor: job-id in control.json as hostname
This commit adds support to set the hostname to the job-id that
is part of the control.json.
2024-06-14 15:02:08 +02:00