258 commits

Author SHA1 Message Date
Tomas Hozza
776a54135f worker: move osbuild exports from OSBuildJob to target
The osbuild export is specific to the upload target and different
targets may require using a different export. While osbuild-composer
still does not support multiple exports for osbuild jobs, this prepares
the ground for such support in the future.

Backward compatibility with older implementations of composer and
workers is kept at the JSON (un)marshaling level, where the JSON
message is always a superset of the old and new ways of providing the
exports to the osbuild job.
2022-07-01 18:55:01 +01:00
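
The superset encoding described in 776a54135f can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Job`, `Target`, and `wireJob` types, the JSON field names, and the `encode`/`decode` helpers are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Target is a hypothetical upload target that carries its own export.
type Target struct {
	Name   string `json:"name"`
	Export string `json:"osbuild_export,omitempty"`
}

// Job is a hypothetical stand-in for the osbuild job arguments.
type Job struct {
	Targets []*Target
}

// wireJob is the superset message: the legacy job-level export list plus
// the new per-target export. Field names are assumptions.
type wireJob struct {
	Exports []string  `json:"exports,omitempty"`
	Targets []*Target `json:"targets,omitempty"`
}

// MarshalJSON writes both the old and the new representation.
func (j Job) MarshalJSON() ([]byte, error) {
	w := wireJob{Targets: j.Targets}
	seen := map[string]bool{}
	for _, t := range j.Targets {
		if t != nil && t.Export != "" && !seen[t.Export] {
			seen[t.Export] = true
			w.Exports = append(w.Exports, t.Export)
		}
	}
	return json.Marshal(w)
}

// UnmarshalJSON accepts messages from old senders, which set only the
// job-level export, by copying it to targets that lack one.
func (j *Job) UnmarshalJSON(data []byte) error {
	var w wireJob
	if err := json.Unmarshal(data, &w); err != nil {
		return err
	}
	for _, t := range w.Targets {
		if t != nil && t.Export == "" && len(w.Exports) > 0 {
			t.Export = w.Exports[0]
		}
	}
	j.Targets = w.Targets
	return nil
}

// encode and decode are small helpers for trying the round trip.
func encode(j Job) (string, error) {
	b, err := json.Marshal(j)
	return string(b), err
}

func decode(s string) (Job, error) {
	var j Job
	err := json.Unmarshal([]byte(s), &j)
	return j, err
}

func main() {
	out, _ := encode(Job{Targets: []*Target{{Name: "aws", Export: "image"}}})
	fmt.Println(out)
}
```

An old reader would pick up the job-level list and ignore the per-target field, while a new reader uses the per-target field and falls back to the job-level list only when it is missing.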
Tomas Hozza
4e26ba82d0 worker: drop ImageName from the OSBuildJob struct
The `ImageName` in `OSBuildJob` is not used any more by any API
implementation or any worker job implementation. Drop it from the
structure.
2022-07-01 18:55:01 +01:00
Tomas Hozza
95e2e75851 worker/osbuild: stop handling VMDK stream-optimized conversion
Backward compatibility code handling the conversion of VMDK images to
the stream-optimized sub-format has been kept in the implementation
since PR#2529 [1] was merged on May 4th 2022. Since that change, no API
implementation submits jobs that would hit this conversion code,
because VMDK images are already being produced in the desired
sub-format.

On-premise deployments are expected to use the same composer and worker
versions. There are no composer / worker instances in production that
are not running the modified code.

Delete the backward compatibility code.

[1] https://github.com/osbuild/osbuild-composer/pull/2529
2022-07-01 18:55:01 +01:00
Tomas Hozza
6dcadc9d20 worker/osbuild: move target errors to detail of job error
Add a new worker client error type `ErrorTargetError` representing that
at least one of job targets failed. The actual target errors are added
to the job detail.

Add a new `OSBuildJobResult.TargetErrors()` method for gathering a slice
of target errors contained within an `OSBuildJobResult` instance. Cover
the method with unit test.
2022-07-01 18:55:01 +01:00
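
A rough sketch of the idea in 6dcadc9d20: collect the per-target errors and attach them as the detail of a single job error. Only the names `OSBuildJobResult`, `TargetErrors()`, and `ErrorTargetError` come from the commit message. The field layout, the numeric value of the constant, the reason string, and the `setTargetJobError` helper are assumptions.

```go
package main

import "fmt"

// ErrorTargetError uses a placeholder value; the real code is not stated
// in the commit message.
const ErrorTargetError = 1

// Error is a hypothetical, simplified worker client error.
type Error struct {
	ID      int
	Reason  string
	Details interface{}
}

// TargetResult is a hypothetical per-target result.
type TargetResult struct {
	Name        string
	TargetError *Error
}

// OSBuildJobResult is reduced to the fields needed for the sketch.
type OSBuildJobResult struct {
	TargetResults []*TargetResult
	JobError      *Error
}

// TargetErrors gathers the errors of all failed targets.
func (r *OSBuildJobResult) TargetErrors() []*Error {
	var errs []*Error
	for _, tr := range r.TargetResults {
		if tr != nil && tr.TargetError != nil {
			errs = append(errs, tr.TargetError)
		}
	}
	return errs
}

// setTargetJobError (hypothetical helper) records one job error whose
// details hold every target error.
func setTargetJobError(r *OSBuildJobResult) {
	if errs := r.TargetErrors(); len(errs) > 0 {
		r.JobError = &Error{
			ID:      ErrorTargetError,
			Reason:  "at least one target failed",
			Details: errs,
		}
	}
}

func main() {
	r := &OSBuildJobResult{TargetResults: []*TargetResult{
		{Name: "aws"},
		{Name: "koji", TargetError: &Error{Reason: "upload failed"}},
	}}
	setTargetJobError(r)
	fmt.Println(len(r.TargetErrors()), r.JobError.Reason)
}
```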
Tomas Hozza
59ded68457 worker: delete TargetErrors from OSBuildJobResult
`TargetErrors` has not been used since PR#2192 [1] and there is no
need to keep backward compatibility any more, because there are no
composer / worker instances in production that are not running the
modified code.

In addition, delete unit tests covering this legacy error handling.

[1] https://github.com/osbuild/osbuild-composer/pull/2192
2022-07-01 18:55:01 +01:00
Ondřej Budai
0693274ffe worker/server: set a job error when heartbeat gets missing
Previously, we just used an empty struct when the heartbeat failed. This is fine
for the osbuild job because it's treated as failed when
result.OSBuildResult == false, which is the default value.

koji-finalize works differently though: it's in a failed state if there's
a job error or kojiError != "". So when a failed heartbeat set the struct to
be empty, this was treated as success because there was no error.

Let's fix this by introducing a new error for the situation where we don't get
a heartbeat in time for a specific job.
2022-06-29 16:44:10 +02:00
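
The problem fixed in 0693274ffe can be illustrated with a small sketch. The struct layout, the `failed` check, and the reason string are assumptions for the example. The point is only that a zero-value result carries no error and therefore looks successful unless an explicit job error is set.

```go
package main

import "fmt"

// JobError is a hypothetical, simplified job error.
type JobError struct {
	Reason string
}

// KojiFinalizeResult is a hypothetical, reduced koji-finalize result.
type KojiFinalizeResult struct {
	KojiError string
	JobError  *JobError
}

// failed mirrors the check described in the message: the job counts as
// failed only if some error is present.
func failed(r KojiFinalizeResult) bool {
	return r.JobError != nil || r.KojiError != ""
}

// resultForMissedHeartbeat returns a result with an explicit error instead
// of an empty struct. The reason text is made up for this sketch.
func resultForMissedHeartbeat() KojiFinalizeResult {
	return KojiFinalizeResult{
		JobError: &JobError{Reason: "no heartbeat received in time"},
	}
}

func main() {
	fmt.Println(failed(KojiFinalizeResult{}))        // empty struct: not failed
	fmt.Println(failed(resultForMissedHeartbeat())) // explicit error: failed
}
```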
Tomas Hozza
bdf009f800 UploadJobArtifact(): return 400 if not accepting artifacts
The worker server API handler `UploadJobArtifact()` was previously
silently discarding artifacts uploaded by the worker, if the server was
configured to not accept artifacts.

Change the behavior to return HTTP error "Bad Request" (`400`) to the
worker, in case it tries to upload artifact to the server, but the
server is configured to not accept any artifacts.

Add a new unit test testing the new behavior and adjust existing unit
tests, which were relying on the artifact being previously silently
discarded.
2022-06-17 17:37:15 +02:00
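
A minimal sketch of the behavior change in bdf009f800, written against the standard `net/http` package rather than the project's actual HTTP framework. The `server` type, its `acceptArtifacts` flag, the handler signature, and the request path are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// server is a hypothetical stand-in for the worker server.
type server struct {
	acceptArtifacts bool
}

// uploadJobArtifact rejects uploads with 400 when the server is configured
// not to accept artifacts, instead of silently discarding them.
func (s *server) uploadJobArtifact(w http.ResponseWriter, r *http.Request) {
	if !s.acceptArtifacts {
		http.Error(w, "server is not accepting artifacts", http.StatusBadRequest)
		return
	}
	// A real handler would store the body; the sketch only drains it.
	_, _ = io.Copy(io.Discard, r.Body)
	w.WriteHeader(http.StatusOK)
}

// uploadStatus sends one fake upload and returns the response code.
func uploadStatus(accept bool) int {
	s := &server{acceptArtifacts: accept}
	req := httptest.NewRequest(http.MethodPut, "/artifacts/disk.img", strings.NewReader("data"))
	rec := httptest.NewRecorder()
	s.uploadJobArtifact(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(uploadStatus(true), uploadStatus(false))
}
```

Returning an error makes a misconfiguration visible to the worker, which a silent discard would hide.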
Tomas Hozza
fc7d090498 cloudapi: add EnsureJobChannel() middleware to verify job channel
Add `EnsureJobChannel()` middleware method, intended for `compose/<id>`
endpoints. Its purpose is to ensure that the tenant channel set in
the request `echo.Context` matches the tenant channel associated with
the compose. In case of mismatch, `404` is returned.

Add `JobChannel()` method to the worker server implementation for
requesting channel associated with the job.
2022-06-10 14:48:18 +01:00
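
The check performed by `EnsureJobChannel()` in fc7d090498 might look roughly like this. The sketch uses plain `net/http` middleware and a request context value in place of `echo.Context`. The `jobChannels` map stands in for the worker server's `JobChannel()` method, and the `id` query parameter is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is the context key under which the tenant channel is stored.
type ctxKey struct{}

// jobChannels is a hypothetical lookup table standing in for JobChannel().
var jobChannels = map[string]string{"compose-1": "org-a"}

// ensureJobChannel answers 404 unless the tenant channel in the request
// context matches the channel associated with the compose.
func ensureJobChannel(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tenantChannel, _ := r.Context().Value(ctxKey{}).(string)
		jobChannel, ok := jobChannels[r.URL.Query().Get("id")]
		if !ok || jobChannel != tenantChannel {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one request through the middleware and returns the code.
func statusFor(tenantChannel, composeID string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/compose?id="+composeID, nil)
	req = req.WithContext(context.WithValue(req.Context(), ctxKey{}, tenantChannel))
	rec := httptest.NewRecorder()
	ensureJobChannel(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("org-a", "compose-1"), statusFor("org-b", "compose-1"))
}
```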
Tomas Hozza
db2ad7bc5f cloudapi: switch osbuild-koji -> osbuild for Koji build jobs
Switch to using `osbuild` job type with `koji` upload target for Koji
build jobs, instead of using `osbuild-koji` job type.

Modify unit tests accordingly.
2022-06-10 14:48:18 +01:00
Tomas Hozza
fc8af28231 worker/server: delete CheckBuildDependencies()
Replace all uses of `CheckBuildDependencies()` with
`JobDependencyChainErrors()` and delete `CheckBuildDependencies()`.
2022-06-10 14:48:18 +01:00
Tomas Hozza
fa37005a32 worker/server: add JobDependencyChainErrors() method
Add new `JobDependencyChainErrors()` method for gathering a stack trace
of job errors from the job's dependencies which caused it to fail.

The `JobDependencyChainErrors()` implementation uses job-type specific
`...Status()` methods intentionally, because job-type specific status
methods check the job's result in a slightly different way and set
the result.JobError to a specific value. Due to this reason, it would
not be practical to introduce a generic `JobStatus()` method and get rid
of the `switch` block, because in reality, the new method would have
to implement an equivalent `switch` block as well.

Add unit test covering the method functionality.
2022-06-10 14:48:18 +01:00
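
The "stack trace" of dependency errors from fa37005a32 can be pictured with a simplified walk over a job graph. Unlike the real method, which per the message uses job-type specific `...Status()` methods inside a `switch`, this sketch assumes one generic `Job` type. All type names, field names, and the string format are hypothetical.

```go
package main

import "fmt"

// JobError is a hypothetical, simplified job error.
type JobError struct {
	Reason       string
	IsDependency bool // true if the failure was caused by a dependency
}

// Job is a hypothetical, generic job node.
type Job struct {
	Type         string
	Error        *JobError
	Dependencies []*Job
}

// dependencyChainErrors returns the errors from the failed job down to the
// dependency that originally failed, outermost first.
func dependencyChainErrors(j *Job) []string {
	if j == nil || j.Error == nil {
		return nil
	}
	chain := []string{j.Type + ": " + j.Error.Reason}
	if j.Error.IsDependency {
		for _, dep := range j.Dependencies {
			chain = append(chain, dependencyChainErrors(dep)...)
		}
	}
	return chain
}

func main() {
	depsolve := &Job{Type: "depsolve", Error: &JobError{Reason: "package not found"}}
	manifest := &Job{
		Type:         "manifest",
		Error:        &JobError{Reason: "depsolve dependency failed", IsDependency: true},
		Dependencies: []*Job{depsolve},
	}
	build := &Job{
		Type:         "osbuild",
		Error:        &JobError{Reason: "manifest dependency failed", IsDependency: true},
		Dependencies: []*Job{manifest},
	}
	for _, line := range dependencyChainErrors(build) {
		fmt.Println(line)
	}
}
```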
Tomas Hozza
5bd02f2f27 worker: treat ErrorKojiFailedDependency as a dependency error
The `ErrorKojiFailedDependency` was previously not treated as a
dependency error. Fix it.
2022-06-10 14:48:18 +01:00
Tomas Hozza
d9e4889866 worker: rename HasDependencyError() -> IsDependencyError()
Rename the `HasDependencyError()` method to `IsDependencyError()` to
better express what it does.
2022-06-10 14:48:18 +01:00
Tomas Hozza
66f7eaf440 worker/osbuild: check errors of all job dependencies
Ensure that none of the job dependencies failed. This covers the case
when there is more than one job dependency, which will be the case
for Koji composes.
2022-06-10 14:48:18 +01:00
Tomas Hozza
97da1e7ad6 worker/osbuild: handle manifest dynamic argument index
Previously, the `OSBuild` job assumed that it could have only a single
job dependency, which could only be the `ManifestJobByID`. This won't
work well for the Koji use case, because the Koji OSBuild job also
depends on the Koji-init job.

Extend the `worker.OSBuildJob` structure with a new field, which holds
the `ManifestJobByIDResult` index in the job's dynamic arguments slice.
This value is considered when the `OSBuild` job has more than one
dependency.
2022-06-10 14:48:18 +01:00
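
A sketch of how an index field like the one added in 97da1e7ad6 could be used to pick the manifest result out of the dynamic arguments. The field name `ManifestDynArgsIdx`, the use of plain strings for dynamic arguments, and the fallback to index 0 are assumptions.

```go
package main

import "fmt"

// OSBuildJob is reduced to the one field relevant here. The field name is
// an assumption.
type OSBuildJob struct {
	ManifestDynArgsIdx *int
}

// manifestArg returns the dynamic argument holding the manifest job result.
// With a single dependency the first argument is used, as before. The index
// is only consulted when there is more than one dependency.
func manifestArg(job OSBuildJob, dynArgs []string) (string, error) {
	if len(dynArgs) == 0 {
		return "", fmt.Errorf("job has no dynamic arguments")
	}
	idx := 0
	if len(dynArgs) > 1 && job.ManifestDynArgsIdx != nil {
		idx = *job.ManifestDynArgsIdx
	}
	if idx < 0 || idx >= len(dynArgs) {
		return "", fmt.Errorf("manifest index %d out of range (%d arguments)", idx, len(dynArgs))
	}
	return dynArgs[idx], nil
}

func main() {
	idx := 1
	arg, err := manifestArg(
		OSBuildJob{ManifestDynArgsIdx: &idx},
		[]string{"koji-init-result", "manifest-result"},
	)
	fmt.Println(arg, err)
}
```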
Tomas Hozza
a4e6531565 worker: define job types as constants
Define supported job type names as constants and use them in all places,
instead of string literals.

There are multiple benefits of this approach. Using constants removes
the room for typos in the string literals. One can use autocompletion in
an IDE for job types. Using a constant makes it easier to find all
references to it and thus all places that handle a specific job
type.
2022-06-10 14:48:18 +01:00
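
For illustration, the approach from a4e6531565 amounts to something like the following. The constant names and the `describe` function are made up for this sketch, and the list of job types is not complete.

```go
package main

import "fmt"

// Job type names as constants. Identifier names are hypothetical.
const (
	JobTypeOSBuild      = "osbuild"
	JobTypeKojiInit     = "koji-init"
	JobTypeKojiFinalize = "koji-finalize"
	JobTypeDepsolve     = "depsolve"
)

// describe shows a typical dispatch on the job type. A typo in a constant
// name fails at compile time, while a typo in a string literal would
// silently fall through to the default branch.
func describe(jobType string) string {
	switch jobType {
	case JobTypeOSBuild:
		return "builds an image"
	case JobTypeKojiInit, JobTypeKojiFinalize:
		return "talks to Koji"
	case JobTypeDepsolve:
		return "resolves packages"
	default:
		return "unknown job type"
	}
}

func main() {
	fmt.Println(describe(JobTypeOSBuild))
}
```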
Tomas Hozza
69b9f115c9 worker: allow enqueueing OSBuild job with multiple dependencies
Change the definition of `EnqueueOSBuildAsDependency()` function to
accept a slice of job IDs on which the OSBuild job depends. Previously,
only the manifest job ID was accepted as the only possible dependency.
This change will be needed in order to enqueue OSBuild jobs for Koji,
which depends on two jobs.
2022-06-10 14:48:18 +01:00
Tomas Hozza
bb54318432 worker/osbuild: add host OS and architecture to job result
It is generally useful to have this information in the
`OSBuildJobResult`. This information is currently part of the
`OSBuildKojiJobResult`. Instead of moving it to the new
`KojiTargetResultOptions`, let's move it to the `OSBuildJobResult`
structure and set it for all jobs.
2022-06-10 14:48:18 +01:00
Chloe Kaubisch
873798514b prometheus: add tenant label
Include a tenant label for all prometheus metrics. Modify
jobstatus function in the worker accordingly to return channel
so it can be passed to prometheus.
2022-06-07 16:35:03 +02:00
Achilleas Koutsou
c8ce3e4428 worker: test depsolve job format compatibility
Test the conversion of the new and old DepsolveJob given the custom
marshaller.
The deserialised old format is not exactly the same as it would have
been before, but it is functionally equivalent, with the added benefit
of supporting depsolve jobs where we don't want base repositories to be
used by all depsolves.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
94c7fda779 worker: make DepsolveJob serialisation backwards compatible
Add custom marshaller for DepsolveJob that serialises the struct into a
format compatible with both the new and old formats.  The format on the
wire is a superset of both the new and old format and can be
deserialised into either while retaining all information.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
c092783a70 simplify package set chain handling
Move package set chain collation to the distro package and add
repositories to the package sets while returning the package sets from
their source, i.e., the ImageType.PackageSets() method.

This also removes the concept of "base repositories".  There are no
longer repositories that are added implicitly to all package sets but
instead each package set needs to specify *all* the repositories it will
be depsolved against.

This paves the way for the requirement we have for building RHEL 7
images with a RHEL 8 build root.  The build root package set has to be
depsolved against RHEL 8 repositories without any "base repos" included.
This is now possible since package sets and repositories are explicitly
associated from the start and there is no implicit global repository
set.

The change requires adding a list of PackageSet names to the core
rpmmd.RepoConfig.  In the cloud API, repositories that are limited to
specific package sets already contain the correct package set names and
these are now copied to the internal RepoConfig when converting types in
genRepoConfig().
The user-specified repositories are only associated with the payload
package sets like before.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
8a23a77c5b worker: add new error type for RepoError
dnf-json now returns a new error kind: RepoError
Add it to the list of known error types and handle it in the worker.
2022-06-01 11:36:52 +01:00
Tom Gundersen
4eeaebd40b prometheus/job: measure time spent pending rather than queued
We are interested in the time from when a job could be dequeued until
it actually is, but a job whose dependencies have not yet finished
cannot be dequeued.

Change the logic to measure the time since the last dependency was
dequeued rather than when the job was queued.

The purpose of this metric is to have an alert fire in case we have too
few workers processing jobs.
2022-05-14 17:47:38 +01:00
Tom Gundersen
4621768c14 server/requestJob: record metrics last
This ensures that only if the dequeuing is successful are metrics recorded.
2022-05-14 17:47:38 +01:00
Tom Gundersen
ac642c3d70 server/requestJob: failing to read job status is fatal
Error out early in case reading a job status fails. The state would otherwise
be inconsistent if only some of the job statuses have been read out.
2022-05-14 17:47:38 +01:00
Tomas Hozza
0bf67dfad5 Stop setting the StreamOptimized option in Weldr and Cloud APIs
The VMDK image is already produced as stream-optimized. Therefore, stop
setting the `StreamOptimized` option in the `OSBuildJob` structure in
both the Weldr and Cloud APIs.

Keep the handling of the option in the worker for backward compatibility,
in case an older instance of the Composer server is used that does not
produce VMDK manifests as stream-optimized. In that case, the worker
needs to convert the image.
2022-05-04 16:22:29 +02:00
Ondřej Budai
6fce34a5ea worker: add proxy support to composer and oauth calls
In the internal deployment, we want to talk with composer over an HTTP/HTTPS
proxy. This commit adds a new composer.proxy field to the worker config that
causes the worker to connect to composer and the oauth server using
the specified proxy.

NB: The proxy is not supported when connecting to composer via Unix sockets.

For testing this, I added a small HTTP proxy implementation. Please don't
use it in production; it's just good enough for tests.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
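
One way to route a Go HTTP client through a configured proxy, as 6fce34a5ea describes for the worker, is to set `Proxy` on the transport. This is a sketch using only the standard library. The function names and example URLs are assumptions, and the worker's actual client setup may differ.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newHTTPClient returns a client that sends its requests through the given
// proxy. An empty string leaves the default transport behavior untouched.
func newHTTPClient(proxy string) (*http.Client, error) {
	transport := http.DefaultTransport.(*http.Transport).Clone()
	if proxy != "" {
		u, err := url.Parse(proxy)
		if err != nil {
			return nil, err
		}
		transport.Proxy = http.ProxyURL(u)
	}
	return &http.Client{Transport: transport}, nil
}

// usedProxy reports which proxy the client would use for the target URL.
func usedProxy(proxy, target string) (string, error) {
	client, err := newHTTPClient(proxy)
	if err != nil {
		return "", err
	}
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	transport := client.Transport.(*http.Transport)
	if transport.Proxy == nil {
		return "", nil
	}
	u, err := transport.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	// Both URLs are placeholders.
	p, err := usedProxy("http://proxy.example.com:3128", "https://composer.example.com/api")
	fmt.Println(p, err)
}
```

Sharing one such client between the composer calls and the OAuth token requests keeps the proxy setting in a single place.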
Ondřej Budai
9ee3997428 worker: use custom requester also for oauth refresh
Just so we can share e.g. proxy server or other http transport settings.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
Ondřej Budai
71a4ceecaa worker/client: factor out common testing code
Just so we don't need to care about all the server-side setup in individual
test cases and we can just reuse the setup.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
Ondřej Budai
b4d6ec5a75 worker/client: simplify the oauth test
Firstly, let's use t.TempDir(), it's less code.

Secondly, let's remove all the code that touches distributions. We can just
use random values: neither the worker server nor the client actually inspects
any values, so they can be completely random.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
Ondřej Budai
ed8bcd2f49 worker: move client test to its own file
This test actually verifies that the client code for OAuth works. As this was
the only code that tests client in the file, I think it deserves its own one.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
Tomas Hozza
e819e08098 worker: extend the depsolve job to use DepsolvePackageSets()
Extend the `DepsolveJob` worker job argument to contain package sets
chains and use `DepsolvePackageSets()` for depsolving.
2022-04-28 14:42:49 +02:00
Gianluca Zuccarelli
e31fb36d65 cloudapi: add build job dependency checks
If an osbuild or koji-osbuild job has failed, check
whether the failure was caused by one of the build job's
dependencies. If so, return the dependency job error
furthest up the chain of errors and add it to the
details field of the build job error.
2022-04-13 10:31:53 +02:00
Gianluca Zuccarelli
da94f2cbeb worker/server: build job dep errors
Add a helper function to query dependency
failures of osbuild & koji-osbuild jobs.
If a build job has a dependency error the
function will check for the job error of the
manifest job. If that also has a dependency
error the function will query the depsolve
job too for a job error.
2022-04-13 10:31:53 +02:00
Gianluca Zuccarelli
30d75d0e74 worker/clienterrors: dependency error check
Add a helper function to check for dependency
errors for job errors. This simply returns true
if a job error has a dependency error code and
false otherwise.
2022-04-13 10:31:53 +02:00
Gianluca Zuccarelli
b1969ba6a6 worker/clienterrors: omit details if empty
Omit the details field if it is null/empty.
2022-04-13 10:31:53 +02:00
Gianluca Zuccarelli
ab98c66b9f worker/server: fix manifest-id job status
The type-safe manifest-by-ID job status function
was failing because its jobType check compared
against the wrong string.
2022-04-06 21:34:02 +01:00
Gianluca Zuccarelli
b75cf30a05 worker/server: remove duplicate function
The `ManifestJobStatus` and `ManifestByIdJobStatus` both
had identical functionality. The `ManifestByIdJobStatus`
is not being referenced anywhere in the codebase and so
this function has been removed.
2022-04-06 21:34:02 +01:00
Gianluca Zuccarelli
14b006d480 worker/clienterrors: add empty packagespec error
Add an error case for an empty package spec returned
by a depsolve job and mark this with a `4xx` status.
2022-04-06 21:34:02 +01:00
Gianluca Zuccarelli
cc7d555fb2 worker/errors: consider dep errors as 4xx status
All dependency errors, whether they are 4xx or 5xx,
are currently being considered as a 5xx error in parent
jobs. This is causing some of the build alerts to fire
off when a depsolve job has failed, for example, when
in reality, this is an expected result. This commit
ensures that dependency errors are being reported as
4xx status in monitoring.
2022-04-06 10:57:37 +02:00
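
The classification described in cc7d555fb2 could be sketched like this. The error codes, their values, and the `statusClass` function are hypothetical. Only the rule that dependency errors count as 4xx in monitoring is taken from the message.

```go
package main

import "fmt"

// Hypothetical error codes for the sketch.
const (
	ErrorBuildJob = iota + 1
	ErrorDepsolveDependency
	ErrorManifestDependency
)

// Error is a hypothetical, simplified job error.
type Error struct {
	ID     int
	Reason string
}

// isDependencyError reports whether the error says that a dependency of
// the job failed.
func isDependencyError(e *Error) bool {
	switch e.ID {
	case ErrorDepsolveDependency, ErrorManifestDependency:
		return true
	}
	return false
}

// statusClass maps a job error to the status class used for metrics.
// Dependency errors are reported as 4xx so that an expected failure, such
// as a failed depsolve, does not trigger build alerts.
func statusClass(e *Error) string {
	switch {
	case e == nil:
		return "2xx"
	case isDependencyError(e):
		return "4xx"
	default:
		return "5xx"
	}
}

func main() {
	fmt.Println(statusClass(&Error{ID: ErrorManifestDependency}))
}
```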
Gianluca Zuccarelli
8241e1f948 worker/clienterrors: add empty manifest error
If a manifest is empty we should have a specific error
code for that case and treat it as a 4xx error since
this would be bad input for a build job
2022-04-06 10:57:37 +02:00
Eng Zer Jun
00ea3eb285 test: use T.TempDir to create temporary test directory
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-04-05 09:27:43 +02:00
Sanne Raymaekers
2023f7731d worker: Support client_credentials grant type in client
This will allow us to use the service accounts which work against
identity.api.openshift.com. These are much easier to manage, especially
with the new multi-tenancy, as there's a single page to create/expire
them across an account.

They also have the added benefit of not expiring automatically when
they're not used like offline tokens, and immediate expiration when
desired.
2022-03-21 09:43:43 +01:00
Sanne Raymaekers
8900bcec40 worker: Client lazy token refresh
2022-03-21 09:43:43 +01:00
Sanne Raymaekers
8a6d6ed6cf worker: Clean up worker client config
2022-03-21 09:43:43 +01:00
Sanne Raymaekers
318a4525c6 cmd/osbuild-worker: dnf-json returns MarkingErrors (plural)
2022-03-11 10:13:27 +01:00
Gianluca Zuccarelli
761aab6cac cloudapi/v2: add error object to ImageStatus
Add an error object to the ComposeStatus.ImageStatus.
The error object contains a human-readable error reason
and optional details in the case of an error.
2022-03-09 08:49:37 +00:00
Ondřej Budai
cfb756b9ba api/{cloud,worker}: use channel name based on JWT claims for new jobs
This commit implements multi-tenancy. A tenant is defined based on a value
from JWT claims. The key of this value must be specified in the configuration
file. This allows us to pick different values when using multiple SSOs.

Let me explain more in depth how this works:

Cloud API gets a new compose request. Firstly, it extracts a tenant name from
JWT claims. The considered claims are configured as an array in
cloud_api.jwt.tenant_provider_fields in composer's config file. The channel
name for all jobs belonging to this compose is created by `"org-" + tenant`.

Why is the channel prefixed by "org-"? To give us options in the future. I can
imagine the request having a channel override. This basically means that
multiple tenants can share a channel. A real use-case for this is multiple
Fedora projects sharing one pool of workers.

Why does this commit add a whole new cloud_api section to the config? Because
the current config is a mess and we should stop adding new stuff into the koji
section. As the Koji API is basically deprecated, we will need to remove it
soon anyway.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-03-08 12:07:00 +01:00
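
The tenant-to-channel mapping described in cfb756b9ba can be sketched as a small function. It assumes the JWT claims are already parsed into a map and that tenant values are top-level strings. The function name and the claim names in the example are hypothetical. The `fields` parameter plays the role of cloud_api.jwt.tenant_provider_fields.

```go
package main

import (
	"errors"
	"fmt"
)

// tenantChannel returns the channel name for the first configured claim
// field that is present, built as "org-" + tenant.
func tenantChannel(claims map[string]interface{}, fields []string) (string, error) {
	for _, field := range fields {
		if tenant, ok := claims[field].(string); ok && tenant != "" {
			return "org-" + tenant, nil
		}
	}
	return "", errors.New("no tenant found in JWT claims")
}

func main() {
	// Claim names below are placeholders.
	claims := map[string]interface{}{"rh-org-id": "42"}
	channel, err := tenantChannel(claims, []string{"org_id", "rh-org-id"})
	fmt.Println(channel, err)
}
```

Listing several fields lets one configuration serve multiple SSOs that store the tenant under different claim keys.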
Ondřej Budai
c1dc58eba4 worker: NewServer: move config parameters to a new Config struct
We will have more parameters soon so let's make this prettier sooner rather
than later.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-03-08 12:07:00 +01:00