This commit implements multi-tenancy. A tenant is defined based on a value
from JWT claims. The key of this value must be specified in the configuration
file. This allows us to pick different values when using multiple SSOs.
Let me explain more in depth how this works:
Cloud API gets a new compose request. Firstly, it extracts a tenant name from
JWT claims. The considered claims are configured as an array in
cloud_api.jwt.tenant_provider_fields in composer's config file. The channel
name for all jobs belonging to this compose is created by `"org-" + tenant`.
Why is the channel prefixed by "org-"? To give us options in the future. I can
imagine the request having a channel override. This basically means that
multiple tenants can share a channel. A real use-case for this is multiple
Fedora projects sharing one pool of workers.
Why this commit adds a whole new cloud_api section to the config? Because the
current config is a mess and we should stop adding new stuff into the koji
section. As the Koji API is basically deprecated, we will need to remove it
soon nevertheless.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Channels are a concept similar to job types. Callers must specify a channel
name when queueing a new job. A list of channels is also specified when
dequeueing a job. The dequeued job's channel will always be from one of the
specified channel. Of course, the job types are also respected. The dequeued
job will also always be from one of the specified type.
Currently, all calls to jobqueue were changed so all queue operations use
an empty channel name and all dequeue operations use a list containing
an empty channel.
Thus, this is a non-functional change.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
We have two fields, `Repos` and `PackageSets`. Renaming
`PackageSetsRepositories` to `PackageSetsRepos` for consistency.
The struct is for internal use only so the rename has no impact as long
as the serialised name is the same (json tag).
Also it's shorter.
Added docstring to the struct that explains the arguments in the same
way as they are described for the `depsolve()` function.
Changing the name of the argument in the internal `depsolve()` function
for the same reasons.
This commit introduces the collection of error
metrics since it is now possible to differentiate
between internal errors and user input errors.
Additionally, the error status is reported for
job duration metrics.
For simplicity, the collection of the job metrics
was carried out in the job queue. This was only
being done in the dbqueue and not in the fsqueue.
This pr refactors the metric collection and moves
the job metrics to the worker server, by adding a
wrapper function to enqueueing jobs so that the
metrics only have to be recorded in one place when
queueing a job.
Replace Job() and JobStatus() with typesafe versions, and introduce JobType()
for the rare instances where we don't know the type up front.
Additionally, catch a few more error cases:
- if OSBuildResult is nil, then we failed to invoke osbuild
- make sure the same JobResult handling is done for osbuild-koji, as for osbuild
Extend the compose endpoints to have minimal koji support.
This is intended to replace the current koji API so that it
can be consumed through api.openshift.com.
Since the workers will use structured error messages
going forward, it is necessary to maintain backwards
compatability for there errors in composer. Tests have
been added to the various apis to ensure that each api
checks for both kinds of errors, old and new.
Implement the structured errors as defined by the worker client.
Every error for each of the job types now returns a structured
error with a reason and a specific error code. This will make
it possible to differentiate between 4xx errors and 5xx errors.
This commit refactors the way errors are implemented in the workers,
but maintains backwards compatability in composer by checking for
both kinds of errors.
Define worker errors to give more structured
error messages. The error api is:
id: VALIDATION_ERROR_NUMBER, reason: STRING, details: { issues: [{...}, {...}] }
The api was agreed upon with osbuild so that,
in future, osbuild errors will share the same
structure
Two new tests, one for OSBuild and one for Koji jobs. Both follow the
same flow:
- Enqueue a job that doesn't specify PipelineNames (oldJob)
- Enqueue a job that does specify PipelineNames (newJob)
- Read the job data for the oldJob and check that the default
PipelineNames were added
- Read the job data for the newJob and check that it's unchanged
- Finish oldJob and add results without specifying PipelineNames
- Finish newJob and add results with PipelineNames
- Read the oldJob result and check that the default PipelineNames were
added
- Read the newJob result and check that it's unchanged
This is meant to test several scenarios that can occur when upgrading
the service:
1. The existing jobqueue has old jobs in it that were queued before the
PipelineNames were part of the data structure. The worker should be
able to read these and add the fallback data.
2. New jobs are added while old jobs still exist in the queue and the
worker can read both types.
3. The existing jobqueue has old finished jobs in it that were finished
and had results written before the PipelineNames were part of the
result data structure. The worker should be able to read these and
add the fallback data.
4. New jobs are finished and results are written while old jobs still
exist in the queue and the worker can read both result types.
When worker reading data into the job and result types, check if the
PipelineNames are populated and, if not, add the fallback values from
distro.
This makes it simpler to work with job queues that contain old data
before the introduction of the PipelineNames. In any situation where the
job or result data are read, the reader can assume that the
PipelineNames are non-nil and that if they belong to an old job, they
have the fallback names.
This assumption goes hand-in-hand with the change in v2 format for
osbuild results, since old jobs that don't have PipelineNames set *must*
contain results in the old format for the names to be valid.
The names of the pipelines that make up a Manifest for a job are
attached to the job data that is stored in the queue. The pipelines are
separated into Build and Payload.
This information is useful for identifying the build pipeline results
and metadata and for the order of the pipelines as they appeared in the
manifest.
Reading stage metadata using osbuild's v2 result format.
For RPM stages we only want the core (OS) RPMs (not the build root
RPMs). Skip the build pipeline by name, but this should be handled
better since names are arbitrary.
Using type switch to convert metadata types instead of relying on the
type string of the stage result.
The rpmmd helper function isn't used anymore since that requires two
conversion passes (osbuild.StageMetadata -> rpmmd.RPM ->
cloudapi.PackageMetadata).
Signed-off-by: Achilleas Koutsou <achilleas@koutsou.net>
This is backwards compatible, as long as the timeout is 0 (never
timeout), which is the default.
In case of the dbjobqueue the underlying timeout is due to
context.Canceled, context.DeadlineExceeded, or net.Error with Timeout()
true. For the fsjobqueue only the first two are considered.
Allow depsolving to be done in a worker through the job queue rather
than synchronously in composer.
The benefit this might unlock include:
- no more blocking calls in the cloud/koji APIs
- only workers accessing repositoires
- no VPN access from composer
- composer not needing to be subscribed to CDN, etc
- no dnf cache managment in composer
Potential problems:
- the version of composer (so the distro definitions) that
triggered a depsolve, may not be the same that uses the
result to generate a manfiset
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The main changes are:
- Kind, Href, Id fields for every object returned
- Attach operationIds to each request, return it for errors
- Errors are predefined and queryable
2 configurations for the listeners are now possible:
- enableJWT=false with client ssl auth
- enableJWT=true with https
Actual verification of the tokens is handled by
https://github.com/openshift-online/ocm-sdk-go.
An authentication handler is run as the top level handler, before any
routing is done. Routes which do not require authentication should be
listed as exceptions.
Authentication can be restricted using an ACL file which allows
filtering based on JWT claims. For more information see the inline
comments in ocm-sdk/authentication.
As an added quirk the `-v` flag for the osbuild-composer executable was
changed to `-verbose` to avoid flag collision with glog which declares
the `-v` flag in the package `init()` function. The ocm-sdk depends on
glog and pulls it in.
An occupied worker checks about every 15 seconds if it's current job was
cancelled. Use this to introduce a heartbeat mechanism, where if
composer hasn't heard from the worker in 2 minutes, the job times out
and is set to fail.
Koji image request handling now reads the exports defined by each image
type. All APIs now support reading the exports defined by each image
type. The worker still falls back to "assembler" in case the call comes
from an older version of composer.
fedoratest was yet another dummy distribution used by unit tests. After
the rework of test_distro, there is no reason to not use it as the only
distro implementation for testing purposes.
Remove fedoratest distro and replace it with test_distro in all affected
tests.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
osbuild now supports using the `--export` flag (can be invoked multiple
times) to request the exporting of one or more artefacts. Omitting it
causes the build job to export nothing.
The Koji API doesn't support the new image types (yet) so it simply uses
the "assembler" name, which is the final stage of the old (v1)
Manifests.
This replaces Packages() and BuildPackages() by returning a map of
package sets, the semantics of which is up to the distro to define.
They are meant to be depsolved and the result returned back as a
map to Manifest(), with the same keys.
No functional change.
Signed-off-by: Tom Gundersen <teg@jklm.no>