Commit graph

102 commits

Author SHA1 Message Date
Diaa Sami
7c52db1ae1 worker/api: align & improve error handlers 2022-02-02 11:15:20 +01:00
Tom Gundersen
c81d0d08ec cloudapi/v2: support logs/manifests endpoints
For now these only work for koji builds.
2022-02-01 20:28:40 +00:00
Tom Gundersen
b32ab36e1d worker/server: typesafe Job and JobStatus
Replace Job() and JobStatus() with typesafe versions, and introduce JobType()
for the rare instances where we don't know the type up front.

Additionally, catch a few more error cases:
 - if OSBuildResult is nil, then we failed to invoke osbuild
 - make sure the same JobResult handling is done for osbuild-koji, as for osbuild
2022-02-01 20:28:40 +00:00
Tom Gundersen
92c7fc2534 cloupapi/v2: add koji support
Extend the compose endpoints to have minimal koji support.

This is intended to replace the current koji API so that it
can be consumed through api.openshift.com.
2022-02-01 20:28:40 +00:00
Gianluca Zuccarelli
88b5529cc4 osbuild-worker: test error backwards compatability
Since the workers will use structured error messages
going forward, it is necessary to maintain backwards
compatability for there errors in composer. Tests have
been added to the various apis to ensure that each api
checks for both kinds of errors, old and new.
2022-01-27 16:45:14 +01:00
Gianluca Zuccarelli
cc981b887a osbuild-worker: implement structured errors
Implement the structured errors as defined by the worker client.
Every error for each of the job types now returns a structured
error with a reason and a specific error code.  This will make
it possible to differentiate between 4xx errors and 5xx errors.

This commit refactors the way errors are implemented in the workers,
but maintains backwards compatability in composer by checking for
both kinds of errors.
2022-01-27 16:45:14 +01:00
Gianluca Zuccarelli
daf24f8db3 worker: define worker errors
Define worker errors to give more structured
error messages. The error api is:
id: VALIDATION_ERROR_NUMBER, reason: STRING, details: { issues: [{...}, {...}] }

The api was agreed upon with osbuild so that,
in future, osbuild errors will share the same
structure
2022-01-27 16:45:14 +01:00
sanne
a83cf95d5b go.mod: Update oapi-codegen and kin-openapi 2022-01-12 11:35:06 +01:00
sanne
8406ada6f5 worker: Treat a non echo.HTTPError like a regular error 2021-12-17 13:13:05 +01:00
Diaa Sami
510d2ccac0 worker/server: pass more error details to handler 2021-12-16 11:58:41 +00:00
Diaa Sami
c1aeeeaf0e internal/worker: log internal details when available 2021-12-16 11:58:41 +00:00
Djebran Lezzoum
c93ea748a2 distro/depsolve/cloudapi: Add 3rd-party repository support.
Allow 3rd-party repositories to be supported and custom packages installed.
Fixes #COMPOSER-1273
2021-12-15 20:12:49 +01:00
Gianluca Zuccarelli
91f2457363 metrics: add prometheus namespaces
Make use of the prometheus namespace and subsystem
to give the metrics a consistent namespaces in openshift.
2021-11-19 22:48:25 +01:00
sanne
028eca1b26 cloudapi/v2: Use manifest-id-only job
job dependencies:
depsolve -> manifest -> osbuild

This allows the compose handler to return the osbuild job id
immediately.
2021-11-18 10:26:17 +01:00
sanne
b075cac9e3 worker: Correct servers in openapi spec
Similar to other services on api.openshift.com, the full url should be
shown.
2021-11-16 10:30:58 +00:00
Achilleas Koutsou
778b2de3c0 worker: test mixed new and old jobs in jobqueue
Two new tests, one for OSBuild and one for Koji jobs. Both follow the
same flow:
- Enqueue a job that doesn't specify PipelineNames (oldJob)
- Enqueue a job that does specify PipelineNames (newJob)
- Read the job data for the oldJob and check that the default
  PipelineNames were added
- Read the job data for the newJob and check that it's unchanged
- Finish oldJob and add results without specifying PipelineNames
- Finish newJob and add results with PipelineNames
- Read the oldJob result and check that the default PipelineNames were
  added
- Read the newJob result and check that it's unchanged

This is meant to test several scenarios that can occur when upgrading
the service:
1. The existing jobqueue has old jobs in it that were queued before the
   PipelineNames were part of the data structure. The worker should be
   able to read these and add the fallback data.
2. New jobs are added while old jobs still exist in the queue and the
   worker can read both types.
3. The existing jobqueue has old finished jobs in it that were finished
   and had results written before the PipelineNames were part of the
   result data structure. The worker should be able to read these and
   add the fallback data.
4. New jobs are finished and results are written while old jobs still
   exist in the queue and the worker can read both result types.
2021-11-16 09:49:37 +01:00
Achilleas Koutsou
51870676cc worker/json: add fallback pipeline names when reading data
When worker reading data into the job and result types, check if the
PipelineNames are populated and, if not, add the fallback values from
distro.

This makes it simpler to work with job queues that contain old data
before the introduction of the PipelineNames. In any situation where the
job or result data are read, the reader can assume that the
PipelineNames are non-nil and that if they belong to an old job, they
have the fallback names.

This assumption goes hand-in-hand with the change in v2 format for
osbuild results, since old jobs that don't have PipelineNames set *must*
contain results in the old format for the names to be valid.
2021-11-16 09:49:37 +01:00
Achilleas Koutsou
143eb5cb91 worker: add PipelineNames to Job descriptions
The names of the pipelines that make up a Manifest for a job are
attached to the job data that is stored in the queue. The pipelines are
separated into Build and Payload.

This information is useful for identifying the build pipeline results
and metadata and for the order of the pipelines as they appeared in the
manifest.
2021-11-16 09:49:37 +01:00
Achilleas Koutsou
2004c71f89 cloudapi: use osbuild v2 result struct to extract metadata
Reading stage metadata using osbuild's v2 result format.
For RPM stages we only want the core (OS) RPMs (not the build root
RPMs). Skip the build pipeline by name, but this should be handled
better since names are arbitrary.

Using type switch to convert metadata types instead of relying on the
type string of the stage result.

The rpmmd helper function isn't used anymore since that requires two
conversion passes (osbuild.StageMetadata -> rpmmd.RPM ->
cloudapi.PackageMetadata).

Signed-off-by: Achilleas Koutsou <achilleas@koutsou.net>
2021-11-16 09:49:37 +01:00
sanne
6757916c54 worker: Introduce manifest-id-only job
A job intended to run in composer itself, after which a dependant
osbuild job can parse the manifest from it's dynamic arguments.
2021-11-15 16:04:12 +01:00
sanne
d25ae71fef worker: Configurable timeout for RequestJob
This is backwards compatible, as long as the timeout is 0 (never
timeout), which is the default.

In case of the dbjobqueue the underlying timeout is due to
context.Canceled, context.DeadlineExceeded, or net.Error with Timeout()
true. For the fsjobqueue only the first two are considered.
2021-10-19 00:12:18 +01:00
Ondřej Budai
e904397fdb cloudapi/v2: Use worker to depsolve
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-10-11 13:16:51 +02:00
Tom Gundersen
0f90aa9c78 worker: Add a depsolve job type
Allow depsolving to be done in a worker through the job queue rather
than synchronously in composer.

The benefit this might unlock include:
 - no more blocking calls in the cloud/koji APIs
 - only workers accessing repositoires
   - no VPN access from composer
   - composer not needing to be subscribed to CDN, etc
 - no dnf cache managment in composer

Potential problems:
 - the version of composer (so the distro definitions) that
   triggered a depsolve, may not be the same that uses the
   result to generate a manfiset

Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-10-11 13:16:51 +02:00
sanne
ce7ac9a756 worker: Make BasePath configurable 2021-10-11 09:52:21 +02:00
Diaa Sami
12ca5325d6 worker: Use Recover middleware to handle panics
recover from panics such as out-of-bounds array access & nil
pointer access, print a stack trace and return 5xx error
2021-10-06 17:04:52 +02:00
Diaa Sami
22f151df68 worker: Improve logging
Use logrus library for logging
Use appropriate log-level for different log statements
2021-10-06 17:04:52 +02:00
sanne
2f328b0e97 workers: Backwards compatible api.openshift.com spec compliance
The main changes are:
- Kind, Href, Id fields for every object returned
- Attach operationIds to each request, return it for errors
- Errors are predefined and queryable
2021-09-27 13:10:05 +01:00
sanne
4a057bf3d5 auth: OpenID/OAUth2 middleware
2 configurations for the listeners are now possible:
- enableJWT=false with client ssl auth
- enableJWT=true with https

Actual verification of the tokens is handled by
https://github.com/openshift-online/ocm-sdk-go.

An authentication handler is run as the top level handler, before any
routing is done. Routes which do not require authentication should be
listed as exceptions.

Authentication can be restricted using an ACL file which allows
filtering based on JWT claims. For more information see the inline
comments in ocm-sdk/authentication.

As an added quirk the `-v` flag for the osbuild-composer executable was
changed to `-verbose` to avoid flag collision with glog which declares
the `-v` flag in the package `init()` function. The ocm-sdk depends on
glog and pulls it in.
2021-09-04 02:48:52 +02:00
sanne
7a0ea5b244 worker: Remove identity filter
Partially reverts "0ea31c39d5"
2021-09-04 02:48:52 +02:00
Chloe Kaubisch
4c800f29a7 worker: add metrics
use prometheus to gather metrics
2021-07-23 21:54:28 +02:00
sanne
4385c39d66 worker: Introduce heartbeats
An occupied worker checks about every 15 seconds if it's current job was
cancelled. Use this to introduce a heartbeat mechanism, where if
composer hasn't heard from the worker in 2 minutes, the job times out
and is set to fail.
2021-07-08 21:14:38 +01:00
sanne
0fcb44e617 worker: Move job tokens to the queue itself
This removes state from the worker server, as it no longer contains the
list of running jobs. Instead only the queue knows if jobs are running
or not.
2021-07-08 21:14:38 +01:00
sanne
4f86b4fd45 worker: Use http.PostForm to post data
Avoid having to encode the data ourselves.
2021-06-23 10:33:22 +02:00
Achilleas Koutsou
1a3447ed38 kojiapi: include image type exports in Koji job args
Koji image request handling now reads the exports defined by each image
type. All APIs now support reading the exports defined by each image
type. The worker still falls back to "assembler" in case the call comes
from an older version of composer.
2021-06-18 14:02:09 +01:00
sanne
cad7f7ff63 worker: Add test for the worker oauth2 auth 2021-06-17 10:08:35 +02:00
Ondřej Budai
0a304f659d worker/client: pass arch explicitly
The API client guessed the arch, let's pass it explicitly so a caller
can specify it.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-06-17 10:08:35 +02:00
sanne
8fa822c02e worker: Return basepath depending on route 2021-06-17 10:08:35 +02:00
sanne
0ea31c39d5 worker: Add identity filter and client oauth support 2021-06-17 10:08:35 +02:00
Tomas Hozza
f7f064274a Tests: remove fedoratest and replace it with test_distro
fedoratest was yet another dummy distribution used by unit tests. After
the rework of test_distro, there is no reason to not use it as the only
distro implementation for testing purposes.

Remove fedoratest distro and replace it with test_distro in all affected
tests.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2021-05-14 15:43:00 +02:00
Achilleas Koutsou
2cce81093f osbuld-worker: call osbuild with --export flag
osbuild now supports using the `--export` flag (can be invoked multiple
times) to request the exporting of one or more artefacts.  Omitting it
causes the build job to export nothing.

The Koji API doesn't support the new image types (yet) so it simply uses
the "assembler" name, which is the final stage of the old (v1)
Manifests.
2021-03-17 18:12:17 +00:00
Achilleas Koutsou
8090621300 osbuild: rename package to osbuild1
Preparing for version 2 of the manifest schema, which will be
implemented in a separate package (osbuild2) alongside the original.
2021-03-17 18:12:17 +00:00
Tom Gundersen
9e2e009ac8 distro: introduce PackageSets
This replaces Packages() and BuildPackages() by returning a map of
package sets, the semantics of which is up to the distro to define.

They are meant to be depsolved and the result returned back as a
map to Manifest(), with the same keys.

No functional change.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2021-03-10 11:52:05 +00:00
Chloe Kaubisch
f091af55d8 cloudapi: add targetresults
Add the TargetResult struct to OSBuildJobResult. Include the 'options'
interface on TargetResult to contain target-specific information,
for example amiID and region from AWS. Expose 'options' on a status
call as an UploadStatus field. Add logic to support AWS within this
format, which can be used as a template for other targets.
2021-03-02 13:22:11 +01:00
Chloe Kaubisch
899d78f7e1
cloudapi: expose upload status
Expose a more detailed job status result - specifically, include upload status
alongside image status. Expand openapi.yml accordingly and add an UploadStatus
field to the OSBuildJobResult struct. At the moment, only represent the
"success" and "failure" states of UploadStatus - to differentiate between
"pending" and "running" would involve significant design decisions and should be
addressed in a separate commit.
2021-02-05 12:34:28 +01:00
Achilleas Koutsou
668fb003ef jobqueue: Replace JobArgs() with Job()
JobArgs() function replaced with more general Job() function that
returns all the parameters used to originally define a job during
Enqueue(). This new function enables access to the type of a job in the
queue, which wasn't available until now (except when Dequeueing).
2021-01-19 10:37:51 +01:00
Achilleas Koutsou
75a96bd99d test: Job args/manifests retrieval
1. Test retrieving the job arguments from the worker using an
   OSBuildJob.
   Very basic test with an empty manifest.
2. Test retrieving job manifests through the Koji API. Uses the existing
   TestCompose test. The returned manifests are empty for all cases.
2021-01-16 13:39:30 +01:00
Achilleas Koutsou
1c5f6810be worker: Add JobArgs() method
Wraps jobqueue.JobArgs() method and unmarshals data into the provided
concrete struct value. The struct should match the requested job type of
the given ID.
2021-01-16 13:39:30 +01:00
Lars Karlitski
cb894ccf68 jobqueue: remove testjobqueue
testjobqueue did not implement the JobQueue interface correctly (noted
in its package comment), making it impossible to write tests for
JobQueue itself.

Replace its use everywhere with fsjobqueue operating on a temporary
directory.
2021-01-12 12:19:25 +01:00
Ondřej Budai
973639d372 distro/rhel84: use a random uuid for XFS partition
Imagine this situation: You have a RHEL system booted from an image produced
by osbuild-composer. On this system, you want to use osbuild-composer to
create another image of RHEL.

However, there's currently something funny with partitions:

All RHEL images built by osbuild-composer contain a root xfs partition. The
interesting bit is that they all share the same xfs partition UUID. This might
sound like a good thing for reproducibility but it has a quirk.

The issue appears when osbuild runs the qemu assembler: it needs to mount all
partitions of the future image to copy the OS tree into it.

Imagine that osbuild-composer is running on a system booted from an imaged
produced by osbuild-composer. This means that its root xfs partition has this
uuid:

efe8afea-c0a8-45dc-8e6e-499279f6fa5d

When osbuild-composer builds an image on this system, it runs osbuild that
runs the qemu assembler at some point. As I said previously, it will mount
all partitions of the future image. That means that it will also try to
mount the root xfs partition with this uuid:

efe8afea-c0a8-45dc-8e6e-499279f6fa5d

Do you remember this one? Yeah, it's the same one as before. However, the xfs
kernel driver doesn't like that. It contains a global table[1] of all xfs
partitions that forbids to mount 2 xfs partitions with the same uuid.

I mean... uuids are meant to be unique, right?

This commit changes the way we build RHEL 8.4 images: Each one now has a
unique uuid. It's now literally a unique universally unique identifier. haha

[1]: a349e4c659/fs/xfs/xfs_mount.c (L51)
2020-12-15 16:43:39 +01:00
Ondřej Budai
e10a7f1ccc {koji,worker}/server: log errors returned from handlers
Previously, we had no clue what errors were catched by the default echo's
error handler. Thus, in the case of an error, we were basically blind. Let's
log all errors so we can investigate them later.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2020-12-02 08:52:27 +01:00