fedoratest was yet another dummy distribution used by unit tests. After
the rework of test_distro, there is no reason not to use it as the only
distro implementation for testing purposes.
Remove fedoratest distro and replace it with test_distro in all affected
tests.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
osbuild now supports the `--export` flag (which can be passed multiple
times) to request the export of one or more artefacts. Omitting it
causes the build job to export nothing.
The Koji API doesn't support the new image types (yet), so it simply uses
the "assembler" name, which is the final stage of the old (v1)
manifests.
This replaces Packages() and BuildPackages() by returning a map of
package sets, the semantics of which are up to the distro to define.
The sets are meant to be depsolved and the results returned to
Manifest() as a map with the same keys.
No functional change.
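For illustration only, a minimal sketch of the intended flow with simplified stand-in types (the real distro and rpmmd signatures differ):

package main

import "fmt"

// PackageSet is a simplified stand-in for an include/exclude list that a
// distro wants depsolved.
type PackageSet struct {
    Include []string
    Exclude []string
}

// depsolve is a placeholder for a real depsolver call.
func depsolve(set PackageSet) []string {
    return set.Include
}

func main() {
    // The distro returns named package sets; what the names mean is up to it.
    packageSets := map[string]PackageSet{
        "build": {Include: []string{"rpm", "dnf"}},
        "os":    {Include: []string{"kernel", "@core"}, Exclude: []string{"dracut-config-rescue"}},
    }

    // Each set is depsolved and the results are handed back to Manifest()
    // under the same keys.
    depsolved := make(map[string][]string, len(packageSets))
    for name, set := range packageSets {
        depsolved[name] = depsolve(set)
    }
    fmt.Println("depsolved", len(depsolved), "package sets")
}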
Signed-off-by: Tom Gundersen <teg@jklm.no>
Add the TargetResult struct to OSBuildJobResult. Include the 'options'
interface on TargetResult to contain target-specific information,
for example amiID and region from AWS. Expose 'options' on a status
call as an UploadStatus field. Add logic to support AWS within this
format, which can be used as a template for other targets.
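A rough sketch of the shape this takes (type and field names here are assumptions, not the exact worker package API):

package worker

// TargetResult is a sketch of the per-target entry added to OSBuildJobResult.
type TargetResult struct {
    Name    string      `json:"name"`
    Options interface{} `json:"options,omitempty"` // target-specific, exposed as an UploadStatus field on status calls
}

// AWSTargetResultOptions shows the AWS case as an example of target-specific
// options; other targets can follow the same pattern.
type AWSTargetResultOptions struct {
    AmiID  string `json:"ami"`
    Region string `json:"region"`
}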
Expose a more detailed job status result - specifically, include upload status
alongside image status. Expand openapi.yml accordingly and add an UploadStatus
field to the OSBuildJobResult struct. At the moment, only represent the
"success" and "failure" states of UploadStatus - to differentiate between
"pending" and "running" would involve significant design decisions and should be
addressed in a separate commit.
The JobArgs() function is replaced with a more general Job() function
that returns all the parameters originally used to define a job during
Enqueue(). This new function gives access to a job's type in the queue,
which wasn't available until now (except when dequeueing).
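Sketched as an interface, assuming names close to the ones above (the real signature may differ):

package jobqueue

import (
    "encoding/json"

    "github.com/google/uuid"
)

// JobQueue sketch: Job() returns everything that was passed to Enqueue(),
// including the job type and dependencies.
type JobQueue interface {
    Enqueue(jobType string, args interface{}, dependencies []uuid.UUID) (uuid.UUID, error)
    Job(id uuid.UUID) (jobType string, args json.RawMessage, dependencies []uuid.UUID, err error)
}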
1. Test retrieving the job arguments from the worker using an
OSBuildJob.
Very basic test with an empty manifest.
2. Test retrieving job manifests through the Koji API. Uses the existing
TestCompose test. The returned manifests are empty for all cases.
Wraps the jobqueue.JobArgs() method and unmarshals the data into the
provided concrete struct value. The struct should match the requested
job type of the given ID.
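A minimal sketch of such a wrapper (receiver and helper names are assumed):

package worker

import (
    "encoding/json"

    "github.com/google/uuid"
)

// argSource stands in for the jobqueue method being wrapped.
type argSource interface {
    JobArgs(id uuid.UUID) (json.RawMessage, error)
}

type Server struct {
    jobs argSource
}

// JobArgs unmarshals the stored arguments of the given job into the
// caller-provided struct, which must match the job's type.
func (s *Server) JobArgs(id uuid.UUID, args interface{}) error {
    rawArgs, err := s.jobs.JobArgs(id)
    if err != nil {
        return err
    }
    return json.Unmarshal(rawArgs, args)
}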
testjobqueue did not implement the JobQueue interface correctly (noted
in its package comment), making it impossible to write tests for
JobQueue itself.
Replace its use everywhere with fsjobqueue operating on a temporary
directory.
Imagine this situation: You have a RHEL system booted from an image produced
by osbuild-composer. On this system, you want to use osbuild-composer to
create another image of RHEL.
However, there's currently something funny with partitions:
All RHEL images built by osbuild-composer contain a root xfs partition. The
interesting bit is that they all share the same xfs partition UUID. This might
sound like a good thing for reproducibility but it has a quirk.
The issue appears when osbuild runs the qemu assembler: it needs to mount all
partitions of the future image to copy the OS tree into it.
Imagine that osbuild-composer is running on a system booted from an image
produced by osbuild-composer. This means that its root xfs partition has this
uuid:
efe8afea-c0a8-45dc-8e6e-499279f6fa5d
When osbuild-composer builds an image on this system, it runs osbuild that
runs the qemu assembler at some point. As I said previously, it will mount
all partitions of the future image. That means that it will also try to
mount the root xfs partition with this uuid:
efe8afea-c0a8-45dc-8e6e-499279f6fa5d
Do you remember this one? Yeah, it's the same one as before. However, the xfs
kernel driver doesn't like that. It contains a global table[1] of all xfs
partitions that forbids mounting two xfs partitions with the same uuid.
I mean... uuids are meant to be unique, right?
This commit changes the way we build RHEL 8.4 images: Each one now has a
unique uuid. It's now literally a unique universally unique identifier. haha
[1]: a349e4c659/fs/xfs/xfs_mount.c (L51)
Previously, we had no clue which errors were caught by echo's default
error handler. Thus, in the case of an error, we were basically blind. Let's
log all errors so we can investigate them later.
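For instance, something along these lines (a sketch, not necessarily the exact handler used in composer):

package server

import "github.com/labstack/echo/v4"

func newEcho() *echo.Echo {
    e := echo.New()
    e.HTTPErrorHandler = func(err error, c echo.Context) {
        // Log every error before delegating to the default handler, so
        // failures are no longer invisible.
        c.Logger().Errorf("handler error on %s: %v", c.Path(), err)
        e.DefaultHTTPErrorHandler(err, c)
    }
    return e
}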
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The previous code smelled a bit (e.g. the Server.server field), so I decided
to rewrite it in the style of the much nicer koji server.
Not a functional change.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Serializing an interface does not work, so let us simply use the string
representation and treat the empty string as no error. This is
compatible with the current API in the success case, and fixes the
error case, which is currently broken.
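The shape of the fix is roughly the following (type and field names are assumed for illustration):

package worker

// KojiFinalizeJobResult sketch: the error is stored as a plain string because
// serializing an interface value does not round-trip; an empty string means
// the job succeeded.
type KojiFinalizeJobResult struct {
    KojiError string `json:"koji_error"`
}

func (r *KojiFinalizeJobResult) Succeeded() bool {
    return r.KojiError == ""
}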
Also extend the test matrix for the kojiapi to ensure that all the
different kinds of errors can be serialized correctly and lead to
the correct status being returned.
Fixes #1079 and #1080.
The status of a job may depend on the status of its dependencies,
as we do not repeat, for instance, the failed state in each dependent
job.
Return also the list of dependencies so these can be queried too.
Most of the worker API is now untyped, but keep Enqueue() typed to
ensure the job objects match the names in the queue. This means we
must add a version of Enqueue() for each job type we support.
We must special-case the treatment of architecture, to select the
correct remote worker for any job that requires a specific
architecture. For now this means any jobs that run osbuild.
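One such typed wrapper might look like this (the names and the arch-qualified job type are assumptions):

package worker

import "github.com/google/uuid"

// enqueuer stands in for the underlying, untyped job queue.
type enqueuer interface {
    Enqueue(jobType string, args interface{}, dependencies []uuid.UUID) (uuid.UUID, error)
}

type Server struct {
    jobs enqueuer
}

type OSBuildJob struct {
    // manifest, targets, ...
}

// EnqueueOSBuild is one typed Enqueue() variant; folding the architecture
// into the job type means only workers of that architecture pick the job up.
func (s *Server) EnqueueOSBuild(arch string, job *OSBuildJob) (uuid.UUID, error) {
    return s.jobs.Enqueue("osbuild:"+arch, job, nil)
}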
The three new job types osbuild-koji, koji-init, and koji-finalize
allow the different tasks to be split apart and, in particular, allow
several builds on different architectures as part of a given compose.
In addition to the arguments passed when scheduling a job, a job now
also takes the results of its dependencies as additional arguments. We
call these dynamic arguments, for lack of a better term.
The immediate use-case for this is to allow koji jobs to be split up
as follows:
- koji-init: Creates a koji build, and returns us a token.
- osbuild-koji: one job per architecture, depending on koji-init
having succeeded. Builds the image, and uploads it to koji,
returning metadata about the image produced.
- koji-finalize: uses the token from koji-init and the metadata
from osbuild-koji to import the build into koji if it succeeded
or mark it as failed if it failed.
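For illustration, the shape of a dequeued job with dynamic arguments could look like this (types and values are made up):

package main

import (
    "encoding/json"
    "fmt"
)

// dequeuedJob is illustrative only: alongside the arguments given at
// Enqueue(), a dequeued job carries the results of its dependencies as
// dynamic arguments.
type dequeuedJob struct {
    Type    string
    Args    json.RawMessage   // static args from Enqueue()
    DynArgs []json.RawMessage // one result per dependency, in dependency order
}

func main() {
    // koji-finalize sees the koji-init token and the osbuild-koji image
    // metadata as dynamic arguments.
    finalize := dequeuedJob{
        Type: "koji-finalize",
        Args: json.RawMessage(`{"server":"https://koji.example.org"}`),
        DynArgs: []json.RawMessage{
            json.RawMessage(`{"token":"abc-123"}`),              // from koji-init
            json.RawMessage(`{"arch":"x86_64","image":"disk"}`), // from osbuild-koji
        },
    }
    fmt.Println(finalize.Type, "received", len(finalize.DynArgs), "dynamic args")
}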
Similarly to the recent changes to Dequeue(), let the caller unmarshal the
returned JSON. This allows us to pass the result on without being able
to unmarshal it.
In follow-up patches, we will pass results of jobs to dependent jobs,
but the worker API does not know about the different job types, nor how
to unmarshal them.
Once a job has been enqueued, there is no way to query its dependencies.
This makes dequeue more symmetric to enqueue by returning the
dependencies that were passed to enqueue, allowing the caller to
query the dependencies and their results.
Signed-off-by: Tom Gundersen <teg@jklm.no>
The worker server was heavily tied to OSBuildJob(Result). Untie it so
that it can deal with different job types in the future.
This necessitates a change in the jobqueue: Dequeue() now returns the
job type, as well as job arguments as json.RawMessage. This is so that
the server can wait on multiple job types with different argument
types.
The weldr, composer, and koji APIs continue to use only "osbuild" jobs.
Move the fact that the worker is requesting jobs of type "osbuild" out
of the client library.
For one, require consumers to pass accepted job types to RequestJobs()
and allow querying for the job type with the new Type() function.
Also, make OSBuildArgs() and Update() generic, requiring callers to pass
an argument that matches the job type.
Workers reported status via an `osbuild.Result`, which only includes
osbuild output. Make it report OSBuildJobResult instead, which was meant
to be used for this purpose and is already used as the result type in
the jobqueue.
While at it, add any errors produced by targets into this struct, as
well as an overall success flag.
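A simplified sketch of what the reported result now carries (the real struct has more fields):

package worker

// osbuildResult stands in for osbuild's own result type in this sketch.
type osbuildResult struct {
    Success bool `json:"success"`
}

// OSBuildJobResult: osbuild's output plus per-target errors and an overall
// success flag.
type OSBuildJobResult struct {
    Success       bool           `json:"success"`
    OSBuildOutput *osbuildResult `json:"osbuild_output,omitempty"`
    TargetErrors  []string       `json:"target_errors,omitempty"`
}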
Note that this breaks older workers returning the result of an osbuild
job to a new composer. I think this is fine in this case, for two
reasons:
1. We don't support running different versions of the worker and
composer in the weldr API, and remote workers aren't widely used yet.
2. Both osbuild.Result and worker.OSBuildJobResult have a top-level
`Success` boolean. Thus, logs are lost in such cases, but the overall
status of the compose is not.
Add "image_name" and "stream_optimized" fields to the osbuild job as
replacement for the local target options. The former signifies the name
of the uploaded artifact and whether an artifact should be uploaded at
all (only weldr API). The latter will be deprecated at some point, when
osbuild itself can make stream-optimized vmdk images.
This change separates what have always been two distinct concepts:
artifacts that are reported back to the composer node (in practice
always running on the same machine), and upload targets to clouds and
such. Separating them makes it easier to add job types that only allow
one upload target while keeping artifacts.
Keep the local target around, so that jobs that are scheduled can still
be run after an upgrade.
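Sketched as job arguments (field names per the commit, everything else simplified):

package worker

import "encoding/json"

// OSBuildJob sketch: the two new fields replace the old local target options.
type OSBuildJob struct {
    Manifest        json.RawMessage `json:"manifest"`
    ImageName       string          `json:"image_name,omitempty"`       // artifact name; empty means no artifact is kept
    StreamOptimized bool            `json:"stream_optimized,omitempty"` // until osbuild can emit stream-optimized vmdk itself
}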
The server hasn't used common.ImageBuildState to mark a job as
successful or failed for a long time. Instead, it uses the job's result
for that (jobs don't have a high-level concept of failing).
Drop the check in the server, and always send "FINISHED" from the client
for backwards compatibility.
This state is specific to weldr. Previous commits removed it from the
other APIs, because they use different values.
Move the conversion into the weldr API.
Until now, all jobs were put as "osbuild" jobs into the job queue and
the worker API hard-coded sending an osbuild manifest and upload
targets.
Change the API to take "type" and "args" keys, which are equivalent to
the job-queue's type and args. Workers continue to support only osbuild
jobs, but this makes other jobs possible in the future.
When the remote worker socket was enabled, this was happening:
e := echo.New()
go func() {
    e.Listener = listener1
    e.Start("")
}()
e.Listener = listener2
e.Start("")
Yeah, this is a race condition. None of echo's Start methods can safely
handle multiple listeners.
This commit fixes the issue by using echo only as a router for a standard
http.Server, which handles multiple listeners in a non-racy way.
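Roughly, the fixed version looks like this (a sketch; the actual composer code differs in details):

package server

import (
    "net"
    "net/http"

    "github.com/labstack/echo/v4"
)

// serveBoth uses echo only as the router (an http.Handler); each listener
// gets its own http.Server, so the two Serve calls share no mutable state.
func serveBoth(e *echo.Echo, localListener, remoteListener net.Listener) error {
    go func() {
        srv := &http.Server{Handler: e}
        _ = srv.Serve(localListener)
    }()

    srv := &http.Server{Handler: e}
    return srv.Serve(remoteListener)
}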
The worker API returns errors of the form:
{ "message": "..." }
Print the message of those errors when receiving an error on the client.
This adds an `Error` type to openapi.yml and marks all routes as
returning it on 4XX and 5XX.
Instead of sending a `token` to workers, send back two URLs:
1. "location": URL at which the job can be inspected (GET) and updated
(PATCH).
2. "artifact_location": URL at which artifacts should be uploaded to.
The actual URLs remain the same, but a client does not need to stitch
them together manually (except appending the artifact's name).
Unfortunately, the client code generated by `deepmap` does not lend
itself to this style of APIs. Use standard http.Client again, which is a
partial revert of 0962fbd30.
Don't give out job ids to workers, but `tokens`, which serve as an
indirection. This way, restarting composer won't confuse it when a stray
worker returns a result for a job that was still running. Also,
artifacts are only moved to the final location once a job finishes.
This change breaks backwards compatibility, but we're not yet promising
a stable worker API to anyone.
This drops the transition tests in server_test.go. These don't make much
sense anymore, because there's only one allowed transition, from running
to finished. They heavily relied on job slot ids, which are not easily
accessible with the `TestRoute` API. Overall, adjusting this seemed like
too much work for their benefit.
The code generator uses the `operationID` field to generate server
handlers, client functions, and types. Use simpler names to make the
generated code easier to read.
This kind of common base path is better set in the top-level
`server.url` field, so that it can be adjusted.
For now, drop it completely, as we already broke the consistency when
introducing the `/status` route.
This change breaks backwards compatibility, but we're not yet promising
a stable worker API to anyone.
Write an openapi spec for the worker API and use `deepmap/oapi-codegen`
to generate scaffolding for the server-side using the `labstack/echo`
server.
Incidentally, echo by default returns errors in the same format that the
worker API has always used:
{ "message": "..." }
The API itself is unchanged to make this change easier to understand. It
will be changed to better suit our needs in future commits.
In the same way `osbuild.Manifest` is the input to the osbuild API,
`osbuild.Result` is the output. Move it to the `osbuild` package where
it belongs.
This is not a functional change.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Rather than Manifest() returning an osbuild.Manifest object, introduce a
new distro.Manifest object which represents it as an opaque, JSON
serializable object. This new type has the following properties:
1) its serialization is compatible with the input to osbuild,
2) any valid osbuild input can be deserialized into it, and
3) marshalling and unmarshaling to and from JSON is lossless.
This means that even as we change the subset of valid osbuild manifests
that we support, we can still load any previous state from disk, and it
will continue to work just as before, even though we can no longer
deserialize it into our internal notion of osbuild.Manifest.
This fixes the underlying problem of which #685 was a symptom.
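The type itself can be as small as this sketch (close to the idea described above; the actual implementation may differ):

package distro

import "encoding/json"

// Manifest is an opaque, JSON-serializable blob: any valid osbuild input can
// be stored in it, and marshalling round-trips losslessly.
type Manifest []byte

func (m Manifest) MarshalJSON() ([]byte, error) {
    return json.RawMessage(m).MarshalJSON()
}

func (m *Manifest) UnmarshalJSON(payload []byte) error {
    *m = Manifest(payload)
    return nil
}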
Signed-off-by: Tom Gundersen <teg@jklm.no>