This state is specific to weldr. Previous commits removed it from the
other APIs, because they use different values.
Move the conversion into the weldr API.
Until now, all jobs were put as "osbuild" jobs into the job queue and
the worker API hard-coded sending an osbuild manifest and upload
targets.
Change the API to take "type" and "args" keys, which are equivalent to
the job-queue's type and args. Workers continue to support only osbuild
jobs, but this makes other jobs possible in the future.
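
A minimal sketch of such a message in Go, for illustration only. Only the
"type" and "args" keys come from the text above; the `job` struct, the
`decodeJob` helper, and the contents of "args" are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// job mirrors the two keys named above. Args stays raw so that each job
// type can define its own argument format (hypothetical layout).
type job struct {
	Type string          `json:"type"`
	Args json.RawMessage `json:"args"`
}

// decodeJob parses a job message and rejects every type except "osbuild",
// the only one workers support for now.
func decodeJob(body []byte) (*job, error) {
	var j job
	if err := json.Unmarshal(body, &j); err != nil {
		return nil, err
	}
	if j.Type != "osbuild" {
		return nil, fmt.Errorf("unsupported job type %q", j.Type)
	}
	return &j, nil
}

func main() {
	j, err := decodeJob([]byte(`{"type":"osbuild","args":{"manifest":{}}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(j.Type, string(j.Args))
}
```

Keeping "args" as raw JSON means a new job type only needs a new decoder,
not a change to the envelope.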
When the remote worker socket was enabled, this was happening:
    e := echo.New()
    go func() {
        e.Listener = listener1
        e.Start("")
    }()
    e.Listener = listener2
    e.Start("")
Yeah, this is a race condition. None of echo's Start methods can safely
handle multiple listeners.
This commit fixes this issue by using Echo only as a router for a standard
http.Server, which handles multiple listeners in a non-racy way.
Instead of sending a `token` to workers, send back two URLs:
1. "location": URL at which the job can be inspected (GET) and updated
(PATCH).
2. "artifact_location": URL to which artifacts should be uploaded.
The actual URLs remain the same, but a client does not need to stitch
them together manually (except appending the artifact's name).
Unfortunately, the client code generated by `deepmap` does not lend
itself to this style of API. Use the standard http.Client again, which is a
partial revert of 0962fbd30.
Don't give out job ids to workers, but `tokens`, which serve as an
indirection. This way, restarting composer won't confuse it when a stray
worker returns a result for a job that was still running. Also,
artifacts are only moved to the final location once a job finishes.
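
One way to picture the indirection, as a sketch and not the actual
implementation: an in-memory table maps tokens to job ids, so a token
issued before a restart is simply unknown afterwards. The `tokenTable`
type and its methods are hypothetical, and keeping the table only in
memory is an assumption made for this example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
)

// tokenTable maps worker-facing tokens to internal job ids.
type tokenTable struct {
	mu   sync.Mutex
	jobs map[string]string // token -> job id
}

func newTokenTable() *tokenTable {
	return &tokenTable{jobs: make(map[string]string)}
}

// issue creates a fresh random token for a job that a worker just picked up.
func (t *tokenTable) issue(jobID string) (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	token := hex.EncodeToString(buf)
	t.mu.Lock()
	defer t.mu.Unlock()
	t.jobs[token] = jobID
	return token, nil
}

// finish resolves a token to its job id and invalidates it. Unknown
// tokens, for example from a stray worker, report ok == false.
func (t *tokenTable) finish(token string) (jobID string, ok bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	jobID, ok = t.jobs[token]
	delete(t.jobs, token)
	return jobID, ok
}

func main() {
	table := newTokenTable()
	token, err := table.issue("job-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(table.finish(token))   // known token: resolves to the job
	fmt.Println(table.finish(token))   // already used: rejected
	fmt.Println(table.finish("stray")) // never issued: rejected
}
```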
This change breaks backwards compatibility, but we're not yet promising
a stable worker API to anyone.
This drops the transition tests in server_test.go. These don't make much
sense anymore, because there's only one allowed transition, from running
to finished. They heavily relied on job slot ids, which are not easily
accessible with the `TestRoute` API. Overall, adjusting this seemed like
too much work for their benefit.
The code generator uses the `operationID` field to generate server
handlers, client functions, and types. Use simpler names to make the
generated code easier to read.
Write an openapi spec for the worker API and use `deepmap/oapi-codegen`
to generate scaffolding for the server-side using the `labstack/echo`
server.
Incidentally, echo by default returns the errors in the same format that
worker API always has:
    { "message": "..." }
The API itself is unchanged to make this change easier to understand. It
will be changed to better suit our needs in future commits.
Rather than Manifest() returning an osbuild.Manifest object, introduce a
new distro.Manifest object which represents it as an opaque, JSON
serializable object. This new type has the following properties:
1) its serialization is compatible with the input to osbuild,
2) any valid osbuild input can be deserialized into it, and
3) marshalling and unmarshalling to and from JSON is lossless.
This means that even as we change the subset of valid osbuild manifests
that we support, we can still load any previous state from disk, and it
will continue to work just as before, even though we can no longer
deserialize it into our internal notion of osbuild.Manifest.
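
To illustrate the idea, here is a sketch of an opaque manifest type that
stores the raw JSON bytes, in the same way `json.RawMessage` does. The
type body, the `savedState` wrapper, and the stage name in the example are
assumptions for this sketch, not necessarily what the commit implements.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Manifest keeps the manifest as raw JSON and never interprets it.
type Manifest []byte

// MarshalJSON emits the stored bytes unchanged.
func (m Manifest) MarshalJSON() ([]byte, error) {
	if m == nil {
		return []byte("null"), nil
	}
	return m, nil
}

// UnmarshalJSON copies the input bytes without looking inside them, so
// any valid JSON document is accepted.
func (m *Manifest) UnmarshalJSON(data []byte) error {
	*m = append((*m)[:0], data...)
	return nil
}

// savedState is a hypothetical piece of on-disk state embedding a manifest.
type savedState struct {
	Manifest Manifest `json:"manifest"`
}

// roundTrip loads state from JSON and serializes it again.
func roundTrip(in []byte) ([]byte, error) {
	var s savedState
	if err := json.Unmarshal(in, &s); err != nil {
		return nil, err
	}
	return json.Marshal(s)
}

func main() {
	// The stage below is made up; the point is that it need not be known.
	in := []byte(`{"manifest":{"pipeline":{"stages":[{"name":"org.osbuild.future","options":{"x":1}}]}}}`)
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out) == string(in))
}
```

Note that `encoding/json` compacts whitespace in the output of
`MarshalJSON`, so the round trip preserves the JSON value rather than
every input byte.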
This fixes the underlying problem of which #685 was a symptom.
Signed-off-by: Tom Gundersen <teg@jklm.no>
There are times when it would be good to monitor that osbuild-composer
is up and running. Add a very simple status check that always returns
200/OK. This can be expanded later to verify that other parts of
osbuild-composer are working properly.
Signed-off-by: Major Hayden <major@redhat.com>
The `jobs/:job_id/builds/:build_id/image` route was awkward: the
`:job_id` was actually weldr's compose id and `:build_id` was always `0`.
Change it to `jobs/:job_id/artifacts/:name`, where `:job_id` is now a
job id, and `:name` is the name of the artifact to upload. In the
future, it could support uploading more than one artifact.
This allows removing outputs from `store`, which is now back to being a
pure JSON-store. Take care that `weldr` returns (and deletes) images
from the new (or for backwards compatibility, the old) location.
The `org.osbuild.local` target continues to exist as a marker for the
worker to know whether it should upload artifacts.
Here is why I merged JobResult and JobStatus:
Both methods actually called jobqueue.JobStatus(), so performance-wise
there's no real difference and it feels to me that it makes the code simpler:
You don't have to decide which method to call, you just get all the data
about a job in one call. We could split those again when we see some perf
issues with retrieving logs on each status check but I don't think we should
optimize prematurely. Let's leave some work for the future us.
Personally, I don't like methods returning too many values. I think it's
easy to assign the values in an incorrect order and thus interpret them
in the wrong way. Also, the following commits will make JobStatus() also
return the JobResult, which would make the number of returned values even
higher.
Not a functional change.
The enum is redundant information that can be deduced from the job's
times: queuedAt, startedAt, and finishedAt. Not having it reduces the
potential for inconsistent state.
The store is responsible for two things: user state and the compose queue. This
is problematic, because the rcm API has slightly different semantics from weldr
and only used the queue part of the store. Also, the store is simply too
complex.
This commit splits the queue part out, using the new jobqueue package in both
the weldr and the rcm package. The queue is saved to a new directory `queue/`.
The weldr package now also has access to a worker server to enqueue and list
jobs. Its store continues to track composes, but the `QueueStatus` for each
compose (and image build) is deprecated. The field in `ImageBuild` is kept for
backwards compatibility for composes which finished before this change, but a
lot of code dealing with it in package compose is dropped.
store.PushCompose() is degraded to storing a new compose. It should probably be
renamed in the future. store.PopJob() is removed.
Job ids are now independent of compose ids. Because of that, the local
target gains ComposeId and ImageBuildId fields, because a worker cannot
infer those from a job anymore. This also necessitates a change in the
worker API: the job routes now expect a job id instead of a
(compose id, image build id) pair. The route that accepts built images
keeps that pair, because it reports the image back to weldr.
worker.Server() now interacts with a job queue instead of the store. It gains
public functions that allow enqueuing an osbuild job and getting its status,
because only it knows about the specific argument and result types in the job
queue (OSBuildJob and OSBuildJobResult). One oddity remains: it needs to report
an uploaded image to weldr. Do this with a function that's passed in for now,
so that the dependency on the store can be dropped completely.
The rcm API drops its dependencies on package blueprint and store, because it
too interacts only with the worker server now.
Fixes #342