Commit graph

43 commits

Author SHA1 Message Date
Lars Karlitski
b5769add2c store: move queue out of the store
The store is responsible for two things: user state and the compose queue. This
is problematic, because the rcm API has slightly different semantics from weldr
and only used the queue part of the store. Also, the store is simply too
complex.

This commit splits the queue part out, using the new jobqueue package in both
the weldr and the rcm package. The queue is saved to a new directory `queue/`.

The weldr package now also has access to a worker server to enqueue and list
jobs. Its store continues to track composes, but the `QueueStatus` for each
compose (and image build) is deprecated. The field in `ImageBuild` is kept for
backwards compatibility for composes which finished before this change, but a
lot of code dealing with it in package compose is dropped.

store.PushCompose() is degraded to storing a new compose. It should probably be
renamed in the future. store.PopJob() is removed.

Job ids are now independent of compose ids. Because of that, the local
target gains ComposeId and ImageBuildId fields, because a worker cannot
infer those from a job anymore. This also necessitates a change in the
worker API: the job routes are changed to expect that instead of a
(compose id, image build id) pair. The route that accepts built images
keeps that pair, because it reports the image back to weldr.

worker.Server() now interacts with a job queue instead of the store. It gains
public functions that allow enqueuing an osbuild job and getting its status,
because only it knows about the specific argument and result types in the job
queue (OSBuildJob and OSBuildJobResult). One oddity remains: it needs to report
an uploaded image to weldr. Do this with a function that's passed in for now,
so that the dependency on the store can be dropped completely.

The rcm API drops its dependencies on package blueprint and store, because it
too interacts only with the worker server now.

Fixes #342
2020-05-08 14:53:00 +02:00
Ondřej Budai
6eb43c3d97 worker: add a support for uploads to azure
Everything else is already implemented; this commit just connects the bits
and pieces in worker.
2020-04-29 18:15:13 +02:00
Ondřej Budai
b916a88242 worker: fix passing the result from osbuild when it fails
I tried fixing this in 181128c5 and forgot to pass the right error in one
place. This commit fixes it.
2020-04-29 11:40:36 +02:00
Ondřej Budai
181128c5b9 worker: fix missing logs when osbuild fails
The commit 2435163f broke sending the logs to osbuild-composer. This was
partly because of unusual error handling in the RunOSBuild function.

This commit fixes that by creating a custom error and properly propagating
the result from it.
2020-04-27 19:36:22 +02:00
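The custom-error approach described in 181128c5b9 can be illustrated with a minimal Go sketch. All names here (`OSBuildResult`, `OSBuildError`, `runOSBuild`, `resultOf`) are hypothetical stand-ins chosen for this example, not the identifiers used in the commit; the point is only that an error value can carry the result of a failed run so the logs are not lost.

```go
package main

import (
	"errors"
	"fmt"
)

// OSBuildResult is a hypothetical stand-in for what osbuild reports.
type OSBuildResult struct {
	Success bool
	Log     string
}

// OSBuildError is a hypothetical custom error that keeps the result of a
// failed run, so that callers can still forward the logs.
type OSBuildError struct {
	Message string
	Result  *OSBuildResult
}

func (e *OSBuildError) Error() string { return e.Message }

// runOSBuild simulates a run; a real implementation would execute osbuild.
func runOSBuild(fail bool) (*OSBuildResult, error) {
	result := &OSBuildResult{Success: !fail, Log: "stage output"}
	if fail {
		return nil, &OSBuildError{Message: "osbuild failed", Result: result}
	}
	return result, nil
}

// resultOf returns the result of a run, recovering it from the error
// when the run failed with an OSBuildError.
func resultOf(result *OSBuildResult, err error) (*OSBuildResult, error) {
	var buildErr *OSBuildError
	if errors.As(err, &buildErr) {
		return buildErr.Result, nil
	}
	return result, err
}

func main() {
	result, err := resultOf(runOSBuild(true))
	fmt.Println(result.Success, result.Log, err)
}
```

Using `errors.As` keeps the recovery working even if the error is wrapped on its way up the call chain.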
Lars Karlitski
ac40b0e73b jobqueue: rename to worker
This package does not contain an actual queue, but a server and client
implementation for osbuild's worker API. Name it accordingly.

The queue is in package `store` right now, but about to be split off.
This rename makes the `jobqueue` name free for that effort.
2020-04-16 01:02:16 +02:00
Lars Karlitski
2435163fc9 worker: move running osbuild into separate function
Setting up a command to run is quite involved. Separate that from the
logic of running it.
2020-04-06 12:11:54 +02:00
Lars Karlitski
1ece08414c jobqueue: move Job.Run() to the worker
This makes the jobqueue package independent of forking osbuild, the
choices for which (exact invocation, location of the cache directory)
should be made in the worker.
2020-04-06 12:11:54 +02:00
Lars Karlitski
d3b9a3515d worker: inline handleJob()
It's a small function that's only called once.
2020-04-06 12:11:54 +02:00
Lars Karlitski
db5dd1ee2c worker: remove redundant UpdateJob() call
A job is already set to be running when it is returned from the API (see
Store.PopJob()).
2020-04-06 12:11:54 +02:00
Lars Karlitski
1f06d78362 jobqueue: rename ID to ComposeID in job structs
It's not an id of the job, but the compose id.
2020-04-06 12:11:54 +02:00
Lars Karlitski
3b5d5a73d3 worker: drop default port
We require passing the address from the unit file. Do the same for the
socket, using host:port syntax.

Overriding the port was broken before, because we unconditionally
appended ":8700" to every address.
2020-03-25 14:05:44 +01:00
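The port problem described in 3b5d5a73d3 can be reproduced in a few lines. This is a sketch under assumptions: `legacyAddress` and `validateAddress` are hypothetical helpers written for this example, and only `net.SplitHostPort` is a real standard library call.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// legacyAddress mimics the behaviour described as broken: the default
// port is appended even when the caller already chose one.
func legacyAddress(address string) string {
	return address + ":8700"
}

// validateAddress accepts only addresses in explicit host:port syntax.
func validateAddress(address string) (string, string, error) {
	host, port, err := net.SplitHostPort(address)
	if err != nil {
		return "", "", fmt.Errorf("address must use host:port syntax: %w", err)
	}
	if port == "" {
		return "", "", errors.New("address must include a port")
	}
	return host, port, nil
}

func main() {
	// Appending unconditionally produces an address with two ports.
	fmt.Println(legacyAddress("example.com:9000"))

	host, port, err := validateAddress("example.com:9000")
	fmt.Println(host, port, err)
}
```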
Lars Karlitski
f8982f4a1a worker: don't hard code path to unix domain socket
Introduce a mandatory argument `address`, which is interpreted as a path
to a unix socket when `-unix` is given or a network address otherwise.

Move the default path to the service file.

Add a more useful usage message when passing `-help` or no arguments.
2020-03-25 14:05:44 +01:00
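A command line like the one f8982f4a1a describes could be parsed roughly as follows. This is a sketch using the standard `flag` package; the function name `parseArgs`, the usage text, and the example socket path are assumptions made for illustration and are not taken from the worker's source.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// parseArgs returns the network ("unix" or "tcp") and the mandatory address.
func parseArgs(args []string) (string, string, error) {
	flags := flag.NewFlagSet("osbuild-worker", flag.ContinueOnError)
	flags.SetOutput(os.Stderr)
	unix := flags.Bool("unix", false, "interpret address as a path to a unix domain socket")
	flags.Usage = func() {
		fmt.Fprintln(os.Stderr, "Usage: osbuild-worker [-unix] address")
		flags.PrintDefaults()
	}

	// Parse reports flag.ErrHelp for -help after printing the usage message.
	if err := flags.Parse(args); err != nil {
		return "", "", err
	}
	if flags.NArg() != 1 {
		flags.Usage()
		return "", "", errors.New("exactly one address argument is required")
	}
	if *unix {
		return "unix", flags.Arg(0), nil
	}
	return "tcp", flags.Arg(0), nil
}

func main() {
	// The socket path is a made-up example value.
	network, address, err := parseArgs([]string{"-unix", "/run/example/job.socket"})
	fmt.Println(network, address, err)
}
```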
Lars Karlitski
b5432e78b9 worker: move ComposerClient to jobqueue package
This moves the client code into the same package as the server code,
which makes it easier to change (and version) the two in sync. It will
also allow making some structs private to the jobqueue package and
testing `Client`.

Also rename it to jobqueue.Client.
2020-03-25 14:05:44 +01:00
Lars Karlitski
cb4421b69f worker: remoteAddress → address 2020-03-25 14:05:44 +01:00
Lars Karlitski
94183d14a8 worker: split NewClient()
Use the default dialing functions for tcp connections and set the tls
config on the transport directly. This makes the code easier to follow,
because the only special case is overriding the DialContext() for unix
connections.
2020-03-25 14:05:44 +01:00
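The split described in 94183d14a8 can be sketched with the standard `net/http` package. The function names `newTCPClient` and `newUnixClient` and the socket path are hypothetical; the sketch only shows the idea that the TCP case needs nothing beyond a TLS config on the transport, while the unix case overrides `DialContext`.

```go
package main

import (
	"context"
	"crypto/tls"
	"fmt"
	"net"
	"net/http"
)

// newTCPClient relies on the default dialer and sets the TLS
// configuration directly on the transport.
func newTCPClient(conf *tls.Config) *http.Client {
	return &http.Client{
		Transport: &http.Transport{TLSClientConfig: conf},
	}
}

// newUnixClient overrides DialContext so that every request is sent to
// the unix domain socket at path, whatever host the URL names.
func newUnixClient(path string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var dialer net.Dialer
				return dialer.DialContext(ctx, "unix", path)
			},
		},
	}
}

func main() {
	tcp := newTCPClient(&tls.Config{MinVersion: tls.VersionTLS12})
	unix := newUnixClient("/run/example/job.socket") // made-up path
	fmt.Println(tcp.Transport != nil, unix.Transport != nil)
}
```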
Lars Karlitski
845ba6e8e5 worker: don't hard code upload URL
This doesn't work with remote workers.
2020-03-25 14:05:44 +01:00
Lars Karlitski
9e71df234a worker: load tls certificates once on startup 2020-03-25 14:05:44 +01:00
Lars Karlitski
16cd243300 worker: set remoteAddress once on startup 2020-03-25 14:05:44 +01:00
Lars Karlitski
ee752b0ab8 tree-wide: panic when json marshalling fails
According to the new guidelines in docs/errors.md.

Note that this does not include code that marshals to a writer that
might fail (when a connection drops, for example).
2020-03-25 10:22:16 +01:00
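The distinction drawn in ee752b0ab8 can be illustrated as follows. This is a sketch with hypothetical names (`status`, `mustMarshal`, `writeJSON`); it assumes the guideline treats a marshalling failure of an in-memory value as a programming error, while writing to a connection can fail at runtime and therefore returns an error.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// status is a made-up payload type for this example.
type status struct {
	State string `json:"state"`
}

// mustMarshal panics when marshalling fails; for plain data structures
// such a failure can only be a programming error.
func mustMarshal(v interface{}) []byte {
	data, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return data
}

// writeJSON encodes to a writer. The writer may fail at runtime (for
// example when a connection drops), so the error is returned instead.
func writeJSON(w io.Writer, v interface{}) error {
	return json.NewEncoder(w).Encode(v)
}

func main() {
	fmt.Println(string(mustMarshal(status{State: "RUNNING"})))
	if err := writeJSON(os.Stdout, status{State: "FINISHED"}); err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
	}
}
```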
Ondřej Budai
d7cbc22da4 lint: fix unhandled errors 2020-03-02 14:28:55 +01:00
Ondřej Budai
3032abfdbe lint: fix gosimple/S1028 errors 2020-03-02 14:28:55 +01:00
Ondřej Budai
820d23fd9d Add tcp and tls support for worker and job API
There's a use case for running workers on a different machine than
the composer, for example when images need to be built for an
architecture different from the one the composer is running on. Although
osbuild has some support for cross-architecture builds, we still consider it
an experimental, not-yet-production-ready feature.

This commit adds support for composer and worker to communicate using TCP.
To ensure safe communication through the wild worlds of the Internet, TLS is
not only supported but even required when using TCP. Both server and client
TLS authentication are required. This means both sides must have their own
private key/certificate pair and both certificates must be signed by the same
certificate authority. Examples of how to generate all this fancy crypto stuff
can be found in the Makefile.

Changes on the composer side:
When osbuild-remote-worker.socket is started before osbuild-composer.service,
osbuild-composer also serves the jobqueue API on this socket. The unix domain
socket is not affected by these changes: it is enabled at all times,
independently of the remote one. The osbuild-remote-worker.socket listens
on TCP port 8700 by default.

When running the composer with remote worker socket enabled, the following
files are required:
- /etc/osbuild-composer/ca-crt.pem     (CA certificate)
- /etc/osbuild-composer/composer-key.pem (composer private key)
- /etc/osbuild-composer/composer-crt.pem (composer certificate)

Changes on the worker side:
osbuild-worker now has a --remote argument taking the address of a composer
instance. When present, the worker will try to establish a TLS-secured TCP
connection with the composer. When not present, the worker will use
the unix domain socket method. The unit template file osbuild-remote-worker
was added to simplify the spawning of workers. For example,

systemctl start osbuild-remote-worker@example.com

starts a worker which will attempt to connect to the composer instance
running on the address example.com.

When running the worker with --remote argument, the following files are
required:
- /etc/osbuild-composer/ca-crt.pem     (CA certificate)
- /etc/osbuild-composer/worker-key.pem (worker private key)
- /etc/osbuild-composer/worker-crt.pem (worker certificate)

By default, osbuild-composer.service will always spawn one local worker.
If you don't want it, you need to mask the default worker unit with:
systemctl mask osbuild-worker@1.service

Closing remarks:
Remember that both composer and worker certificate must be signed by
the same CA!
2020-02-20 13:47:59 +01:00
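The mutual TLS requirement described in 820d23fd9d maps onto the standard `crypto/tls` package roughly as shown here. This is a sketch, not the project's code: the function names are hypothetical, and only the file paths are taken from the commit message. The composer side demands and verifies a client certificate; the worker side trusts the same CA and presents its own certificate.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// loadCAPool reads a PEM-encoded CA certificate into a certificate pool.
func loadCAPool(caFile string) (*x509.CertPool, error) {
	pem, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pem) {
		return nil, errors.New("no certificates found in " + caFile)
	}
	return pool, nil
}

// serverConfig is the composer side: clients must present a certificate
// signed by the CA.
func serverConfig(caFile, certFile, keyFile string) (*tls.Config, error) {
	pool, err := loadCAPool(caFile)
	if err != nil {
		return nil, err
	}
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientCAs:    pool,
		ClientAuth:   tls.RequireAndVerifyClientCert,
	}, nil
}

// clientConfig is the worker side: the server is verified against the
// same CA and the worker presents its own certificate.
func clientConfig(caFile, certFile, keyFile string) (*tls.Config, error) {
	pool, err := loadCAPool(caFile)
	if err != nil {
		return nil, err
	}
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool,
	}, nil
}

func main() {
	conf, err := serverConfig(
		"/etc/osbuild-composer/ca-crt.pem",
		"/etc/osbuild-composer/composer-crt.pem",
		"/etc/osbuild-composer/composer-key.pem",
	)
	if err != nil {
		fmt.Println("cannot load TLS material:", err)
		return
	}
	fmt.Println("client certificates required:", conf.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

Because both configurations load the same CA file, a certificate signed by any other authority is rejected during the handshake, which is why the commit stresses that both certificates must share one CA.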
Ondřej Budai
6902f730cb worker: upload local target image using jobqueue api
Prior to this commit, the local target copied the image from a worker to a
composer using the cp(1) command. This prevented the local target from
working on remote workers.

This commit switches the local target implementation to using the jobqueue
API introduced in the previous commit. I had some concerns about the speed
of this solution (imho nothing can beat a pure cp(1) implementation), but
ad hoc sanity tests showed that copying the image using the jobqueue API
when running the worker on the same machine as the composer is still
more or less instant.
2020-02-14 11:53:38 +01:00
Ondřej Budai
b64bbaa0bb api/jobqueue: move build id to url
Imho it makes more sense from a REST perspective. Also, in the future there
will be a route for uploading an image to an image build. As it's not a good
idea to transport a file inside JSON, all the parameters (compose id and image
build id) need to be inside the URL. Therefore, for the sake of consistency,
all these routes should have the compose id and image build id in the URL.

There is another solution for embedding multiple values inside an HTTP body
that allows file transport: multipart/form-data. I think using form-data
is worthwhile when doing more complex stuff; for our use case, transporting
all the metadata in the URL is the more appropriate solution.
2020-02-14 11:53:38 +01:00
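To make the idea in b64bbaa0bb concrete, here is a sketch of extracting both ids from a URL path. The route shape `/jobs/<compose id>/builds/<image build id>` and the function `parseJobPath` are assumptions invented for this example; the commit message does not state the actual route.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseJobPath extracts the ids from a hypothetical route of the form
// "/jobs/<compose id>/builds/<image build id>".
func parseJobPath(path string) (string, int, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 4 || parts[0] != "jobs" || parts[1] == "" || parts[2] != "builds" {
		return "", 0, fmt.Errorf("unexpected path %q", path)
	}
	buildID, err := strconv.Atoi(parts[3])
	if err != nil {
		return "", 0, fmt.Errorf("invalid image build id %q: %w", parts[3], err)
	}
	return parts[1], buildID, nil
}

func main() {
	// The compose id below is a made-up UUID.
	composeID, buildID, err := parseJobPath("/jobs/30000000-0000-0000-0000-000000000000/builds/0")
	fmt.Println(composeID, buildID, err)
}
```

With the ids in the path, the request body stays free for the payload itself, for instance the raw bytes of an image.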
Ondřej Budai
cc00e0cdc9 drop the Compose.Image field
Everything that this field contained can be computed in another way:

- path: just lookup the local target and read the path from there
- mime: can be derived from distribution and compose output type
- size: can be derived from the path

Therefore it imho doesn't make much sense to store this information multiple
times.
2020-02-14 11:53:38 +01:00
Martin Sehnoutka
05b1093170 osbuild-pipeline: use the new types 2020-02-12 11:17:26 +01:00
Ondřej Budai
8781d41da6 worker: normalize Job.Run() return types 2020-02-05 01:35:50 +01:00
Ondřej Budai
0d4479bbcd worker: save result.json in the composer instead of the worker
In the future, remote workers will be introduced. Obviously, a remote worker
cannot support the local target. Unfortunately, the current implementation of
storing the osbuild result is dependent on it.

This commit moves the responsibility of storing the osbuild result to the
composer process instead of the worker process. The result is transferred from
a worker to a composer using an extended HTTP API.
2020-02-05 01:35:50 +01:00
Lars Karlitski
df99e8b359 worker: log errors that are not returned to composer 2019-12-15 22:05:31 +01:00
Lars Karlitski
5bce59b979 worker: do not get the distro from the host
Add a `Distro` field to both `Job` structures and send that to the
worker.
2019-12-15 22:05:31 +01:00
Lars Karlitski
dff8cb56be worker: don't call log.Fatalf() unconditionally
Oops.
2019-12-15 22:05:31 +01:00
Tom Gundersen
118b185fdd osbuild-{composer/worker}: exit cleanly
Only panic on compile-time errors (e.g., built for unsupported
architecture). Otherwise, use log.Fatalf(), which is equivalent to
printing and exiting with return code 1. Only ever do this from
main(), in all other cases pass on the error object.

This is mostly relevant when the server disconnects, in which case
we'll get EOF, and will now restart cleanly instead of panicking.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-12-11 15:23:24 +01:00
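The error-handling policy from 118b185fdd can be sketched as follows: helper functions pass errors up, and only `main()` decides to exit via `log.Fatalf()`. The names `fetchJob` and `isDisconnect` are hypothetical, and the disconnect is simulated rather than coming from a real connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
)

// fetchJob simulates asking the server for a job. When the server has
// gone away, the EOF is wrapped and returned instead of panicking.
func fetchJob(connected bool) (string, error) {
	if !connected {
		return "", fmt.Errorf("fetching job: %w", io.EOF)
	}
	return "job", nil
}

// isDisconnect reports whether err was caused by the server hanging up.
func isDisconnect(err error) bool {
	return errors.Is(err, io.EOF)
}

func main() {
	job, err := fetchJob(true)
	if err != nil {
		// Only main() exits; log.Fatalf prints and exits with code 1.
		log.Fatalf("worker stopped (disconnect: %v): %v", isDisconnect(err), err)
	}
	fmt.Println(job)
}
```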
Ondřej Budai
f89a9671be store: add image struct into compose struct
As a part of f4991cb1, the ComposeEntry struct was removed from the store
package. This change made sense because this struct is connected more with the
API than with the store, which uses its own Compose struct. In addition,
converters between Compose and ComposeEntry were added. Unfortunately,
ComposeEntry contains ImageSize, which was not stored in Compose but retrieved
from the store using the GetImage method. This made those converters dependent
on the store, which was messy.

To solve this issue, this commit adds an image struct to the Compose struct.
The content of the image struct is generated on the worker side: when the
worker sets the compose status to FINISHED, it also sends an Image struct with
detailed information about the result.
2019-12-05 09:48:21 +01:00
Lars Karlitski
5dad3bfc8e worker: pass build environment to osbuild
Detect it from the host using the distro package.
2019-11-29 00:46:05 +01:00
Tom Gundersen
caff96bd4f job/run: never panic on failed job
Return the error code of the osbuild run, and an array of errors,
one for each target provided. If a target fails, all other targets
are still attempted.

If either osbuild or one of the targets returns an error, the worker
notifies osbuild-composer that the job failed.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-11-28 05:56:11 +01:00
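The behaviour described in caff96bd4f (one error per target, and later targets are still attempted after a failure) can be sketched like this. The `target` type and the functions `runTargets` and `failed` are hypothetical names for this example only.

```go
package main

import (
	"errors"
	"fmt"
)

// errUpload is a made-up error used by the example targets.
var errUpload = errors.New("upload failed")

// target is a hypothetical stand-in for one upload or copy step.
type target func() error

// runTargets attempts every target, even after an earlier one failed,
// and returns one error slot per target (nil on success).
func runTargets(targets []target) []error {
	errs := make([]error, len(targets))
	for i, run := range targets {
		errs[i] = run()
	}
	return errs
}

// failed reports whether the job must be reported as failed: either
// osbuild itself or at least one target returned an error.
func failed(osbuildErr error, targetErrs []error) bool {
	if osbuildErr != nil {
		return true
	}
	for _, err := range targetErrs {
		if err != nil {
			return true
		}
	}
	return false
}

func main() {
	errs := runTargets([]target{
		func() error { return errUpload },
		func() error { return nil },
	})
	fmt.Println(errs, failed(nil, errs))
}
```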
Tom Gundersen
aa404dcb99 worker: move Job type to the jobqueue package
The main purpose of this is to share the structs between the server
and the client, and let the compiler ensure that our marshaling and
unmarshaling matches.

In the future we also want to make it easier to write unittests for
this code.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-29 16:08:54 +01:00
Ondřej Budai
15b82a15d2 osbuild-composer: Rename module to github.com/osbuild/osbuild-composer
This should be the best practice according to other popular go projects:
- https://github.com/prometheus/prometheus
- https://github.com/syncthing/syncthing
- https://github.com/drone/drone
- https://github.com/hashicorp/terraform

Also, this change fixes go get command (it currently fails due to bad package
name).
2019-10-08 21:44:57 +02:00
Tom Gundersen
0880014edf jobqueue: cleanup API a bit and unify the two job stores
Let the store in weldr be the only one that keeps state, and push
updates directly there. This fixes a bug where there was an ID mismatch.

Change the API to not let the caller pick the UUID, but provide it
in the response. Use the same UUID as is used to identify composes,
this makes it simpler to trace what is going on.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-10-07 10:37:43 +02:00
Tom Gundersen
96f3cbf655 weldr/store/compose: store information exposed in the API
Use the exact same status strings as are used in the API,
making it clearer that they are the same (and avoiding any
translation). Remember the creation/start/finish timestamps,
and store the output type.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-29 14:51:22 +02:00
Tom Gundersen
d231694bae worker/target/local: copy to the right location
We were copying the containing directory from our object store, we
only want the contents.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-28 17:49:07 +02:00
Tom Gundersen
7625d26ff5 pipeline/target: implement as variant types
Go doesn't really do variants, so we must somehow emulate them. The
JSON objects we use are essentially tagged unions, with a `name`
field in reverse domain name notation identifying the type and a
type-specific `options` object.

In Go we represent this by having a BarOptions interface, which
declares a private method `isBarOptions()`, making sure that only
types in the same package are able to implement it. Each type FooBar
that should belong to the variant implements the interface and has a
constructor `NewFooBar(options *FooBarOptions) *Bar` that makes sure
the `name` field is set correctly.

This would be enough to represent our types and marshal them into
JSON, but unmarshalling would not work (json does not know about
our tags, so it would not know which concrete types to unmarshal to).
We therefore must also implement the Unmarshaler interface for Bar,
to select the right types for the Options field.

We implement this logic for Target, Stage and Assembler. A handful
of concrete types are also implemented, matching what osbuild
supports.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-09-28 17:49:07 +02:00
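The variant pattern described in 7625d26ff5 can be sketched for a single variant. The type names follow the naming scheme from the commit message, but the `Location` field, the name string `org.osbuild.local`, and the `roundTrip` helper are assumptions made for this illustration and may differ from the real code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TargetOptions is the variant interface. The private method keeps it
// closed: only types in this package can implement it.
type TargetOptions interface {
	isTargetOptions()
}

// Target is the tagged union: Name selects the concrete Options type.
type Target struct {
	Name    string        `json:"name"`
	Options TargetOptions `json:"options"`
}

// LocalTargetOptions is one member of the variant (fields assumed).
type LocalTargetOptions struct {
	Location string `json:"location"`
}

func (LocalTargetOptions) isTargetOptions() {}

// NewLocalTarget makes sure the name tag matches the options type.
func NewLocalTarget(options *LocalTargetOptions) *Target {
	return &Target{Name: "org.osbuild.local", Options: options}
}

// UnmarshalJSON implements json.Unmarshaler: it reads the tag first and
// then decodes the options into the matching concrete type.
func (t *Target) UnmarshalJSON(data []byte) error {
	var raw struct {
		Name    string          `json:"name"`
		Options json.RawMessage `json:"options"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	var options TargetOptions
	switch raw.Name {
	case "org.osbuild.local":
		options = new(LocalTargetOptions)
	default:
		return fmt.Errorf("unexpected target name %q", raw.Name)
	}
	if err := json.Unmarshal(raw.Options, options); err != nil {
		return err
	}
	t.Name = raw.Name
	t.Options = options
	return nil
}

// roundTrip marshals a target and unmarshals the result again.
func roundTrip(in *Target) (*Target, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	out := new(Target)
	if err := json.Unmarshal(data, out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(NewLocalTarget(&LocalTargetOptions{Location: "/tmp/out"}))
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%s %+v\n", out.Name, out.Options)
}
```

Marshalling needs no custom code because the interface value already holds a concrete type; only decoding has to consult the `name` tag.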
Lars Karlitski
87bcd7f9d3 worker: run osbuild 2019-09-27 17:42:52 +02:00
Lars Karlitski
cfe89eaed5 Add simple osbuild-worker
It doesn't actually build anything yet, but talks to the queue API.
2019-09-26 23:58:03 +02:00