We must special-case the treatment of architecture, to select the
correct remote worker for any job that requires a specific
architecture. For now this means any jobs that run osbuild.
The three new job types osbuild-koji, koji-init, and koji-finalize
allows the different tasks to be split appart and in particular for
there to be several builds on different architectures as part of a
given compose.
In addition to the arguments passed when scheduling a job, a job now
also takes the results of its dependencies as additional arguments. We
call these dynamic arguments for the lack of a better term.
The immediate use-case for this is to allow koji jobs to be split up
as follows:
- koji-init: Creates a koji build, and returns us a token.
- osbuild-koji: one job per architecture, depending on koji-init
having succeeded. Builds the image, and uploads it to koji,
returning metadata about the image produced.
- koji-finalize: uses the token from koji-init and the metadata
from osbuild-koji to import the build into koji if it succeeded
or mark it as failed if it failed.
Similarly to the recent changes to Dequeue(), let the caller unmarshal the
return JSON. This allows us to pass the result on without being able
to unmarshal it.
In follow-up patches, we will pass results of jobs to dependent jobs,
but the worker API does not know about the different job types, nor how
to unmarshal them.
Once a job has been enqueued, there is no way to query its dependencies.
This makes dequeue more symmetric to enqueue by returning the
dependencies that were passed to enqueue, allowing the caller to
query the dependencies and their results.
Signed-off-by: Tom Gundersen <teg@jklm.no>
While dependencies are purely internal, sorting and pruning them is a
reasonable optimization. However, we wish to expose them in follow-up
commits and then we want them to remain unchanged from the input.
Nothing in the internal logic seems to rely on the fact the dependencies
were sorted.
Signed-off-by: Tom Gundersen <teg@jklm.no>
The length of these is not predictable. It depends on the shortest
unique prefix in the repository and git configuration.
Just use the full one, which also makes it easier to copy the id from
`git log` or GitHub.
Change the repository path on S3 to a more predictable one. We really
only need the name of the project (static osbuild-composer for this
repository), the name of the distro (use the same as osbuild-composer's
API for consistency) and the commit SHA.
In particular, drop the PR number / branch name. Also don't remove the
dots from version numbers. All places we're using them in (paths and
URLs) support dots.
For example, osbuild composer commit xxxxxxx for fedora-33 on x86_64
will result in this URL:
osbuild-composer/fedora-33/x86_64/xxxxxxx
Jenkins has been configured to use the latest commit on a pull request
(instead of merging to master) for a long time now. Rename the variable
to reflect that.
This commit does several things:
1) Changes the Fedora 33 repos in the test case generator from development
to release ones.
2) Fixes format-request-map.json so we can generate fedora-iot-commit
"images".
3) Regenerates all the cases.
If the ostree test was run on an unsupported distro, it failed but with a
very weird error message. This commit makes the test fail fast and with a
nice message.
The downloaded image may not fit inside tmpfs, especially when testing
on a constrained VM. This commit makes the test script use a different
temporary directory while handling the possibly big image.
Remove both the package osbuild-composer-koji, and the only file it
shipped: osbuild-composer-koji.socket.
It's been deprecated since 835b556, but the backwards-compatible
solution in that commit never worked, because osbuild-composer only
checks for "osbuild-composer-api.socket" when starting up.
Since this has been meant to be deprecated for a while, just remove it
outright.
Add an "Obsoletes:" for the package, so that it gets uninstalled on
existing systems.
Add an API route that returns logs for a specific compose.
For now, this contains the result of the job, in JSON. The idea is to
put more and more of this information into structured APIs. This is a
first step to make logs available at all.
Amend koji-compose.py to check that the route exist and contains as many
"image_logs" as images that were requested (currently always 1).
Based on a patch by Chloe Kaubisch <chloe.kaubisch@gmail.com>.
The integration tests are leaving the composes (which include images) in
osbuild-composer. This can lead to exhausting the disk space we have available
on our tiny testing machines. This commit adds a removal of the composes
after each integration test is finished. This issue is not present in koji.sh
and api.sh as they use different osbuild-composer APIs that doesn't use the
artifact feature.
This issue occurred when I worked on enabling the Fedora 33 tests, see:
https://osbuildci.cloud.paas.psi.redhat.com/blue/organizations/jenkins/osbuild%2Fosbuild-composer/detail/PR-1014/23/pipeline
We claim to have self-contained test cases, but the base_tests.sh script
still requires the WORKSPACE environment variable to be set outside of
the script, which is what Jenkins does.
This patch replaces WORKSPACE with a temporary directory and modifies
Jenkinsfile to use it when collecting logs.
We have several repository definitions across the tests which is quite messy.
This commit switches the Koji test to use the "central" repository configs defined in test/data/repositories/
The worker server was heavily tied to OSBuildJob(Result). Untie it so
that it can deal with different job types in the future.
This necessitates a change in the jobqueue: Dequeue() now returns the
job type, as well as job arguments as json.RawMessage. This is so that
the server can wait on multiple job types with different argument
types.
The weldr, composer, and koji APIs continue to use only "osbuild" jobs.
Introduce JobImplementation and turn the current RunJob() into
OSBuildJobImpl. Make main() select a job impl based on job type.
This is in preparation to add additional impls.
Move the fact that the worker is requesting jobs of type "osbuild" out
of the client library.
For one, require consumers to pass accepted job types to RequestJobs()
and allow querying for the job type with the new Type() function.
Also, make OSBuildArgs() and Update() generic, requiring to pass an
argument that matches the job type.
Now, main() does not deal with OSBuildJobResult anymore, and RunJob()
doesn't return it. This means we can add more job types (i.e., different
RunJob()s) now.
WatchJob() regularly checks if a job was canceled in a goroutine. It
does so by accessing composer's `/jobs/{token}` route. However, once the
main goroutine marks the job as done (by sending PATCH to that same
route), the `token` is no longer valid and thus the route not accessible
anymore.
main() does cancel the goroutine running WatchJob, but it's not
guaranteed that it gets scheduled in time to actually stop watching the
job.
Thus, don't cancel the job when fetching the `/jobs/{token}` fails. This
means that it won't cancel the job anymore when the connection to
composer goes down.
Also, we will be able to move job.Update() into RunJob().
Workers reported status via an `osbuild.Result`, which only includes
osbuild output. Make it report OSBuildJobResult instead, which was meant
to be used for this purpose and is already used as the result type in
the jobqueue.
While at it, add any errors produced by targets into this struct, as
well as an overall success flag.
Note that this breaks older workers returning the result of an osbuild
job to a new composer. I think this is fine in this case, for two
reasons:
1. We don't support running different versions of the worker and
composer in the weldr API, and remote workers aren't widely used yet.
2. Both osbuild.Result and worker.OSBuildJobResult have a top-level
`Success` boolean. Thus, logs are lost in such cases, but the overall
status of the compose is not.
This function is almost the same as the koji uploader, except that it
calls `CGFailBuild` instead of `CGImport` at the end.
Don't exit early from RunJob() when the job failed. Instead, go through
all the uploaders anyway. All the others don't do anything when the job
fails, but now we have the chance to do the necessary `CGFailBuild` call
for koji.
This moves more logic from main() into RunJob(), so that we can support
different job kinds in the future.
Add "image_name" and "stream_optimized" fields to the osbuild job as
replacement for the local target options. The former signifies the name
of the uploaded artifact and whether an artifact should be uploaded at
all (only weldr API). The latter will be deprecated at some point, when
osbuild itself can make streamoptimized vmdk images.
This change separates what have always been two distinct concepts:
artifacts that are reported back to the composer node (in practice
always running on the same machine), and upload targets to clouds and
such. Separating them makes it easier to add job types that only allow
one upload target while keeping artifacts.
Keep the local target around, so that jobs that are scheduled can still
be run after an upgrade.
The server hasn't used common.ImageBuildState to mark a job as
successful or failed for a long time. Instead, it's using the job's
return argument for that. (Jobs don't have a high-level concept of
failing).
Drop the check in the server, and always send "FINISHED" from the client
for backwards compatibility.
osbuild reports failing builds in two ways: it sets the "success" field
in its output to `false` and it returns with a non-zero exit status. The
worker used both, returning an `OSBuildError` when osbuild return
non-zero, but also forwarding the resulting object with the "success"
field.
Change this to only use the "success" field and ignore the return value.
The latter is useful for people running osbuild in a terminal or script,
but is redundant for this use-case.
This makes error reporting more consistent: `RunOSBuild` only returns an
error when *running* osbuild failed, not when the build fails.
ComposeState is only used by the weldr API.
Drop the JSON marshaller and unmarshaller, because ComposeState is not
used in an JSON-exported field anymore.
This state is specific to weldr. Previous commits removed it from the
other APIs, because they use different values.
Move the conversion into the weldr API.
This is similar to the previous commit, which did this change in
package cloudapi.
Use constants instead of string literals for compose status, and derive
the status from worker.JobStatus directly, instead of via common.State.
Don't use common.State anymore, because it has different values from
what's defined in openapi.yml. It makes sense to have these strings
defined in the same package as the spec — ideally, the code generator
would make them for us.
While at it, add a "running" status.
Fix the api.sh test to use these new statuses. Thanks to Ondřej Budai
for an additional fix there.
We don't install any packages in test cases anymore, therefore we don't need
to install EPEL there.
A slightly different explanation:
osbuild-composer-tests depends on packages from EPEL on RHEL. Therefore, you
cannot run the test cases without EPEL installed. Therefore, there's no
point in installing EPEL there.