Serializing an interface does not work, so let us simply use the string
representation and treat the empty string as no error. This is
compatible with the current API in the success case, and fixes the
error case, which is currently broken.
Also extend the test matrix for the kojiapi to ensure that all the
different kinds of errors can be serialized correctly and lead to
the correct status being returned.
Fixes #1079 and #1080.
Soon, we want to begin tagging the jobs with the name of their submitter.
The simplest way to add a tag to a job is to put it into its type string.
However, as we don't know (and don't want to know) the submitters' names when
osbuild-composer is initialized, we need to be able to push arbitrary job
types into the jobqueue.
This commit therefore lifts the restriction that a jobqueue accepts only
a predefined set of job types. Now, jobqueue clients can push jobs with
arbitrary type names.
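For illustration, a minimal sketch of what this enables from a client's
point of view; the interface shape and names below are assumptions, not
the actual jobqueue API:

    package example

    import (
        "fmt"

        "github.com/google/uuid"
    )

    // queue is a stand-in for the real jobqueue interface (assumed shape).
    type queue interface {
        Enqueue(jobType string, args interface{}, dependencies []uuid.UUID) (uuid.UUID, error)
    }

    // submit pushes a job whose type embeds the submitter's name; no
    // predefined list of job types is consulted anymore.
    func submit(q queue, submitter string, args interface{}) (uuid.UUID, error) {
        return q.Enqueue(fmt.Sprintf("osbuild:%s", submitter), args, nil)
    }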
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Now that all interaction with the koji API happens in the workers
we can drop koji configuration from composer itself. This means
that composer no longer needs to be provisioned with kerberos
credentials, and does not need to know about which koji servers
the workers support.
This is no longer returned when creating a compose, as it is obtained
asynchronously.
The TaskID is still returned, and is always set to 0. This is not right,
and should either be fixed or dropped. The caller should know the TaskID
(if they have one), so this is redundant and currently unused.
The status of a job may depend on the status of its dependencies,
as we do not, for instance, repeat the failed state in each dependent
job.
Also return the list of dependencies so these can be queried too.
This removes the restriction of only having a single build per compose
and uses the new job types to schedule the broken-apart build.
A small change in behavior is introduced: the koji build ID is not
known when the call to `compose` returns, so it is always set to
`0`. In the future we should remove this from the API, and instead
rely on the status call to return this information, when it is
known.
The status route will be updated in follow-up commits to reflect the
changes introduced here.
Most of the worker API is now untyped, but keep Enqueue() typed to
ensure the job objects match the names in the queue. This means we
must add a version of Enqueue() for each job type we support.
We must special-case the treatment of architecture, to select the
correct remote worker for any job that requires a specific
architecture. For now this means any jobs that run osbuild.
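As a rough sketch (the argument types and the arch-in-the-type-string
convention below are assumptions; the real signatures may differ):

    package example

    import "github.com/google/uuid"

    // Hypothetical job argument types, one per job type we support.
    type KojiInitJob struct{ Server, Name, Version, Release string }
    type OSBuildKojiJob struct{ Manifest interface{} }

    // Server wraps an untyped enqueue function from the jobqueue.
    type Server struct {
        enqueue func(jobType string, args interface{}, deps []uuid.UUID) (uuid.UUID, error)
    }

    // One typed Enqueue per job type keeps job objects and queue names in sync.
    func (s *Server) EnqueueKojiInit(job *KojiInitJob) (uuid.UUID, error) {
        return s.enqueue("koji-init", job, nil)
    }

    // Architecture is special-cased: encode it so that only remote workers of
    // that architecture pick the job up (the encoding here is illustrative).
    func (s *Server) EnqueueOSBuildKoji(arch string, job *OSBuildKojiJob, deps []uuid.UUID) (uuid.UUID, error) {
        return s.enqueue("osbuild-koji:"+arch, job, deps)
    }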
The three new job types osbuild-koji, koji-init, and koji-finalize
allow the different tasks to be split apart, and in particular for
there to be several builds on different architectures as part of a
given compose.
In addition to the arguments passed when scheduling a job, a job now
also takes the results of its dependencies as additional arguments. We
call these dynamic arguments, for lack of a better term.
The immediate use-case for this is to allow koji jobs to be split up
as follows:
- koji-init: Creates a koji build, and returns us a token.
- osbuild-koji: one job per architecture, depending on koji-init
having succeeded. Builds the image, and uploads it to koji,
returning metadata about the image produced.
- koji-finalize: uses the token from koji-init and the metadata
from osbuild-koji to import the build into koji if it succeeded
or mark it as failed if it failed.
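Sketched in code (names, signatures, and the empty argument payloads
below are placeholders; the real scheduling code differs):

    package example

    import "github.com/google/uuid"

    type enqueuer interface {
        Enqueue(jobType string, args interface{}, deps []uuid.UUID) (uuid.UUID, error)
    }

    // scheduleKojiCompose wires up the three job types. The jobqueue later
    // hands the results of the dependencies (the init token, the per-arch
    // image metadata) to koji-finalize as its dynamic arguments.
    func scheduleKojiCompose(q enqueuer, arches []string) (uuid.UUID, error) {
        initID, err := q.Enqueue("koji-init", nil, nil)
        if err != nil {
            return uuid.Nil, err
        }
        finalizeDeps := []uuid.UUID{initID}
        for _, arch := range arches {
            buildID, err := q.Enqueue("osbuild-koji:"+arch, nil, []uuid.UUID{initID})
            if err != nil {
                return uuid.Nil, err
            }
            finalizeDeps = append(finalizeDeps, buildID)
        }
        return q.Enqueue("koji-finalize", nil, finalizeDeps)
    }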
Similarly to the recent changes to Dequeue(), let the caller unmarshal the
returned JSON. This allows us to pass the result on even though we are
not able to unmarshal it ourselves.
In follow-up patches, we will pass results of jobs to dependent jobs,
but the worker API does not know about the different job types, nor how
to unmarshal them.
Once a job has been enqueued, there is no way to query its dependencies.
This makes dequeue more symmetric to enqueue by returning the
dependencies that were passed to enqueue, allowing the caller to
query the dependencies and their results.
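One possible shape of the resulting interface, assuming Dequeue already
returns the job type and raw arguments; this is a sketch, not the literal
definition:

    package example

    import (
        "context"
        "encoding/json"

        "github.com/google/uuid"
    )

    // JobQueue sketches the symmetry: the dependency list given to Enqueue
    // comes back out of Dequeue, so callers can look up dependency results.
    type JobQueue interface {
        Enqueue(jobType string, args interface{}, dependencies []uuid.UUID) (uuid.UUID, error)
        Dequeue(ctx context.Context, jobTypes []string) (id uuid.UUID, dependencies []uuid.UUID, jobType string, args json.RawMessage, err error)
    }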
Signed-off-by: Tom Gundersen <teg@jklm.no>
While dependencies are purely internal, sorting and pruning them is a
reasonable optimization. However, we wish to expose them in follow-up
commits and then we want them to remain unchanged from the input.
Nothing in the internal logic seems to rely on the fact that the
dependencies were sorted.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Add an API route that returns logs for a specific compose.
For now, this contains the result of the job, in JSON. The idea is to
put more and more of this information into structured APIs. This is a
first step to make logs available at all.
Amend koji-compose.py to check that the route exists and contains as many
"image_logs" as images that were requested (currently always 1).
Based on a patch by Chloe Kaubisch <chloe.kaubisch@gmail.com>.
The worker server was heavily tied to OSBuildJob(Result). Untie it so
that it can deal with different job types in the future.
This necessitates a change in the jobqueue: Dequeue() now returns the
job type, as well as job arguments as json.RawMessage. This is so that
the server can wait on multiple job types with different argument
types.
The weldr, composer, and koji APIs continue to use only "osbuild" jobs.
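For illustration, this lets the server dispatch on the returned type and
only unmarshal once the type is known (the argument types below are
stand-ins):

    package example

    import (
        "encoding/json"
        "fmt"
    )

    // Stand-in argument types for two job kinds.
    type OSBuildJob struct{}
    type KojiInitJob struct{}

    // dispatch shows why Dequeue hands back the type plus json.RawMessage:
    // the arguments are only unmarshaled once the job type is known.
    func dispatch(jobType string, rawArgs json.RawMessage) (interface{}, error) {
        switch jobType {
        case "osbuild":
            var args OSBuildJob
            if err := json.Unmarshal(rawArgs, &args); err != nil {
                return nil, err
            }
            return args, nil
        case "koji-init":
            var args KojiInitJob
            if err := json.Unmarshal(rawArgs, &args); err != nil {
                return nil, err
            }
            return args, nil
        default:
            return nil, fmt.Errorf("unknown job type: %s", jobType)
        }
    }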
Move the fact that the worker is requesting jobs of type "osbuild" out
of the client library.
For one, require consumers to pass accepted job types to RequestJobs()
and allow querying for the job type with the new Type() function.
Also, make OSBuildArgs() and Update() generic, requiring callers to pass
an argument that matches the job type.
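Roughly, from the worker's perspective; only the names RequestJobs, Type,
and OSBuildArgs come from this change, the interfaces and signatures below
are guesses for illustration:

    package example

    import "context"

    // Job and Client are stand-ins for the worker client library.
    type Job interface {
        Type() string
        // OSBuildArgs unmarshals the job arguments into a value whose type
        // must match what Type() reports.
        OSBuildArgs(args interface{}) error
    }

    type Client interface {
        RequestJobs(ctx context.Context, acceptedTypes []string) (Job, error)
    }

    type osbuildArgs struct {
        Manifest interface{} `json:"manifest"`
    }

    // runOne shows the worker, not the library, choosing the accepted types.
    func runOne(ctx context.Context, c Client) error {
        j, err := c.RequestJobs(ctx, []string{"osbuild"})
        if err != nil {
            return err
        }
        if j.Type() == "osbuild" {
            var args osbuildArgs
            return j.OSBuildArgs(&args)
        }
        return nil
    }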
Workers reported status via an `osbuild.Result`, which only includes
osbuild output. Make it report OSBuildJobResult instead, which was meant
to be used for this purpose and is already used as the result type in
the jobqueue.
While at it, add any errors produced by targets into this struct, as
well as an overall success flag.
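The shape could be roughly as follows; only the overall success flag and
the target errors are taken from the description above, the field and
type names are assumptions:

    package example

    import "encoding/json"

    // OSBuildJobResult is what workers now report back, instead of a bare
    // osbuild result.
    type OSBuildJobResult struct {
        Success       bool            `json:"success"`        // overall success flag
        OSBuildOutput json.RawMessage `json:"osbuild_output"` // the raw osbuild result, as before
        TargetErrors  []string        `json:"target_errors"`  // errors produced by upload targets
    }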
Note that this breaks older workers returning the result of an osbuild
job to a new composer. I think this is fine in this case, for two
reasons:
1. We don't support running different versions of the worker and
composer in the weldr API, and remote workers aren't widely used yet.
2. Both osbuild.Result and worker.OSBuildJobResult have a top-level
`Success` boolean. Thus, logs are lost in such cases, but the overall
status of the compose is not.
Add "image_name" and "stream_optimized" fields to the osbuild job as
replacement for the local target options. The former signifies the name
of the uploaded artifact and whether an artifact should be uploaded at
all (only weldr API). The latter will be deprecated at some point, when
osbuild itself can make streamoptimized vmdk images.
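Illustrated as a sketch; only the two JSON field names come from this
change, the struct and the remaining field are placeholders:

    package example

    // OSBuildJob sketches the new fields. An empty ImageName means no
    // artifact is kept; StreamOptimized is temporary, until osbuild can
    // produce stream-optimized vmdk images itself.
    type OSBuildJob struct {
        Manifest        interface{} `json:"manifest"`
        ImageName       string      `json:"image_name,omitempty"`
        StreamOptimized bool        `json:"stream_optimized,omitempty"`
    }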
This change separates what have always been two distinct concepts:
artifacts that are reported back to the composer node (in practice
always running on the same machine), and upload targets to clouds and
such. Separating them makes it easier to add job types that only allow
one upload target while keeping artifacts.
Keep the local target around, so that jobs that are scheduled can still
be run after an upgrade.
The server hasn't used common.ImageBuildState to mark a job as
successful or failed for a long time. Instead, it's using the job's
return argument for that. (Jobs don't have a high-level concept of
failing).
Drop the check in the server, and always send "FINISHED" from the client
for backwards compatibility.
ComposeState is only used by the weldr API.
Drop the JSON marshaller and unmarshaller, because ComposeState is not
used in a JSON-exported field anymore.
This state is specific to weldr. Previous commits removed it from the
other APIs, because they use different values.
Move the conversion into the weldr API.
This is similar to the previous commit, which did this change in
package cloudapi.
Use constants instead of string literals for compose status, and derive
the status from worker.JobStatus directly, instead of via common.State.
Don't use common.State anymore, because it has different values from
what's defined in openapi.yml. It makes sense to have these strings
defined in the same package as the spec — ideally, the code generator
would make them for us.
While at it, add a "running" status.
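A sketch of the derivation; the status strings here are placeholders for
the values defined in openapi.yml, and the JobStatus fields are assumed:

    package example

    import "time"

    // JobStatus is a stand-in for worker.JobStatus.
    type JobStatus struct {
        Queued, Started, Finished time.Time
    }

    // composeStatus derives the API status string directly from the job
    // status, without going through common.State.
    func composeStatus(s JobStatus, success bool) string {
        switch {
        case s.Started.IsZero():
            return "pending"
        case s.Finished.IsZero():
            return "running"
        case success:
            return "success"
        default:
            return "failure"
        }
    }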
Fix the api.sh test to use these new statuses. Thanks to Ondřej Budai
for an additional fix there.
The Koji test in GitHub actions was always a bit of a quick and dirty
solution. I think it's much nicer to run it on Schutzbot.
Therefore, this commit moves the koji_test.go to a new osbuild-koji-tests
executable. This new test isn't run in the base test suite as one would
anticipate but inside the koji.sh test. This is needed because
osbuild-koji-tests requires a running koji instance. This might change
in the future but I think it works for now.
This was introduced in osbuild 23, so we also need to bump the dependency
in the spec file as well as the submodule.
The test is also modified and a typo in its name is fixed.
We have the same thing for AWS. The AWS target also specifies under what
name the image should be available in EC2.
As requested by Brew maintainers Tomáš Kopeček and Lubomír Sedlář.
From Koji Content Generator Metadata[1]:
"maven, win, or image: Legacy build type names which appear at this level
instead of inside typeinfo."
=> see, it's legacy
"typeinfo: A map whose entries are the names of the build types used for
this build, which are free form maps containing type-specific information
for this build."
=> struct{} is used for typeinfo.image because the docs say it should
contain "a free form map"; null apparently isn't an option.
[1]: https://docs.pagure.org/koji/content_generator_metadata/
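In Go this ends up looking roughly like the following; the surrounding
type names are illustrative, only typeinfo.image as an empty struct comes
from the text above:

    package example

    // ImageTypeInfo is the free form map for the image build type; an empty
    // struct marshals to "{}", which satisfies "a free form map" where null
    // would not.
    type ImageTypeInfo struct{}

    type TypeInfo struct {
        Image ImageTypeInfo `json:"image"`
    }

    // BuildExtra carries typeinfo instead of the legacy top-level type names.
    type BuildExtra struct {
        TypeInfo TypeInfo `json:"typeinfo"`
    }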
As suggested by the Brew maintainers Tomáš Kopeček and Lubomír Sedlář.
Attempt to clarify the structure of our tests. Each test case is now
encapsulated in a script in `test/cases`. Each of these scripts should
be runnable on a pristine machine and be independent of each other. It
is up to the test orchestrator to decide if they should be run
consecutively on one instance, or in parallel on separate instances. Each
script can execute several tests and call whatever helper binaries are
desired. However, each case should be assumed to always run as one unit.
We no longer release into F31, and the right specfile was not being
tested anyway.
This allows us to remove a workaround that updates the VMs during
deploy, and other fedora-31 specific hacks.
This removes the osbuild-composer-cloud package, binary, systemd units,
the (unused) test binary, and the (only-run-on-RHEL) test in aws.sh.
Instead, move the cloud API into the main package, using the same
socket as the koji API, osbuild-composer-api.socket. Expose it next to
the koji API on route `/api/composer/v1`.
This is a backwards-incompatible change, but only to the -cloud parts,
which have been marked as subject to change.
Fedora 33 ships the new API so let's do the switch now.
But... this would break older Fedoras because they only have the old API,
right?
We have the following options:
1) Ship an xmlrpc compat package to Fedora 33+. This would mean that we
delay the API switch till F32 EOL. This would be the most elegant
solution, yet it has two issues: a) We will surely not be able to deliver
the compat package before F33 Final Freeze. b) It's extra and annoying
work.
2) Downstream patch. No.
3) Use build constraints and have two versions of our code, one for each
API version.
I chose solution #3. It has an issue though:
The %gobuild macro already passes a -tags argument to go build. Therefore
the following line fails, because it's not possible to use -tags more than
once:
%gobuild -tags kolo_xmlrpc_oldapi ...
Therefore I had to tinker manually with the build constraints in the spec
file. This is pretty ugly, but I like that:
1) Go code is actually clean, no weird magic is happening there.
2) We can still ship our software to Fedora/RHEL as we used to
(no downstream patches)
3) All downstreams can use the upstream spec file directly.
Note that this doesn't affect RHEL in any way as it uses vendored libraries.
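For reference, the build-constraint mechanism itself is standard Go: a
file guarded like the sketch below (file and package names illustrative)
is only compiled when the tag is passed, e.g. via
go build -tags kolo_xmlrpc_oldapi.

    // xmlrpc_oldapi.go (name illustrative)

    // +build kolo_xmlrpc_oldapi

    package koji

    // ...code calling the old kolo/xmlrpc API lives here; a sibling file
    // guarded with "+build !kolo_xmlrpc_oldapi" holds the new-API version.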
Fedora 33 ships kolo/xmlrpc with a different API. This commit extracts the
affected code so we can use build flags in the future, allowing us to
support both API versions.
Add a simple unit test for the koji API.
This adds a Handler() method to the koji.Server struct, which made
writing the test easier. This is a direction we want to go in anyway in
the future.
Also install it as part of the tests subpackage. This is a helper tool,
not golang code, so it should not live in `internal`. We need access to
this from the integration tests, so install it onto the test system.
Signed-off-by: Tom Gundersen <teg@jklm.no>
No tests should be run directly from git, but should rather be installed
onto the test system using rpm and run from there. This moves towards
unifying our two types of test cases.
The new structure is now:
`test/cmd`: the executors, one for each test-case. This is installed
into `/usr/libexec/test/osbuild-composer`.
`test/data`: data and config used by the tests. This is installed into
`/usr/share/tests/osbuild-composer`.
`schutzbot`: configuration of the actual test run. In particular, this
is where the distros and repositories to test against are
configured.
This is very much still work-in-progress, and is only the first step
towards simplifying schutzbot. Apart from moving files around, this
should be a noop.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Fedora 33 images can now be built and test cases are added for the new
images. The fedora 33 qcow2 and vmdk images are based off of the
official images and their kickstarts found here:
https://pagure.io/fedora-kickstarts. The fedora 33 iot image is based
off of the config found here: https://pagure.io/fedora-iot/ostree.
The openstack, azure, and amazon image types have changes made to them
based off of the changes made to the qcow2. The changes between fedora
32 and fedora 33 are as follows:
Grub now loads its kernel command line options from
/etc/kernel/cmdline, /usr/lib/kernel/cmdline, and /proc/cmdline instead
of from grub env. This is addressed by adding kernelCmdlineStageOptions
to use osbuild's kernel-cmdline stage to set these options. Alongside
`ro biosdevname=0 net.ifnames=0`, we also set `no_timer_check
console=tty1 console=ttyS0,115200n8` per what is set in the official
qcow2. For azure and amazon, the kernelOptions are still set as they
were in fedora 32.
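A hedged sketch of what such stage options might look like on the Go
side; the struct and its JSON field names are assumptions about osbuild's
kernel-cmdline stage, not verified against it:

    package example

    // KernelCmdlineStageOptions sketches the options handed to osbuild's
    // kernel-cmdline stage (field names assumed).
    type KernelCmdlineStageOptions struct {
        RootFsUUID string `json:"root_fs_uuid,omitempty"`
        KernelOpts string `json:"kernel_opts,omitempty"`
    }

    // f33KernelCmdlineStageOptions sets the options used for the Fedora 33
    // qcow2, matching the official image's command line.
    func f33KernelCmdlineStageOptions(rootFsUUID string) *KernelCmdlineStageOptions {
        return &KernelCmdlineStageOptions{
            RootFsUUID: rootFsUUID,
            KernelOpts: "ro biosdevname=0 net.ifnames=0 no_timer_check console=tty1 console=ttyS0,115200n8",
        }
    }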
The timezone is now set to UTC if a user does not set a timezone in the
blueprint customizations. Also, the hostname is set to
localhost.localdomain if the hostname isn't set in the blueprint.
Finally, the following packages have been removed:
polkit
geolite2-city
geolite2-country
zram-generator-defaults
The cloud API will be moved to `/api/composer/v1` in the future.
Mention this in the `servers` section of the openapi.yml (relative URLs
are allowed) too, even though our generator does not consider it.