Define supported job type names as constants and use them in all places,
instead of string literals.
There are multiple benefits to this approach. Using constants removes
the room for typos in the string literals. One can use IDE
autocompletion for job types. Using constants also makes it easier to
find all references to a job type and thus all places that handle it.
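A minimal sketch of the idea (the constant names here are illustrative,
not necessarily the ones used in the codebase):

    // Define each supported job type name exactly once.
    const (
        JobTypeOSBuild      = "osbuild"
        JobTypeOSBuildKoji  = "osbuild-koji"
        JobTypeKojiInit     = "koji-init"
        JobTypeKojiFinalize = "koji-finalize"
    )

Callers then reference JobTypeOSBuild instead of the literal "osbuild",
so a typo becomes a compile error rather than a silently unknown job
type.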
Added CleanCache() method to the solver that deletes all the caches if
the total size grows above a certain (configurable) limit
(default: 500 MiB).
The function is called externally so that the caller can handle errors
(usually by logging or ignoring them) and to avoid calling it multiple
times for multiple depsolves of a single request.
The cleanup is extremely simple and is meant as a placeholder for more
sophisticated cache management. The goal is to simply avoid ballooning
cache sizes that might cause issues for users or our own services.
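A sketch of the intended call pattern (the logging here is just one
option):

    // Run once per request, after all depsolves have finished.
    if err := solver.CleanCache(); err != nil {
        // Cleanup is best-effort; log the error and carry on.
        log.Printf("error during rpm cache cleanup: %v", err)
    }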
The repository checksums in the response from dnf-json aren't used
anywhere. Since we're making changes to dnf-json and depsolving, now is
a good opportunity to drop them completely.
Move package set chain collation to the distro package and add
repositories to the package sets while returning the package sets from
their source, i.e., the ImageType.PackageSets() method.
This also removes the concept of "base repositories". There are no
longer repositories that are added implicitly to all package sets but
instead each package set needs to specify *all* the repositories it will
be depsolved against.
This paves the way for the requirement we have for building RHEL 7
images with a RHEL 8 build root. The build root package set has to be
depsolved against RHEL 8 repositories without any "base repos" included.
This is now possible since package sets and repositories are explicitly
associated from the start and there is no implicit global repository
set.
The change requires adding a list of PackageSet names to the core
rpmmd.RepoConfig. In the cloud API, repositories that are limited to
specific package sets already contain the correct package set names and
these are now copied to the internal RepoConfig when converting types in
genRepoConfig().
The user-specified repositories are only associated with the payload
package sets like before.
Attach the repository configurations that are specific to a package set
directly on the PackageSet object. This simplifies the Depsolve()
signature and avoids requiring a `nil` when no additional repositories
are required. More importantly, it makes associating repositories to
package sets explicit, no longer relying on matching array indices or
map keys.
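A rough sketch of the resulting shape (field names assumed for
illustration):

    // Each package set carries the repositories it is depsolved
    // against; nothing is added implicitly.
    type PackageSet struct {
        Include      []string
        Exclude      []string
        Repositories []RepoConfig
    }

With the repositories attached, Depsolve() only needs the package sets
themselves.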
Remove the single Depsolve function from the dnfjson package and the
depsolve command from the dnf-json tool. The new ChainDepsolve function
and chain-depsolve command can handle single depsolves in the same way,
so there's no need to keep (and maintain) two versions of very similar
code.
The ChainDepsolve function (in Go) and chain-depsolve command (in
Python) have been renamed to plain Depsolve and depsolve respectively,
since they are now general purpose depsolve functions.
All calls to rpmmd.Depsolve() are now replaced with the equivalent call
to solver.Depsolve() (or dnfjson.Depsolve() for one-off calls).
Attached an unconfigured dnfjson.BaseSolver to all APIs and server
configurations where rpmmd.RPMMD used to be. This BaseSolver instance
loads the repository credentials from the system and carries the cache
directory, much like the RPMMD field used to do. The BaseSolver's
NewWithConfig() method is used to create an initialised (configured)
solver with the platform variables (module platform ID, release ver,
and arch) before running a Depsolve() or FetchMetadata().
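A sketch of that flow (the path and platform values are placeholders):

    // The BaseSolver carries the cache directory and system repo
    // credentials; NewWithConfig derives a fully configured solver.
    base := dnfjson.NewBaseSolver("/var/cache/osbuild-composer/rpmmd")
    solver := base.NewWithConfig("platform:el8", "8", "x86_64")
    pkgs, err := solver.Depsolve(packageSets)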
The FillDependencies() call in the modulesInfoHandler() of the weldr API
has been replaced by a direct call to the Depsolve() function. This
rpmmd function was only used here. Replacing the rpmmd.Depsolve() call
in rpmmd.FillDependencies() with dnfjson.Depsolve() would have created
an import cycle. The FillDependencies() function could have been moved
to dnfjson, but since it's only used in one place, moving the one-line
function body into the caller is ok.
For testing:
The mock-dnf-json is compiled to a temporary directory during test
initialisation and used for each Depsolve() or FetchMetadata() call.
The weldr API tests now use the mock dnfjson. Each rpmmd_mock.Fixture
now also has a dnfjson_mock.ResponseGenerator.
All API calls in the tests use the proper functions from dnfjson and
only the dnf-json script is mocked. Because of this, some of the
expected results in responses_test had to be changed to match correct
behaviour:
- The "builds" array of each package in the result of a module or
project list is now sorted by version number (ascending) because we
sort the package list in the result of dnfjson by NVR.
- 'check_gpg: true' is added to the expected response of the depsolve
test. The repository configs in the test weldr API specify 'CheckGPG:
true', but the mock responses returned it as false, so the expected
result didn't need to include it. Since we now use the actual dnfjson
code to convert the mock response to the internal structure, the
repository settings are correctly used to set the flag to true for
each package associated with that repository.
- The word "occurred" was mistyped as "occured" in rpmmd and is now
fixed in dnfjson.
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.
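For example (a minimal illustration):

    func TestDepsolve(t *testing.T) {
        // The directory is removed automatically when the test and
        // all of its subtests complete; no manual cleanup is needed.
        cacheDir := t.TempDir()
        // ... compile mock-dnf-json and depsolve against cacheDir
        _ = cacheDir
    }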
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Channels are a concept similar to job types. Callers must specify a channel
name when queueing a new job. A list of channels is also specified when
dequeueing a job. The dequeued job's channel will always be one of the
specified channels. Job types are, of course, also respected: the dequeued
job will always be of one of the specified types.
For now, all calls to jobqueue have been changed so that all queue
operations use an empty channel name and all dequeue operations use a
list containing only the empty channel.
Thus, this is a non-functional change.
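A sketch of the extended calls (hypothetical signatures, for
illustration only):

    // Every enqueue names a channel; every dequeue lists the channels
    // (and job types) it is willing to accept.
    id, err := q.Enqueue("osbuild", args, deps, "")               // "" for now
    job, err := q.Dequeue(ctx, []string{"osbuild"}, []string{""})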
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Replace Job() and JobStatus() with typesafe versions, and introduce JobType()
for the rare instances where we don't know the type up front.
Additionally, catch a few more error cases:
- if OSBuildResult is nil, then we failed to invoke osbuild
- make sure the same JobResult handling is done for osbuild-koji as for osbuild
When we compute the overall status of a koji compose, the individual
build jobs are checked. Currently, a job is considered a failure if
a build job has output (`OSBuildOutput`) and the output's `Success`
field is `false`. But `OSBuildOutput` will be `nil` when osbuild
crashed or refused the manifest input. Therefore the job status is now
also a failure if `OSBuildOutput` is `nil`: if osbuild ran and the run
was successful, we must have a non-`nil` `OSBuildOutput`.
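The per-job check then becomes (a sketch using the field names above):

    // A build job failed if osbuild never produced output (it crashed
    // or rejected the manifest) or if it ran and reported failure.
    failed := result.OSBuildOutput == nil || !result.OSBuildOutput.Success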
Since the workers will use structured error messages
going forward, it is necessary to maintain backwards
compatibility for these errors in composer. Tests have
been added to the various APIs to ensure that each API
checks for both kinds of errors, old and new.
Implement the structured errors as defined by the worker client.
Every error for each of the job types now returns a structured
error with a reason and a specific error code. This will make
it possible to differentiate between 4xx errors and 5xx errors.
This commit refactors the way errors are implemented in the workers,
but maintains backwards compatibility in composer by checking for
both kinds of errors.
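A sketch of what such a structured error might look like (the names are
illustrative):

    // A structured job error pairs a stable, machine-readable code
    // with a human-readable reason, letting composer map worker
    // failures to 4xx or 5xx responses.
    type JobError struct {
        ID     int    `json:"id"`     // specific error code
        Reason string `json:"reason"` // human-readable explanation
    }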
We have actually been unmarshalling into the wrong datatype for a year;
by fixing this, we should get much more logging in Brew.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Pipeline names are added to each job before adding to the queue. When a
job is finished, the names are copied to the Result object as well. This
is done for both OSBuild and Koji jobs.
The pipeline names in the result are primarily used to separate package
lists into build and payload/image packages in two cases:
1. Koji builds: for reporting the build root and image package lists to
Koji (in Koji finalize).
2. Cloud API (v1 and v2): for reporting the payload packages in the
metadata request.
The pipeline names are also used to print the system log output in the
order in which pipelines are executed. However, this ordering isn't
used yet when printing the OSBuild Result (osbuild2.Result.Write());
there we still rely on sorting by pipeline name
(see https://github.com/osbuild/osbuild-composer/pull/1330).
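A sketch of the split described in points 1 and 2 above (the exact
shape is assumed):

    // Pipeline names stored with the job and its result let us split
    // the depsolved packages into build-root and payload lists.
    type PipelineNames struct {
        Build   []string // pipelines that construct the build root
        Payload []string // pipelines that produce the image itself
    }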
The main changes are:
- Kind, Href, Id fields for every object returned
- Attach operationIds to each request, return it for errors
- Errors are predefined and queryable
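For illustration, every returned object now carries a self-describing
header along these lines (the exact struct is assumed):

    type ObjectReference struct {
        Kind string `json:"kind"` // type of the object
        Id   string `json:"id"`   // unique identifier
        Href string `json:"href"` // where the object can be fetched
    }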
The problem: osbuild-composer used to have rather incomplete logic for
selecting client certificates and keys while fetching data from
repositories that use the "subscription model". In this scenario, every
repo requires the user to use a client-side TLS certificate. The problem
is that every repo can use its own CA and require a different pair of
a certificate and a key. This case wasn't handled at all in composer.
Furthermore, osbuild-composer can use remote workers, which complicates
things even more.
Assumptions: The problem outlined above is hard to solve in the general
case, but Red Hat Subscription Manager places certain limitations on how
subscriptions might be used. For example, a subscription must be tied to
a host system, so there is no way to use such a repository in osbuild-composer
without it being available on the host system as well.
Also, if a user wishes to use a certain repository in osbuild-composer, it
must be available on both hosts: the composer and the worker. It will come
with a different pair of a client certificate and a key, but otherwise its
configuration remains the same.
The solution: Expect all the subscriptions to be registered in the
/etc/yum.repos.d/redhat.repo file. Read the mapping of URLs to certificates
and keys from there and use it. Don't change the manifest format and let
osbuild guess the appropriate subscription to use.
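A minimal sketch of reading that mapping (a toy INI scan under the
assumptions above; the real parsing may differ):

    import (
        "bufio"
        "os"
        "strings"
    )

    type repoKeys struct {
        cert, key string
    }

    // readSubscriptions maps each baseurl in redhat.repo to the client
    // certificate and key configured for that repository.
    func readSubscriptions(path string) (map[string]repoKeys, error) {
        f, err := os.Open(path)
        if err != nil {
            return nil, err
        }
        defer f.Close()

        value := func(line string) string {
            if i := strings.IndexByte(line, '='); i >= 0 {
                return strings.TrimSpace(line[i+1:])
            }
            return ""
        }

        subs := map[string]repoKeys{}
        var url string
        var keys repoKeys
        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            line := strings.TrimSpace(scanner.Text())
            switch {
            case strings.HasPrefix(line, "["): // a new repo section starts
                if url != "" {
                    subs[url] = keys
                }
                url, keys = "", repoKeys{}
            case strings.HasPrefix(line, "baseurl"):
                url = value(line)
            case strings.HasPrefix(line, "sslclientcert"):
                keys.cert = value(line)
            case strings.HasPrefix(line, "sslclientkey"):
                keys.key = value(line)
            }
        }
        if url != "" {
            subs[url] = keys
        }
        return subs, scanner.Err()
    }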
Koji image request handling now reads the exports defined by each image
type. All APIs now support reading the exports defined by each image
type. The worker still falls back to "assembler" in case the call comes
from an older version of composer.
fedoratest was yet another dummy distribution used by unit tests. After
the rework of test_distro, there is no reason not to use test_distro as
the only distro implementation for testing purposes.
Remove fedoratest distro and replace it with test_distro in all affected
tests.
Signed-off-by: Tomas Hozza <thozza@redhat.com>
My goal is to add a method to distroregistry to return a Registry with
all supported distributions. This way, all supported distributions
would be defined in only one place.
To achieve this, the Registry must live outside the distro package
because the distro implementation depends on it and this would create
a circular dependency unsupported by Go.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
This replaces Packages() and BuildPackages() by returning a map of
package sets, the semantics of which is up to the distro to define.
They are meant to be depsolved and the result returned back as a
map to Manifest(), with the same keys.
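A sketch of the new shape (the keys and signature are illustrative):

    // PackageSets replaces Packages() and BuildPackages(); the result
    // of depsolving each set is passed back under the same keys.
    func (t *imageType) PackageSets(bp blueprint.Blueprint) map[string]rpmmd.PackageSet {
        return map[string]rpmmd.PackageSet{
            "build":    {Include: t.buildPackages},
            "packages": {Include: t.payloadPackages(bp), Exclude: t.excludedPackages},
        }
    }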
No functional change.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Send requests to the compose/{id}, compose/{id}/logs, and
compose/{id}/manifests endpoints using job IDs of non-finalize-type
jobs to test the type verification.
The type verification introduced in the previous commit is now also used
when retrieving the job status (GET /compose/{id}) and the logs (GET
/compose/{id}/logs).
In these cases, job retrieval needs to be performed twice:
1. First the job parameters are retrieved (Job()) to check the type.
2. Then the job result is retrieved (JobStatus()) for the status or
logs.
This makes it unlikely (essentially impossible) that the retrieval will
fail with "not found" on the second retrieval (JobStatus()), but it's
still a good sanity check for the system.
Verifies the Koji job types when retrieving Init and Build jobs as well.
When a job's arguments are retrieved (for the /manifests API endpoint),
the incoming ID should correspond to a Finalize Job. The new
worker.Job() method helps us verify the type and produce an error if the
wrong type is found.
Similarly, the dependencies of a Finalize Job should be in order (Init
Job first followed by Build Jobs). The types are validated while
iterating the dependency list.
Added convenience functions that check the retrieved job type and return
the initialised struct or an error if the ID is not found or does not
match the type.
Currently the getInitJob() function isn't used but it will be useful
later.
The JobArgs() function is replaced with a more general Job() function
that returns all the parameters used to originally define a job during
Enqueue(). This new function enables access to the type of a job in the
queue, which wasn't available until now (except when dequeueing).
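Roughly (the exact signature is assumed):

    // Job returns everything recorded at Enqueue() time; the job type
    // was previously only visible when dequeueing.
    Job(id uuid.UUID) (jobType string, args json.RawMessage, dependencies []uuid.UUID, err error)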
1. Test retrieving the job arguments from the worker using an
OSBuildJob.
Very basic test with an empty manifest.
2. Test retrieving job manifests through the Koji API. Uses the existing
TestCompose test. The returned manifests are empty for all cases.
Add and implement an API endpoint that returns the manifests for a given
osbuild-koji job. The route returns an array of manifests, one for each
image in the request.
Retrieves the arguments of each Build Job to extract the Manifest.
When rpmmd's Depsolve function is called, we need to pass in the image
type's excluded packages. These excluded packages are retrieved together
with the packages we include from each image type.
I thought rand in Go is auto-seeded but I was wrong, see [1].
This commit adds seed initialization.
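The initialization is along these lines (a minimal sketch):

    import (
        "math/rand"
        "time"
    )

    func init() {
        // math/rand produces the same sequence on every run unless
        // explicitly seeded.
        rand.Seed(time.Now().UnixNano())
    }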
[1]: https://golang.org/pkg/math/rand/#Seed
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Imagine this situation: You have a RHEL system booted from an image produced
by osbuild-composer. On this system, you want to use osbuild-composer to
create another image of RHEL.
However, there's currently something funny with partitions:
All RHEL images built by osbuild-composer contain a root xfs partition. The
interesting bit is that they all share the same xfs partition UUID. This might
sound like a good thing for reproducibility but it has a quirk.
The issue appears when osbuild runs the qemu assembler: it needs to mount all
partitions of the future image to copy the OS tree into it.
Imagine that osbuild-composer is running on a system booted from an image
produced by osbuild-composer. This means that its root xfs partition has this
uuid:
efe8afea-c0a8-45dc-8e6e-499279f6fa5d
When osbuild-composer builds an image on this system, it runs osbuild that
runs the qemu assembler at some point. As I said previously, it will mount
all partitions of the future image. That means that it will also try to
mount the root xfs partition with this uuid:
efe8afea-c0a8-45dc-8e6e-499279f6fa5d
Do you remember this one? Yeah, it's the same one as before. However, the xfs
kernel driver doesn't like that. It contains a global table[1] of all xfs
partitions that forbids mounting two xfs partitions with the same uuid.
I mean... uuids are meant to be unique, right?
This commit changes the way we build RHEL 8.4 images: Each one now has a
unique uuid. It's now literally a unique universally unique identifier. haha
[1]: a349e4c659/fs/xfs/xfs_mount.c (L51)
Previously, the compose status returned failure as soon as possible.
koji-osbuild considers the job as done when its status == failure and proceeds
with uploading the logs to koji and marking the job as failed. However, not
all osbuild-composer jobs might be done at this point, so the logs might be
incomplete, making debugging hard.
This commit changes the behaviour: Now, the compose status is pending until
ALL jobs belonging to it are finished.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Previously, we had no clue which errors were caught by echo's default
error handler. Thus, in the case of an error, we were basically blind. Let's
log all errors so we can investigate them later.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
The previous code smelled a bit (e.g. the Server.server field), so I decided
to rewrite it in the style of the much nicer koji server.
Not a functional change.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Serializing an interface does not work, let us simply use the string
representation and treat the empty string as no error. This is
compatible with the current API in the success case, and fixes the
error case, which is currently broken.
Also extend the test matrix for the kojiapi to ensure that all the
different kinds of errors can be serialized correctly and lead to
the correct status being returned.
Fixes #1079 and #1080.
Soon, we want to begin tagging jobs with the name of their submitter.
The simplest way to add a tag to a job is to put it into its type string.
However, as we don't know (and don't want to know) the submitters' names when
osbuild-composer is initialized, we need to be able to push arbitrary job
types into the jobqueue.
This commit therefore lifts the restriction that a jobqueue accepts only
a predefined set of job types. Now, jobqueue clients can push jobs of
arbitrary names.
Signed-off-by: Ondřej Budai <ondrej@budai.cz>
Now that all interaction with the koji API happens in the workers,
we can drop the koji configuration from composer itself. This means
that composer no longer needs to be provisioned with kerberos
credentials, and does not need to know which koji servers
the workers support.
This is no longer returned when creating a compose, as it is obtained
asynchronously.
The TaskID is still returned, and is always set to 0. This is not right,
and should either be fixed or dropped. The caller should know the TaskID
(if they have one), so this is redundant and currently unused.