Commit graph

747 commits

Author SHA1 Message Date
Tomas Hozza
42d623b743 worker/osbuild: support Koji target
Add Koji as a separate upload target to the osbuild job implementation.
2022-06-10 14:48:18 +01:00
Tomas Hozza
bb54318432 worker/osbuild: add host OS and architecture to job result
It is generally useful to have this information in the
`OSBuildJobResult`. This information is currently part of the
`OSBuildKojiJobResult`. Instead of moving it to the new
`KojiTargetResultOptions`, let's move it to the `OSBuildJobResult`
structure and set it for all jobs.
2022-06-10 14:48:18 +01:00
Tomas Hozza
c7e5e3c9c2 Move GetRedHatRelease() and GetHostDistroName() to common package
The `distro` package is now used for distro definitions supported by
osbuild-composer, not for introspecting the Host system. Move
`GetRedHatRelease()` and `GetHostDistroName()` functions to the `common`
package.
2022-06-10 14:48:18 +01:00
Tomas Hozza
804d4210df worker: standardize logging in OCI target
The OCI target used `log` instead of `logWithId` for logging messages.
Modify the code to be consistent with other targets.
2022-06-10 14:48:18 +01:00
Achilleas Koutsou
8e0db1a4e3 gen-manifests: print message about leftover caches
Print the location of the cache directory to the user in case they want
to clean or inspect it.
2022-06-10 12:45:41 +01:00
Sanne Raymaekers
8d5cdfdd57 osbuild-worker: Correct cast of dnfjson error in depsolve job
This error is failing to parse correctly on the workers as a
dnfjson.Error. The old rpmmd.DNFError was returned by pointer, however
the internal/dnfjson package returns the Error by value.
2022-06-08 23:07:37 +02:00
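The pointer-versus-value mismatch described in 8d5cdfdd57 can be reproduced with a small sketch. The `Error` type below is a hypothetical stand-in for `dnfjson.Error` (its fields are assumptions); the point is only that a type assertion to `*Error` does not match an error returned by value.

```go
package main

import (
	"errors"
	"fmt"
)

// Error is a hypothetical stand-in for dnfjson.Error; the real fields may differ.
type Error struct {
	Kind   string
	Reason string
}

// Error uses a value receiver, so both Error and *Error satisfy the error interface.
func (e Error) Error() string {
	return fmt.Sprintf("%s: %s", e.Kind, e.Reason)
}

// depsolve simulates a call that returns the error by value.
func depsolve() error {
	return Error{Kind: "DepsolveError", Reason: "package not found"}
}

// classify reports which form of the error type matches err.
func classify(err error) string {
	// A pointer assertion does not match an error that was returned by value.
	if _, ok := err.(*Error); ok {
		return "pointer"
	}
	// Matching against the value type succeeds.
	var e Error
	if errors.As(err, &e) {
		return "value:" + e.Kind
	}
	return "unknown"
}

func main() {
	fmt.Println(classify(depsolve()))
}
```

Here `classify` falls through the pointer assertion and matches on the value type, which is the kind of correction the commit describes.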
Sanne Raymaekers
ff408aa68f osbuild-service-maintenance: Vacuum tables
Call vacuum analyze after each chunk of updates, and dump vacuum stats
at the beginning and end of the db cleanup.

Nulling results can increase size on disk, but calling vacuum analyze
will free up space within the table (not on disk) and reuse the space
for new inserts and updates.
2022-06-08 21:12:46 +02:00
Sanne Raymaekers
8bfc6c9961 dbjobqueue: Filter maintenance queries based on results
Jobs that already had their results nulled shouldn't be included in the
maintenance job.
2022-06-08 21:12:46 +02:00
Tomas Hozza
8635b7d2bb dbjobqueue-tests: fix issue introduced by PR #2618
2022-06-08 14:28:03 +02:00
Sanne Raymaekers
92ae2f7c83 osbuild-service-maintenance: Delete/update results in chunks
The results of the manifest jobs can be very big, and operating on
30-40k rows at once can starve or crash a smaller rds instance.
2022-06-06 17:49:46 +02:00
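A minimal sketch of the chunking idea from 92ae2f7c83, assuming the job IDs are already loaded into a slice. The `chunk` helper and the chunk size are illustrative, not the maintenance tool's actual code.

```go
package main

import "fmt"

// chunk splits ids into consecutive slices of at most size elements.
func chunk(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		out = append(out, ids[start:end])
	}
	return out
}

func main() {
	ids := []string{"a", "b", "c", "d", "e"}
	for i, c := range chunk(ids, 2) {
		// A real cleanup would issue one UPDATE or DELETE per chunk here,
		// so that no single statement touches tens of thousands of rows.
		fmt.Printf("chunk %d: %v\n", i, c)
	}
}
```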
Sanne Raymaekers
9b119fa4cf osbuild-service-maintenance: Delete results from select jobs
Instead of deleting records, delete the results from the manifest and
depsolve jobs. This redacts sensitive data which the manifest can
contain, and this conserves space.
2022-06-03 14:38:53 +02:00
Sanne Raymaekers
eeb2238b12 osbuild-service-maintenance: Split out db cleanup
2022-06-03 14:38:53 +02:00
Ygal Blum
feb357e538 Support Generic S3 upload in Composer API
Use case
--------
If the Endpoint is not set and the Region is - upload to AWS S3
If both the Endpoint and the Region are set - upload to Generic S3 via Weldr API
If neither the Endpoint nor the Region is set - upload to Generic S3 via Composer API (use configuration)

jobimpl-osbuild
---------------
Add configuration fields for Generic S3 upload
Support S3 upload requests coming from Weldr or Composer API to either AWS or Generic S3
Weldr API for Generic S3 requires that all connection parameters but the credentials be passed in the API call
Composer API for Generic S3 requires that all connection parameters are taken from the configuration
Adjust to the consolidation in Target and UploadOptions

Target and UploadOptions
------------------------
Add the fields that were specific to the Generic S3 structures to the AWS S3 one
Remove the structures for Generic S3 and always use the AWS S3 ones

Worker Main
-----------
Add Endpoint, Region, Bucket, CABundle and SkipSSLVerification to the configuration structure
Pass the values to the Server

Weldr API
---------
Keep the generic.s3 provider name to maintain the API, but unmarshal into awsS3UploadSettings

tests - api.sh
--------------
Allow the caller to specify either AWS or Generic S3 upload targets for specific image types
Implement the pieces required for testing upload to a Generic S3 service
In some cases generalize the AWS S3 functions for reuse

GitLab CI
---------
Add test case for api.sh tests with edge-commit and generic S3
2022-06-02 16:12:53 +03:00
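The Endpoint/Region rules in the "Use case" section of feb357e538 can be written as a small decision function. This is a sketch only: the function name and the returned labels are hypothetical, and the fourth combination (Endpoint set, Region unset) is not covered by the commit message, so it is reported as an error here rather than guessed.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical labels for the three outcomes listed in the commit message.
const (
	TargetAWS             = "aws-s3"
	TargetGenericWeldr    = "generic-s3-weldr"
	TargetGenericComposer = "generic-s3-composer"
)

// selectTarget applies the three documented rules to an Endpoint/Region pair.
func selectTarget(endpoint, region string) (string, error) {
	switch {
	case endpoint == "" && region != "":
		return TargetAWS, nil
	case endpoint != "" && region != "":
		return TargetGenericWeldr, nil
	case endpoint == "" && region == "":
		// Connection parameters come from the worker configuration.
		return TargetGenericComposer, nil
	default:
		// Endpoint without Region is not described in the commit message.
		return "", errors.New("endpoint set without region: behaviour not specified")
	}
}

func main() {
	cases := [][2]string{
		{"", "us-east-1"},
		{"https://s3.example.com", "us-east-1"},
		{"", ""},
	}
	for _, c := range cases {
		target, err := selectTarget(c[0], c[1])
		fmt.Println(target, err)
	}
}
```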
Achilleas Koutsou
9fda1ff55f dnfjson: cache cleanup
Added CleanCache() method to the solver that deletes all the caches if
the total size grows above a certain (configurable) limit
(default: 500 MiB).

The function is called externally to handle errors (usually log or
ignore completely) and to avoid calling multiple times for multiple
depsolves of a single request.

The cleanup is extremely simple and is meant as a placeholder for more
sophisticated cache management.  The goal is to simply avoid ballooning
cache sizes that might cause issues for users or our own services.
2022-06-01 11:36:52 +01:00
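The size-triggered cleanup described in 9fda1ff55f might look roughly like the sketch below. It assumes the cache is a plain directory tree; `dirSize`, `cleanCache` and `runDemo` are illustrative names and not the solver's actual `CleanCache()` implementation.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirSize returns the total size of all regular files under root.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			info, err := d.Info()
			if err != nil {
				return err
			}
			total += info.Size()
		}
		return nil
	})
	return total, err
}

// cleanCache deletes everything inside root when its size exceeds maxBytes.
// It reports whether a cleanup happened.
func cleanCache(root string, maxBytes int64) (bool, error) {
	size, err := dirSize(root)
	if err != nil {
		return false, err
	}
	if size <= maxBytes {
		return false, nil
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		return false, err
	}
	for _, e := range entries {
		if err := os.RemoveAll(filepath.Join(root, e.Name())); err != nil {
			return false, err
		}
	}
	return true, nil
}

// runDemo writes a 1 KiB file into a temporary cache directory, runs the
// cleanup with the given limit, and returns what is left afterwards.
func runDemo(maxBytes int64) (cleaned bool, remaining int, err error) {
	root, err := os.MkdirTemp("", "cache-sketch-")
	if err != nil {
		return false, 0, err
	}
	defer os.RemoveAll(root)
	if err := os.WriteFile(filepath.Join(root, "repo.solv"), make([]byte, 1024), 0o600); err != nil {
		return false, 0, err
	}
	cleaned, err = cleanCache(root, maxBytes)
	if err != nil {
		return false, 0, err
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		return false, 0, err
	}
	return cleaned, len(entries), nil
}

func main() {
	fmt.Println(runDemo(512))  // limit below the cache size
	fmt.Println(runDemo(2048)) // limit above the cache size
}
```

As in the commit, the caller decides what to do with the returned error (log it or ignore it) and when to invoke the cleanup.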
Achilleas Koutsou
8b4607c94f gen-manifests: do not return workerName from makeManifestJob
The value doesn't represent the worker name, just the top-level cache
directory for a job.  It's useful for separating caches and making the
generation faster, but it's not necessary to return from the function.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
d470a3cb3f gen-manifests: inline finish() into wait()
wait() just did finish() and returned errors; no need for two
functions.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
150d490ba8 gen-manifests: separate worker queue code
Add the worker queue code to a separate file for better organisation
and readability.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
c1f7003e12 genall: move to cmd/ and rename to gen-manifests
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
7a70a5e69b dnfjson: drop repo checksums
The repository checksums in the response from dnf-json aren't used
anywhere.  Since we're making changes to dnf-json and depsolving, now is
a good opportunity to drop them completely.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
c092783a70 simplify package set chain handling
Move package set chain collation to the distro package and add
repositories to the package sets while returning the package sets from
their source, i.e., the ImageType.PackageSets() method.

This also removes the concept of "base repositories".  There are no
longer repositories that are added implicitly to all package sets but
instead each package set needs to specify *all* the repositories it will
be depsolved against.

This paves the way for the requirement we have for building RHEL 7
images with a RHEL 8 build root.  The build root package set has to be
depsolved against RHEL 8 repositories without any "base repos" included.
This is now possible since package sets and repositories are explicitly
associated from the start and there is no implicit global repository
set.

The change requires adding a list of PackageSet names to the core
rpmmd.RepoConfig.  In the cloud API, repositories that are limited to
specific package sets already contain the correct package set names and
these are now copied to the internal RepoConfig when converting types in
genRepoConfig().
The user-specified repositories are only associated with the payload
package sets like before.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
6fbddeea35 composer+worker: make dnf-json path externally configurable
The default value is the installation path.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
86536f11e7 rpmmd: add Repositories list to PackageSet struct
Attach the repository configurations that are specific to a package set
directly on the PackageSet object.  This simplifies the Depsolve()
signature and avoids requiring a `nil` when no additional repositories
are required.  More importantly, it makes associating repositories to
package sets explicit, no longer relying on matching array indices or
map keys.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
1c4d8f9988 dnfjson: use repo config hash as repo ID
Defined a Hash() method on rpmmd.RepoConfig that calculates a SHA-256 ID
for a repository based on its configuration.  Identical configurations
should produce the same ID.  The Name and ImageTypeTags of a repository
aren't taken into account, since these attributes do not affect a
repository's functional configuration.

This ID lets us change the way we handle repository configurations in a
few places:
- Preparing the depsolve job arguments is simpler since we have
  predictable IDs for the repository configurations.  We don't need to
  rely on the index of a RepoConfig in a list to identify or access it,
  which prevented us from building a list of all repository
  configurations, since we needed them to be placed in the list in a
  certain order.
- Associating packages from the depsolve result with the repository
  configuration (in depsToRPMMD) no longer relies on an ID string
  converted from and back to an integer index.  Repositories define
  their own IDs.
- Tests are a bit messier now but the changes simplify the main code, so
  it's an acceptable trade-off.
    - Fixtures need to change based on the repository configuration for
      the test.
    - We need to calculate the ID for the repository configuration for
      the temporary file server URL.
2022-06-01 11:36:52 +01:00
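A configuration-derived repository ID, as described in 1c4d8f9988, can be sketched as follows. The `RepoConfig` fields and the serialisation format here are assumptions made for illustration; the real `rpmmd.RepoConfig.Hash()` may hash different fields in a different layout. The only property the sketch demonstrates is that the name is left out of the hash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// RepoConfig is a hypothetical, reduced version of rpmmd.RepoConfig.
type RepoConfig struct {
	Name      string // not part of the hash
	BaseURL   string
	Metalink  string
	GPGKey    string
	CheckGPG  bool
	IgnoreSSL bool
}

// Hash returns a hex-encoded SHA-256 digest of the hashed fields.
// %q quotes each string so field boundaries stay unambiguous.
func (r RepoConfig) Hash() string {
	payload := fmt.Sprintf("%q|%q|%q|%t|%t",
		r.BaseURL, r.Metalink, r.GPGKey, r.CheckGPG, r.IgnoreSSL)
	return fmt.Sprintf("%x", sha256.Sum256([]byte(payload)))
}

func main() {
	a := RepoConfig{Name: "baseos", BaseURL: "https://example.com/baseos", CheckGPG: true}
	b := RepoConfig{Name: "renamed", BaseURL: "https://example.com/baseos", CheckGPG: true}
	c := RepoConfig{Name: "baseos", BaseURL: "https://example.com/appstream", CheckGPG: true}

	fmt.Println(a.Hash() == b.Hash()) // same configuration, different name
	fmt.Println(a.Hash() == c.Hash()) // different base URL
}
```

Because the ID depends only on the configuration, callers can look repositories up by ID instead of by their position in a list.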
Achilleas Koutsou
61d7c465af dnfjson: remove single Depsolve function and command
Remove the single Depsolve function from the dnfjson package and the
depsolve command from the dnf-json tool.  The new ChainDepsolve
functions and chain-depsolve command can handle single depsolves in the
same way so there's no need to keep (and have to maintain) two versions
of very similar code.

The ChainDepsolve function (in Go) and chain-depsolve command (in
Python) have been renamed to plain Depsolve and depsolve respectively,
since they are now general purpose depsolve functions.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
d09176893b cmd/osbuild-pipeline: find dnf-json binary
Search for (and set) the path for dnf-json by checking a few known
locations:
- ./dnf-json: for situations when the tool is run from the source tree.
  This is checked first to prioritise local changes.
- /usr/libexec/osbuild-composer/dnf-json: the default install location
  of the script when osbuild-composer is installed.
- /usr/lib/osbuild-composer/dnf-json: the default install location of
  the script for distributions which don't use /usr/libexec.

The function panics with an informative error message when it fails to
find dnf-json.
2022-06-01 11:36:52 +01:00
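The lookup order listed in d09176893b can be sketched as a first-match search over candidate paths. The function name is hypothetical, and unlike the commit (which panics when nothing is found) this sketch returns an error and leaves the reaction to the caller.

```go
package main

import (
	"fmt"
	"os"
)

// findDnfJSON returns the first candidate path that exists and is not a directory.
func findDnfJSON(candidates []string) (string, error) {
	for _, p := range candidates {
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, nil
		}
	}
	return "", fmt.Errorf("dnf-json not found in any of: %v", candidates)
}

func main() {
	// Order matters: the local copy is checked first to prioritise local changes.
	candidates := []string{
		"./dnf-json",
		"/usr/libexec/osbuild-composer/dnf-json",
		"/usr/lib/osbuild-composer/dnf-json",
	}
	path, err := findDnfJSON(candidates)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using", path)
}
```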
Achilleas Koutsou
8a23a77c5b worker: add new error type for RepoError
dnf-json now returns a new error kind: RepoError
Add it to the list of known error types and handle it in the worker.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
177ea1b08f Replace all rpmmd.Depsolve() calls with dnfjson
All calls to rpmmd.Depsolve() are now replaced with the equivalent call
to solver.Depsolve() (or dnfjson.Depsolve() for one-off calls).

Attached an unconfigured dnfjson.BaseSolver to all APIs and server
configurations where rpmmd.RPMMD used to be.  This BaseSolver instance
loads the repository credentials from the system and carries the cache
directory, much like the RPMMD field used to do.  The BaseSolver is used
to create an initialised (configured) solver with the platform variables
(module platform ID, release ver, and arch) before running a Depsolve()
or FetchMetadata() using the NewWithConfig() method.

The FillDependencies() call in the modulesInfoHandler() of the weldr API
has been replaced by a direct call to the Depsolve() function.  This
rpmmd function was only used here.  Replacing the rpmmd.Depsolve() call
in rpmmd.FillDependencies() with dnfjson.Depsolve() would have created
an import cycle.  The FillDependencies() function could have been moved
to dnfjson, but since it's only used in one place, moving the one-line
function body into the caller is ok.

For testing:

The mock-dnf-json is compiled to a temporary directory during test
initialisation and used for each Depsolve() or FetchMetadata() call.

The weldr API tests now use the mock dnfjson.  Each rpmmd_mock.Fixture
now also has a dnfjson_mock.ResponseGenerator.

All API calls in the tests use the proper functions from dnfjson and
only the dnf-json script is mocked.  Because of this, some of the
expected results in responses_test had to be changed to match correct
behaviour:
- The "builds" array of each package in the result of a module or
  project list is now sorted by version number (ascending) because we
  sort the package list in the result of dnfjson by NVR.
- 'check_gpg: true' is added to the expected response of the depsolve
  test.  The repository configs in the test weldr API specify 'CheckGPG:
  True', but the mock responses returned it as false, so the expected
  result didn't need to include it.  Since now we're using the actual
  dnfjson code to convert the mock response to the internal structure,
  the repository settings are correctly used to set the flag to true for
  each package associated with that repository.
- The word "occurred" was mistyped as "occured" in rpmmd and is now
  fixed in dnfjson.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
e9a7a50496 Add dnfjson mock data package and cmd
The cases are directly copied (or lightly adapted) from
rpmmd_mock/fixtures.

The purpose of the mocks/dnfjson package is to create files with data
for testing the dnfjson package without the need to call the dnf-json
script.  Each public function creates a file with test responses in the
same format as the dnf-json script's responses (either valid results or
errors).  The dnfjson.Solver can be configured to call the new
./cmd/mock-dnf-json program with the test data file as an argument and a
valid dnf-json request for input.  The mock-dnf-json checks the input
request for unknown fields before responding with the contents of the
file.

Each test case file contains two responses, one for each command
supported by dnf-json: "depsolve" and "dump".  mock-dnf-json responds
with the appropriate data based on the command in the request.  This is
necessary for tests that require both commands in the same call, e.g.,
tests of the weldr API's moduleInfoHandler() which fetches a package
list and then needs to depsolve a subset of those packages.

There are also cases when we want one of the two responses to be an
error.  The mock-dnf-json program will return with an error code if it
can successfully unmarshal the intended response into the dnfjson.Error
type.
2022-06-01 11:36:52 +01:00
Sanne Raymaekers
71c78991a6 cloudapi: Drop bucket from composer config
This value is set in the worker config. In future it might also be
passed through the api to upload into target accounts, but it should
never be set in composer.
2022-06-01 12:03:12 +02:00
Ygal Blum
8407c97d96 Upload to HTTPS S3 - Support self signed certificate
API
---
Allow the user to pass the CA public certificate or skip the verification

AWSCloud
--------
Restore the old version of newAwsFromCreds for access to AWS
Create a new method newAwsFromCredsWithEndpoint for Generic S3 which sets the endpoint and optionally overrides the CA Bundle or skips the SSL certificate verification

jobimpl-osbuild
---------------
Update with the new parameters

osbuild-upload-generic-s3
-------------------------
Add ca-bundle and skip-ssl-verification flags

tests
-----
Split the tests into http, https with certificate and https skip certificate check
Create a new base test for S3 over HTTPS for secure and insecure
Move the generic S3 test to tools to reuse for secure and insecure connections
All S3 tests now use the aws cli tool
Update the libvirt test to be able to download over HTTPS
Update the RPM spec

Kill container with sudo
2022-05-26 13:46:00 +03:00
Sanne Raymaekers
7529382890 go.mod: Update openshift-online/ocm-sdk-go
This requires golang-jwt/jwt/v4.
2022-05-19 22:18:42 +02:00
Tomas Hozza
1017aee438 cloud-cleaner: clean up GCE instances in all regions and zones
The `api.sh` test case uses a random GCE zone from a random GCE
region whose name starts with the `GCP_REGION` CI environment variable.
Since the region actually used is not known to the `cloud-cleaner`, it has
to iterate over all potential GCE regions and their zones. We cannot
simply filter a list of instances by VM instance name, because any
`instances` API call requires a zone name to be provided.

Add a new internal `cloud/gcp` package method to list existing GCE
regions based on a provided filter.
2022-05-17 12:18:12 +02:00
Sanne Raymaekers
d1911f6484 osbuild-service-maintenance: Move type conversion to config
2022-05-14 16:21:21 +02:00
Sanne Raymaekers
8219dcdee8 osbuild-service-maintenance: Explicitly enable maintenance parts
Stage and production share the GCP account. To avoid trying to delete
each GCP image twice, the maintenance script needs the ability to
selectively disable certain parts based on the config.
2022-05-14 16:21:21 +02:00
Tomas Hozza
04f612d758 Manifests test: ensure that every image type has test coverage
Extend the manifests test to ensure that each and every image type of
each architecture and each distribution is covered by at least one
image test case.

Since we now have the ability to generate image test cases for more
complicated image types, which consist only of the manifest, we should
have test coverage for each and every image type.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-05-13 21:01:37 +03:00
Tomas Hozza
40c095a850 Tools: fetch image test case generation matrix from composer
Add a simple tool `osbuild-composer-image-definitions` which dumps the
matrix of all distributions, architectures and image type names
supported by composer as JSON to stdout.

Default to fetching the image test case generation matrix directly from
composer. This eliminates the need to update a JSON source file with
this information every time a new distro or image type is added to
composer.

Delete the previously used JSON source file with the image test case
generation matrix.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-05-13 21:01:37 +03:00
Thomas Lavocat
c00aae0a4a worker: provide the region for the ASG
Previously, the autoscaling group discovery failed with the error:
Error getting the Autoscaling instances: MissingRegion MissingRegion:
could not find region configuration
2022-05-13 11:52:30 +02:00
Diaa Sami
33711d7d51 composer: add support for logrus syslog hook
Which will be used on crc in the log forwarding setup
NeededBy: COMPOSER-1285
2022-05-12 11:02:27 +02:00
Jordi Gil
616258ee25 distro: housekeeping with cpu arch and arch.Name()
2022-05-10 19:53:41 +02:00
Thomas Lavocat
ab7fe6558a worker: protect the instance from upgrading
Previously, the instance was vulnerable to an OTA update while processing a
request. Because there is no way of retriggering a job in Composer, it
is better to avoid this situation.
The way we are doing it is by setting the `protected` flag on the
instance when a job is being processed. This way the AWS scheduler
hopefully does not shut down the machine at the wrong time.

Main caveats of this solution:
* Starvation: If a worker keeps accepting new jobs, then it might not be
  updated.
* Inconsistency: There exists a window between job acceptance and
  protection in which the worker can be shut down before it has had time to
  protect itself.
2022-05-10 11:45:29 +02:00
Jordi Gil
f14dc2fb63 distro/fedora: refactor based on RHEL 9.0 code
2022-05-09 12:25:21 +02:00
Tomas Hozza
0bf67dfad5 Stop setting the StreamOptimized option in Weldr and Cloud APIs
The VMDK image is already produced as stream-optimized. Therefore stop
setting the `StreamOptimized` option in the `OSBuildJob` structure in both the
Weldr and Cloud APIs.

Keep the handling of the option in the worker for backward compatibility,
in case an older instance of the Composer server is used which does not
produce VMDK manifests as stream-optimized. In that case, the worker
needs to convert the image.
2022-05-04 16:22:29 +02:00
Ondřej Budai
6fce34a5ea worker: add proxy support to composer and oauth calls
In the internal deployment, we want to talk with composer over an HTTP/HTTPS
proxy. This commit adds a new composer.proxy field to the worker config that
causes the worker to connect to composer and the oauth server using
a specified proxy.

NB: The proxy is not supported when connecting to composer via unix sockets.

For testing this, I added a small HTTP proxy implementation. Please don't
use it in production; it's just good enough for tests.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
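Routing an HTTP client through a fixed proxy, as 6fce34a5ea describes for the worker, can be sketched with the standard library alone. The function names and example URLs are hypothetical; how the worker actually wires the `composer.proxy` value into its clients is not shown in the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxyClient returns an HTTP client that sends every request through proxy.
// An empty proxy string yields a default client.
func newProxyClient(proxy string) (*http.Client, error) {
	if proxy == "" {
		return &http.Client{}, nil
	}
	u, err := url.Parse(proxy)
	if err != nil {
		return nil, fmt.Errorf("invalid proxy URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported proxy scheme %q", u.Scheme)
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(u)},
	}, nil
}

// proxyFor reports which proxy the client would use for target,
// or an empty string when no explicit proxy is configured.
func proxyFor(c *http.Client, target string) (string, error) {
	t, ok := c.Transport.(*http.Transport)
	if !ok || t.Proxy == nil {
		return "", nil
	}
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := t.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	client, err := newProxyClient("http://proxy.example:3128")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(proxyFor(client, "https://composer.example/api"))
}
```

The same client can be reused for both the composer API and the OAuth token endpoint, so both go through the proxy.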
Ondřej Budai
6e9901fe6b worker: exit(2) when address is missing from argv
The address is always required, so not passing one is a clear error. Return
exit code 2, which Go itself returns when bad arguments are passed in.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
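The exit-code convention from 6e9901fe6b can be illustrated with Go's `flag` package, which itself exits with status 2 on a parse failure when a flag set uses `flag.ExitOnError`. The `exitCode` helper below is hypothetical and returns the code instead of exiting, so both cases can be shown in one run.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// exitCode returns the status a worker-like command would exit with:
// 2 for unparsable flags or a missing address, 0 otherwise.
func exitCode(args []string) int {
	fs := flag.NewFlagSet("worker-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	if err := fs.Parse(args); err != nil {
		return 2
	}
	if fs.NArg() < 1 {
		// The address is a required positional argument.
		return 2
	}
	return 0
}

func main() {
	// A real main would call os.Exit(exitCode(os.Args[1:])).
	fmt.Println(exitCode(nil))
	fmt.Println(exitCode([]string{"localhost:8700"}))
}
```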
Ondřej Budai
6e92263c23 worker: rename config field in Go to reflect its toml name
For the sake of consistency, not a functional change.

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2022-05-03 06:19:31 +01:00
Tomas Hozza
e819e08098 worker: extend the depsolve job to use DepsolvePackageSets()
Extend the `DepsolveJob` worker job argument to contain package sets
chains and use `DepsolvePackageSets()` for depsolving.
2022-04-28 14:42:49 +02:00
Tomas Hozza
ac8b0b211c osbuild-store-dump: use DepsolvePackageSets instead of Depsolve
2022-04-28 14:42:49 +02:00
Tomas Hozza
906e88ea8c osbuild-pipeline: use DepsolvePackageSets instead of Depsolve
2022-04-28 14:42:49 +02:00
Tomas Hozza
ef4db9edda rpmmd: introduce DepsolvePackageSets() to the RPMMD interface
Add a convenience method `DepsolvePackageSets()` to the `RPMMD`
interface. The method is expected to depsolve all provided package sets
in a chain or separately, based on the provided arguments, and return
depsolved PackageSpecs sets.

The intention is to have a single implementation of how package sets are
depsolved and then use it from all places in composer (API and tools
implementations).

Adjust necessary mock implementations and add a unit test testing the
new interface method implementation.
2022-04-28 14:42:49 +02:00
Tomas Hozza
ee285e5e8a Weldr: support GCP upload target
Add support for importing the GCE image into GCP using Weldr API. The
credentials to be used can be specified in the upload settings and will
be then used by the worker to authenticate with GCP.

The GCP target credentials are passed to Weldr API as base64 encoded
content of the GCP credentials JSON file. The reason is that the JSON
file contains many values and its format could change in the future.
This way, the Weldr API does not rely on the credentials file content
format in any way.

Add a new test case for the GCP upload via Weldr and run it in CI.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-04-14 19:07:31 +01:00
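Passing the credentials file as an opaque base64 string, as ee285e5e8a describes, keeps the API independent of the file's internal format. A minimal sketch, with hypothetical helper names and a made-up credentials payload:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeCredentials turns the raw credentials file content into the
// base64 string a client would put into the upload settings.
func encodeCredentials(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

// decodeCredentials recovers the raw file content on the receiving side.
// The content is deliberately not parsed, so its format is free to change.
func decodeCredentials(encoded string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("credentials are not valid base64: %w", err)
	}
	return raw, nil
}

func main() {
	// Made-up content standing in for a GCP credentials JSON file.
	file := []byte(`{"type":"service_account"}`)

	encoded := encodeCredentials(file)
	decoded, err := decodeCredentials(encoded)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(decoded))
}
```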