Apply an RWMutex lock to a cache directory.
A global map of cache locks is maintained, keyed by the absolute path to
the cache directory, so multiple cache instances can coexist and share
locks if they use the same cache root.
Currently, the lock only prevents multiple concurrent `shrink()`
operations when multiple cache instances share the same root.
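For illustration, a minimal sketch of such a shared lock map in Go; the
identifiers (lockMu, cacheLocks, cacheLock) are illustrative, not the
package's actual names:

    package dnfjson

    import "sync"

    var (
        lockMu     sync.Mutex                   // guards the map itself
        cacheLocks = map[string]*sync.RWMutex{} // keyed by absolute cache dir path
    )

    // cacheLock returns the shared lock for a cache directory, creating it
    // on first use so instances with the same root share a single lock.
    func cacheLock(dir string) *sync.RWMutex {
        lockMu.Lock()
        defer lockMu.Unlock()
        if l, ok := cacheLocks[dir]; ok {
            return l
        }
        l := new(sync.RWMutex)
        cacheLocks[dir] = l
        return l
    }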
- Update timestamps for cache elements whenever a repository is used.
- Call the new `shrink()` function instead of the old `clean()`.
- Remove the old `clean()` function.
If repoRecency and repoElements somehow become inconsistent (an ID in
repoRecency does not exist in repoElements), ignore the ID and continue.
The repoID is still removed from the repoRecency list at the end and
still counted in nDeleted.
Functions for managing the repository cache based on a maximum
desirable size for the entire dnf-json cache directory.
While none of the functions are currently used, the workflow should
be as follows:
- Update the timestamp of a repository whenever it's used in a
transaction by calling `touchRepo()` with the repository ID and the
current time.
- Update the internal cache information when desired by calling
`updateInfo()`. This should be called, for example, after multiple
depsolve transactions are run for a single build request.
- Shrink the cache to below the configured maxSize by calling
`shrink()`.
The most important work happens in `updateInfo()`. It collects all the
information it needs from the on-disk cache directories and organises it
in a way that makes it convenient for the `shrink()` function to run
efficiently. It stores three important pieces of information:
1. repoElements: a map that links a repository ID with all the
information about a repository's cache:
- the top-level elements (files and directories) for the cache
- size of the repository cache (total of all elements)
- most recent mtime from all the elements, which, if the
`touchRepo()` call is used consistently, should reflect the most
recent time the repository was used
2. repoRecency: a list of repository IDs sorted by mtime (oldest first)
3. size: the total size of the cache (total of all repository caches)
This way, when `shrink()` is called, the paths associated with the
least-recently-used repositories can be easily deleted by iterating on
repoRecency, obtaining the repository info from the map, deleting every
path in the repoElements array, and subtracting the repository's size
from the total. The `shrink()` function stops when the new size is
below the maxSize (or when all repositories have been deleted).
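A rough sketch of this bookkeeping and of the `shrink()` loop in Go;
all type and field names are illustrative:

    package dnfjson

    import (
        "os"
        "time"
    )

    type repoInfo struct {
        paths []string  // top-level elements (files and directories)
        size  uint64    // total size of all elements
        mtime time.Time // most recent mtime across the elements
    }

    type cacheInfo struct {
        repoElements map[string]repoInfo // keyed by repository ID
        repoRecency  []string            // repo IDs sorted by mtime, oldest first
        size         uint64              // total size of all repository caches
    }

    // shrink deletes least-recently-used repository caches until the total
    // size drops below maxSize or everything has been deleted.
    func (c *cacheInfo) shrink(maxSize uint64) (nDeleted int, err error) {
        for _, id := range c.repoRecency {
            if c.size <= maxSize {
                break
            }
            info, ok := c.repoElements[id]
            if !ok {
                // inconsistent state: skip the ID, but still drop it from
                // repoRecency below and count it in nDeleted
                nDeleted++
                continue
            }
            for _, p := range info.paths {
                if err := os.RemoveAll(p); err != nil {
                    return nDeleted, err
                }
            }
            c.size -= info.size
            delete(c.repoElements, id)
            nDeleted++
        }
        c.repoRecency = c.repoRecency[nDeleted:]
        return nDeleted, nil
    }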
Move cache handling data and code to a substruct of the BaseSolver.
This is all internal to the dnfjson package.
Paves the way for cache management with a persistent state.
The `amdgpu` module causes issues on certain GPU-enabled instances
on Azure and it must not be loaded by default.
Modules are sorted alphabetically.
Signed-off-by: Major Hayden <major@redhat.com>
Co-Authored-By: Christian Kellner <christian@redhat.com>
This error fails to parse correctly on the workers as a dnfjson.Error.
The old rpmmd.DNFError was returned by pointer; however, the
internal/dnfjson package returns the Error by value.
Call vacuum analyze after each chunk of updates, and dump vacuum stats
at the beginning and end of the db cleanup.
Nulling results can increase size on disk, but calling vacuum analyze
will free up space within the table (not on disk) and reuse the space
for new inserts and updates.
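A hedged sketch of the idea using database/sql; the table name, column,
and chunk size are hypothetical:

    package main

    import "database/sql"

    // redactInChunks nulls results in fixed-size chunks and runs VACUUM
    // ANALYZE after each chunk so dead tuples inside the table can be
    // reused by later inserts and updates.
    func redactInChunks(db *sql.DB, chunk int) error {
        for {
            res, err := db.Exec(
                `UPDATE jobs SET result = NULL WHERE id IN
                 (SELECT id FROM jobs WHERE result IS NOT NULL LIMIT $1)`,
                chunk)
            if err != nil {
                return err
            }
            n, err := res.RowsAffected()
            if err != nil {
                return err
            }
            if n == 0 {
                return nil // nothing left to redact
            }
            // frees space within the table (not on disk) and updates
            // planner statistics
            if _, err := db.Exec(`VACUUM ANALYZE jobs`); err != nil {
                return err
            }
        }
    }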
Include a tenant label for all prometheus metrics. Modify the
jobstatus function in the worker accordingly to return the channel
so it can be passed to prometheus.
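A sketch, using client_golang, of what a tenant-labelled metric looks
like; the metric name and helper are illustrative:

    package metrics

    import "github.com/prometheus/client_golang/prometheus"

    var pendingJobs = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "pending_jobs",
            Help: "Number of pending jobs.",
        },
        []string{"type", "tenant"},
    )

    // RecordPending bumps the gauge for a job type and tenant; the tenant
    // value comes from the channel returned by the worker's jobstatus
    // lookup.
    func RecordPending(jobType, channel string) {
        pendingJobs.WithLabelValues(jobType, channel).Inc()
    }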
In commit 5c1530e we disabled `skx_edac` and `intel_cstate`, but
after further consultation with Prarit Bhargava it was agreed that
for RHEL 9 we should indeed allow them.
Instead of deleting records, delete the results from the manifest and
depsolve jobs. This redacts the sensitive data that the manifest can
contain, and it conserves space.
Use case
--------
If the Endpoint is not set and the Region is - upload to AWS S3
If both the Endpoint and Region are set - upload to the Generic S3 via the Weldr API
If neither the Endpoint nor the Region is set - upload to the Generic S3 via the Composer API (use configuration)
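For illustration, the routing rules above as a sketch in Go; the
returned kind names are made up:

    package worker

    // s3TargetKind mirrors the use cases above; the fourth combination
    // (Endpoint set, Region unset) is not covered by them.
    func s3TargetKind(endpoint, region string) string {
        switch {
        case endpoint == "" && region != "":
            return "aws-s3" // plain AWS S3 upload
        case endpoint != "" && region != "":
            return "generic-s3-weldr" // Generic S3 via the Weldr API
        case endpoint == "" && region == "":
            return "generic-s3-composer" // Generic S3 via the Composer API (config)
        default:
            return "" // Endpoint without Region: unspecified
        }
    }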
jobimpl-osbuild
---------------
Add configuration fields for Generic S3 upload
Support S3 upload requests coming from Weldr or Composer API to either AWS or Generic S3
Weldr API for Generic S3 requires that all connection parameters but the credentials be passed in the API call
Composer API for Generic S3 requires that all connection parameters are taken from the configuration
Adjust to the consolidation in Target and UploadOptions
Target and UploadOptions
------------------------
Add the fields that were specific to the Generic S3 structures to the AWS S3 one
Remove the structures for Generic S3 and always use the AWS S3 ones
Worker Main
-----------
Add Endpoint, Region, Bucket, CABundle and SkipSSLVerification to the configuration structure
Pass the values to the Server
Weldr API
---------
Keep the generic.s3 provider name to maintain the API, but unmarshal into awsS3UploadSettings
tests - api.sh
--------------
Allow the caller to specify either AWS or Generic S3 upload targets for specific image types
Implement the pieces required for testing upload to a Generic S3 service
In some cases generalize the AWS S3 functions for reuse
GitLab CI
---------
Add test case for api.sh tests with edge-commit and generic S3
Added a CleanCache() method to the solver that deletes all the caches if
the total size grows above a certain (configurable) limit
(default: 500 MiB).
The function is called externally so the caller can handle errors
(usually by logging or ignoring them entirely) and to avoid calling it
multiple times for multiple depsolves of a single request.
The cleanup is extremely simple and is meant as a placeholder for more
sophisticated cache management. The goal is to simply avoid ballooning
cache sizes that might cause issues for users or our own services.
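A sketch of the intended call site; the constructor name and arguments
are assumptions:

    package main

    import (
        "log"

        "github.com/osbuild/osbuild-composer/internal/dnfjson"
    )

    // runDepsolves shows the intended pattern: one solver per request, all
    // depsolves run first, then a single CleanCache call whose error is
    // only logged.
    func runDepsolves(cacheDir string) {
        solver := dnfjson.NewBaseSolver(cacheDir) // assumed constructor
        // ... run every depsolve job for the request with this solver ...
        if err := solver.CleanCache(); err != nil {
            log.Printf("rpm cache cleanup failed: %v", err)
        }
    }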
The value doesn't represent the worker name, just the top-level cache
directory for a job. It's useful for separating caches and making the
generation faster, but it's not necessary to return it from the function.
This test was removed because package sets in chains are no longer
visible in the map returned from ImageType.PackageSets().
Bringing back the test now to ensure that:
1. All package set names defined in the keys returned from the
PackageSets() map match the keys returned from the
PackageSetsChains() map.
2. All package sets defined in the package set chains are defined for
the image type. This is tested by the PackageSets() function
itself, which should never panic.
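A hedged sketch of check 1 as a test helper; the map shapes are
assumptions about what PackageSets() and PackageSetsChains() return:

    package distro_test

    import "testing"

    // checkChainKeys verifies that every chain name has a matching package
    // set; check 2 is implicit in PackageSets() itself not panicking when
    // it resolves the sets named by each chain.
    func checkChainKeys(t *testing.T, sets map[string]struct{}, chains map[string][]string) {
        for name, chain := range chains {
            if _, ok := sets[name]; !ok {
                t.Errorf("chain %q has no matching key in PackageSets()", name)
            }
            for _, setName := range chain {
                if _, ok := sets[setName]; !ok {
                    t.Errorf("chain %q references undefined set %q", name, setName)
                }
            }
        }
    }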
They were originally added as convenience functions for single-case
calls, but they're not that useful and they have a million function
arguments, which isn't pretty.
The repository checksums in the response from dnf-json aren't used
anywhere. Since we're making changes to dnf-json and depsolving, now is
a good opportunity to drop them completely.
- Standalone executable for generating all test manifests in parallel.
- Command line flags:
- Output directory (-output)
- Number of concurrent workers (-workers)
- Collects list of image types from the distro list and reads:
- tools/test-case-generators/repos.json for repositories
- tools/test-case-generators/format-request-map.json for
customizations
- Prints progress (finished/total)
- Collects errors and failures and prints them after all jobs are
finished
Test the conversion of the new and old DepsolveJob given the custom
marshaller.
The deserialised old format is not exactly the same as it would have
been before, but it is functionally equivalent, with the added benefit
of supporting depsolve jobs where we don't want base repositories to be
used by all depsolves.
Add custom marshaller for DepsolveJob that serialises the struct into a
format compatible with both the new and old formats. The format on the
wire is a superset of both the new and old format and can be
deserialised into either while retaining all information.
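A minimal sketch of the approach, assuming illustrative struct shapes;
the real job types have more fields, and factoring shared repos back
into the old global list is elided here:

    package worker

    import "encoding/json"

    type RepoConfig struct {
        BaseURL string `json:"baseurl"`
    }

    // New form: every package set carries its own repositories.
    type PackageSet struct {
        Include []string
        Repos   []RepoConfig
    }

    type DepsolveJob struct {
        PackageSets map[string]PackageSet
    }

    // wireJob is a superset of the old format (plain package sets plus a
    // global repo list) and the new one (per-set repositories), so either
    // side can deserialise it without losing information.
    type wireJob struct {
        Repos            []RepoConfig            `json:"repos,omitempty"`
        PackageSets      map[string][]string     `json:"package_sets,omitempty"`
        PackageSetsRepos map[string][]RepoConfig `json:"package_sets_repos,omitempty"`
    }

    func (j DepsolveJob) MarshalJSON() ([]byte, error) {
        w := wireJob{
            PackageSets:      make(map[string][]string, len(j.PackageSets)),
            PackageSetsRepos: make(map[string][]RepoConfig, len(j.PackageSets)),
        }
        for name, ps := range j.PackageSets {
            w.PackageSets[name] = ps.Include
            w.PackageSetsRepos[name] = ps.Repos
        }
        return json.Marshal(w)
    }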
Move package set chain collation to the distro package and add
repositories to the package sets while returning the package sets from
their source, i.e., the ImageType.PackageSets() method.
This also removes the concept of "base repositories". There are no
longer repositories that are added implicitly to all package sets but
instead each package set needs to specify *all* the repositories it will
be depsolved against.
This paves the way for the requirement we have for building RHEL 7
images with a RHEL 8 build root. The build root package set has to be
depsolved against RHEL 8 repositories without any "base repos" included.
This is now possible since package sets and repositories are explicitly
associated from the start and there is no implicit global repository
set.
The change requires adding a list of PackageSet names to the core
rpmmd.RepoConfig. In the cloud API, repositories that are limited to
specific package sets already contain the correct package set names and
these are now copied to the internal RepoConfig when converting types in
genRepoConfig().
The user-specified repositories are only associated with the payload
package sets like before.
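A sketch of the shape of the change to rpmmd.RepoConfig; the JSON tag
and the empty-list semantics are assumptions, and the real struct has
many more fields:

    package rpmmd

    type RepoConfig struct {
        Name    string `json:"name"`
        BaseURL string `json:"baseurl,omitempty"`

        // PackageSets names the package sets this repository is used for
        // when depsolving; here an empty list is taken to mean the repo
        // applies to all of an image type's package sets.
        PackageSets []string `json:"package_sets,omitempty"`
    }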
Run unit tests in GitHub workflows in a Fedora container to enable the
dnf-json tests. Run the tests alone with the `force-dnf` flag to make
sure the tests pass and are not skipped.
Install Go using dnf instead of the GH action. The action seems to
cause issues with the $PATH.
Use the registry.fedoraproject.org container for both unit tests and
pylint on dnf-json.
Requires some reordering of the steps in each workflow and the addition
of `git-core` as a dependency.
Using Fedora 35 instead of latest because of changes in the go build
tool: the new -buildvcs flag causes issues on GitHub Actions.
On systems where `dnf` and the Python module aren't available, skip the
unit tests that call into the `dnf-json` script.
A test flag, `-force-dnf`, is added to avoid this check and run the tests
unconditionally. This is useful for cases where the sniff check might
fail for the wrong reasons or, more importantly, for cases where we want
to be sure the tests are run and consider a missing `dnf` module to be an
error state (e.g., in CI).
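A hedged sketch of the gating logic; the flag name is from this change,
while the sniff check shown is an illustrative stand-in:

    package dnfjson

    import (
        "flag"
        "os/exec"
        "testing"
    )

    var forceDNF = flag.Bool("force-dnf", false,
        "run dnf-json tests even if dnf looks unavailable")

    func TestDepsolver(t *testing.T) {
        if !*forceDNF {
            // skip when the dnf Python module can't be imported; with
            // -force-dnf a missing module becomes a failure instead
            if err := exec.Command("python3", "-c", "import dnf").Run(); err != nil {
                t.Skip("dnf not available; use -force-dnf to run anyway")
            }
        }
        // ... exercise the dnf-json script ...
    }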