Commit graph

65 commits

Author SHA1 Message Date
Ondrej Ezr
61e6f75281 dnf-json: Allow passing module_hotfixes
Allow passing module_hotfixes as repo option,
it will allow disabling of modularity filtering per repository.
2023-12-20 09:02:06 +01:00
Brian C. Lane
356f261bcd dnf-json: disable pylint warnings
The _dnfrepo is full of branches, turn off warning.

The keyfile doesn't use 'with' because the file needs to remain
available when the function exits. Cleanup of persistdir will clean up
the temporary file used for the key.
2023-02-01 10:27:58 +01:00
Brian C. Lane
c2577eaea8 Add gpgkey and check_repogpg support to dnf-json
This allows verification of repository metadata signatures.

The gpgkeys field is a list of key urls, or the gpg key itself, starting
with '-----BEGIN PGP PUBLIC KEY BLOCK-----'. These will be written to a
temporary file, and that file:// url will be passed to dnf.
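
A minimal sketch of the key handling described above, not the actual dnf-json code. The function name gpgkey_urls and the ".asc" suffix are assumptions made for illustration; only the PGP header check and the file:// conversion come from the message.

```python
import pathlib
import tempfile

PGP_HEADER = "-----BEGIN PGP PUBLIC KEY BLOCK-----"


def gpgkey_urls(gpgkeys, persistdir):
    """Return a list of URLs suitable for a repo's gpgkey option.

    Entries that start with the PGP header are treated as inline keys:
    they are written to a temporary file under persistdir and replaced
    by that file's file:// URL. Anything else is assumed to be a URL
    already and is passed through unchanged.
    """
    urls = []
    for key in gpgkeys:
        if key.startswith(PGP_HEADER):
            # delete=False: the file has to outlive this function;
            # removing persistdir later cleans it up.
            keyfile = tempfile.NamedTemporaryFile(
                mode="w", dir=persistdir, suffix=".asc", delete=False
            )
            keyfile.write(key)
            keyfile.close()
            urls.append(pathlib.Path(keyfile.name).resolve().as_uri())
        else:
            urls.append(key)
    return urls
```

The resulting list could then be assigned to the repository's gpgkey option before metadata is loaded.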
2023-02-01 10:27:58 +01:00
Brian C. Lane
2c3cb56cb3 dnf-json: Add search command
Use the DNF query API
(https://dnf.readthedocs.io/en/latest/api_queries.html) to quickly
return results matching a glob pattern. Multiple package glob results
are combined into a single response.

This adds a search dict to the arguments. 'packages' is a list of package
names or globs to search for.
An optional 'latest' boolean will return only the latest NEVRA instead
of all matching builds in the metadata.

eg.

    "search": {
        "latest": false,
        "packages": ["tmux", "vim*", "*ssh*"]
    },
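
The semantics of the search arguments can be sketched without dnf. The block below is an illustration only: it uses fnmatch on a hypothetical in-memory list of (name, version) pairs instead of the DNF query API, and compares versions as plain tuples, which is not how rpm compares NEVRAs.

```python
from fnmatch import fnmatchcase


def search(packages_available, search_args):
    """Match package globs against (name, version) pairs.

    packages_available is a hypothetical list of (name, version_tuple)
    pairs standing in for repository metadata.
    """
    latest = search_args.get("latest", False)
    results = []
    for pattern in search_args["packages"]:
        matches = [p for p in packages_available if fnmatchcase(p[0], pattern)]
        if latest:
            # Keep only the highest version seen for each package name.
            newest = {}
            for name, version in matches:
                if name not in newest or version > newest[name]:
                    newest[name] = version
            matches = list(newest.items())
        # Combine the results of every glob into one response,
        # skipping entries that an earlier glob already returned.
        for match in matches:
            if match not in results:
                results.append(match)
    return results
```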
2022-08-23 22:47:46 +01:00
Achilleas Koutsou
d59d870574 dnf-json: fix depsolve error handling
When a DepsolveError exception occurs, the error message would print the
packages in the request.  When the request arguments changed, the error
message handling wasn't updated and would fail to produce the correct
error message.

Compile a list of packages from all transactions and print them in the
error message as a comma-separated list.
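
A rough sketch of that approach, assuming the "transactions" list and "package-specs" key from the new request layout. The helper names and the exact wording of the message are illustrative, not taken from dnf-json.

```python
def requested_packages(transactions):
    """Collect the package specs of every transaction, in request order."""
    packages = []
    for transaction in transactions:
        packages.extend(transaction.get("package-specs", []))
    return packages


def depsolve_error_reason(transactions):
    """Build an error message naming all requested packages."""
    # The message text is an assumption; only the comma-separated
    # list of packages is described in the commit message.
    return "There was a problem depsolving " + ", ".join(requested_packages(transactions))
```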
2022-06-27 20:41:34 +02:00
Achilleas Koutsou
7a70a5e69b dnfjson: drop repo checksums
The repository checksums in the response from dnf-json aren't used
anywhere.  Since we're making changes to dnf-json and depsolving, now is
a good opportunity to drop them completely.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
1c4d8f9988 dnfjson: use repo config hash as repo ID
Defined a Hash() method on rpmmd.RepoConfig that calculates a SHA-256 ID
for a repository based on its configuration.  Identical configurations
should produce the same ID.  The Name and ImageTypeTags of a repository
aren't taken into account, since these attributes don't affect a
repository's functional configuration.
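
The idea can be sketched in Python (the real Hash() method is Go and may hash different fields in a different way). The dictionary keys "name" and "image_type_tags" and the JSON serialisation are assumptions chosen for the illustration.

```python
import hashlib
import json


def repo_config_hash(config):
    """Return a SHA-256 hex digest identifying a repository configuration."""
    # Drop the attributes that the ID deliberately ignores.
    relevant = {
        key: value
        for key, value in config.items()
        if key not in ("name", "image_type_tags")
    }
    # sort_keys makes the serialisation independent of key order, so
    # identical configurations always produce identical bytes.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two configurations that differ only in name then share an ID, while a change to, say, the base URL produces a different one.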

This ID lets us change the way we handle repository configurations in a
few places:
- Preparing the depsolve job arguments is simpler since we have
  predictable IDs for the repository configurations.  We don't need to
  rely on the index of a RepoConfig in a list to identify or access it,
  which prevented us from building a list of all repository
  configurations, since we needed them to be placed in the list in a
  certain order.
- Associating packages from the depsolve result with the repository
  configuration (in depsToRPMMD) no longer relies on an ID string
  converted from and back to an integer index.  Repositories define
  their own IDs.
- Tests are a bit messier now but the changes simplify the main code, so
  it's an acceptable trade-off.
    - Fixtures need to change based on the repository configuration for
      the test.
    - We need to calculate the ID for the repository configuration for
      the temporary file server URL.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
46e4f0cf5e dnf-json: don't print success messages
They just make noise.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
61d7c465af dnfjson: remove single Depsolve function and command
Remove the single Depsolve function from the dnfjson package and the
depsolve command from the dnf-json tool.  The new ChainDepsolve
functions and chain-depsolve command can handle single depsolves in the
same way so there's no need to keep (and have to maintain) two versions
of very similar code.

The ChainDepsolve function (in Go) and chain-depsolve command (in
Python) have been renamed to plain Depsolve and depsolve respectively,
since they are now general purpose depsolve functions.
2022-06-01 11:36:52 +01:00
Achilleas Koutsou
82007dcf46 dnf-json: convert to single-use depsolve script
- Removed server class and handlers
    The dnf-json Python script will no longer run as a service.  In the
    future, we will create a service in the Go package that will handle
    receiving requests and calling the script accordingly.
- Removed CacheState class
- Added standalone functions for setting up cache and running the
  depsolve
- Validate the input before reading
- Print all messages (status and error) to stderr and print only the
  machine-readable results to stdout (including structured error)
    The status messages on stderr are useful for troubleshooting.  When
    called from the service they will appear in the log/journal.
- Catch RepoError exceptions
    This can occur when dnf fails to load the repository configuration.
- Support multiple depsolve jobs per request
    The structure is changed to support making multiple depsolve
    requests but reuse the dnf.Base object to make chained (incremental)
    dependency resolution requests.

Before:
{
  "command": "depsolve",
  "arguments": {
    "package-specs": [...],
    "exclude-specs": [...],
    "repos": [{...}],
    "cachedir": "...",
    "module_platform_id": "...",
    "arch": "..."
  }
}

After:
{
  "command": "depsolve",
  "cachedir": "...",
  "module_platform_id": "...",
  "arch": "...",
  "arguments": {
    "repos": [{...}],
    "transactions": [
      {
        "package-specs": [...],
        "exclude-specs": [...],
        "repo-ids": [...]
      }
    ]
  }
}
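
For illustration, a request in the old layout could be rewritten into the new one roughly as follows. The function name upgrade_request and the "id" key on each repo are assumptions; the message elides the repo fields, so the mapping to "repo-ids" is a guess at the intent.

```python
def upgrade_request(old):
    """Convert a request in the 'Before' layout to the 'After' layout."""
    args = old["arguments"]
    return {
        "command": old["command"],
        # These three move from "arguments" to the top level.
        "cachedir": args["cachedir"],
        "module_platform_id": args["module_platform_id"],
        "arch": args["arch"],
        "arguments": {
            "repos": args["repos"],
            # The old single depsolve becomes a one-element chain.
            "transactions": [
                {
                    "package-specs": args["package-specs"],
                    "exclude-specs": args.get("exclude-specs", []),
                    # Assumption: each repo carries an "id" and the single
                    # transaction may use every repository.
                    "repo-ids": [repo["id"] for repo in args["repos"]],
                }
            ],
        },
    }
```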

Signed-off-by: Achilleas Koutsou <achilleas@koutsou.net>
2022-06-01 11:36:52 +01:00
Tomas Hozza
d48da99a12 rpmmd/dnf-json: support chain dependency solving
Add a new `rpmmdImpl` method `chainDepsolve`, which is able to
depsolve multiple chained package sets as separate DNF transactions
layered on top of each other.

This new method makes it possible to depsolve the `blueprint` package
set on top of the base image package set (usually called `packages`).

Introduce a helper function `chainPackageSets` for constructing
arguments to the `chainDepsolve` method based on the provided arguments:
 - slice of package set names to chain as transactions
 - map of package sets
 - slice of system repositories used by all package sets
 - map of package-set-specific repositories

Extend `dnf-json` with a new command, `chain-depsolve`, which depsolves
multiple transactions in a row, layered on top of each other.

Add unit tests where appropriate.
2022-04-28 14:42:49 +02:00
Tom Gundersen
2a4d4c4d49 dnf-json: use the default connection timeout
By default `timeout` is 30 seconds, but we had it set to 5. Drop
the override and use the default.

This has two effects: it increases the time before we give up on
connecting (as it says on the tin), and it also increases how long a
download has to be slow before we give up.

Internally, we were seeing failures in downloading metadata from ODCS
and similar issues have occurred in CI too.

The potential downside is that, with several mirrors, it takes longer
to give up on a bad one and try a better one. But slow is better than
broken, so for now revert to the default behavior.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2022-03-12 09:09:13 +01:00
Tomas Hozza
180290d016 dnf-json: use repository name from the request if provided
Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-03-12 08:36:40 +01:00
Tomas Hozza
f9d0412316 dnf-json: do not use reponame as repoid.
Repo name defaults to the repo ID if the name is not set. `dnf-json`
should not rely on the `reponame` being set to the ID and instead
return the actual `repoid`.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
2022-03-12 08:36:40 +01:00
Achilleas Koutsou
7267fec608 dnf-json: disable some pylint checks
invalid-name: script name is "unpythonic" since it contains a -, but
that's fine.
too-many-arguments: also fine.
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
74d8a1a462 dnf-json: add __init__ for DnfJsonRequestHandler to define cache_dir
Add a small __init__ for our subclass to define our one custom
attribute.
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
b34150be6e dnf-json: fix small type mismatch in null value assignment
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
7346171bd2 dnf-json: staticify methods that don't need to be instance methods
These two methods don't rely on the object instance at all so they
should be static.
The _timestamp_to_rfc() method can be a one-liner.
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
df935627c4 dnf-json: codestyle: whitespace and blank line fixes
Whitespace around operators and after commas.
No whitespace after opening and before closing brackets.
Two blank lines between top-level functions and classes.
One blank line between class methods.
Indentation fixes.
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
447df031dd dnf-json: CacheState factory as classmethod
In this case it might be functionally equivalent, but it's generally
nicer to have factory methods as class methods.
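
A small sketch of the pattern, not the removed CacheState class itself. The JSON file format and the folders attribute are assumptions; the point is that a classmethod factory builds instances through cls, so subclasses get instances of their own type.

```python
import json


class CacheState:
    def __init__(self, folders=None):
        # folders is a hypothetical mapping kept by the state object.
        self.folders = folders if folders is not None else {}

    @classmethod
    def load(cls, path):
        """Factory: build a state object from a JSON file, or an empty one."""
        try:
            with open(path, encoding="utf-8") as state_file:
                return cls(json.load(state_file))
        except FileNotFoundError:
            return cls()

    def store(self, path):
        """Write the state to a JSON file."""
        with open(path, "w", encoding="utf-8") as state_file:
            json.dump(self.folders, state_file)
```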
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
3268c1f28f dnf-json: shorten CacheState loading and saving method names
CacheState.load_cache_state_from_disk() is long and redundant.
CacheState.store_on_disk() is fine (and load_from_disk() would also be
fine) but in the absence of any other store/load sources, the
from_disk() part is also unnecessary.
CacheState.store() and CacheState.load() should be enough.
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
43a90ed473 dnf-json: remove mutable default argument value
Mutable values should not be used as default function arguments.
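
The pitfall is easy to reproduce. In the sketch below (function names are illustrative and not from dnf-json), the first version shares one list across calls, while the second creates a fresh list per call.

```python
def add_repo_shared(repo, repos=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit `repos` appends to the same list.
    repos.append(repo)
    return repos


def add_repo(repo, repos=None):
    # A None default plus a fresh list per call avoids the shared state.
    if repos is None:
        repos = []
    repos.append(repo)
    return repos
```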
2022-03-08 12:42:12 +01:00
Achilleas Koutsou
1b86423d67 dnf-json: import cleanup
Removed unused imports: pathlib, queue, and datetime
Reorganised imports into 3 sections:
1. stdlib modules
2. stdlib submodule
3. foreign modules

Each section is sorted alphabetically.
2022-03-08 12:42:12 +01:00
Djebran Lezzoum
d8fdb03373 dnf-json: Add repo_id to dump package.
The dump function will be used to search packages. As we are
implementing third-party repositories, adding the repo_id of each
package allows us to identify the repository it came from.
2022-02-21 15:46:53 +01:00
sanne
fe00e1efd3 containers/osbuild-composer: Allow dnf-json to accept http connections
Revert 83e16afda4: With dnf-json running
in a container it's easy to run it standalone.
2022-02-02 11:15:46 +01:00
Thomas Lavocat
bcf34f8c6c dnf-json: delete unused cache folders
Detect folders that have not been used for a given timeout and delete them.
The cache folder must be empty when dnf-json is started in order to
avoid the situation where some folders can never be cleaned up (dnf-json
does not look at the cache directory content but uses information from
the requests to deduce which folders to keep and to delete).
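
The bookkeeping described here might look roughly like the sketch below. The function name, the last_used mapping and the timeout in seconds are assumptions for illustration; the actual implementation may track usage differently.

```python
import os
import shutil
import time


def clean_unused(cache_root, last_used, timeout, now=None):
    """Delete cache folders not used within `timeout` seconds.

    last_used maps folder names to the time of the last request that
    used them. Folders absent from the mapping are never touched, which
    is why the cache root has to start out empty.
    """
    now = time.time() if now is None else now
    removed = []
    # Iterate over a copy, since entries are deleted along the way.
    for folder, used_at in list(last_used.items()):
        if now - used_at > timeout:
            shutil.rmtree(os.path.join(cache_root, folder), ignore_errors=True)
            del last_used[folder]
            removed.append(folder)
    return removed
```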

Solves #2020
2022-01-03 16:00:38 +01:00
sanne
83e16afda4 dnf-json: Can be started without systemd
Instead of starting the socket in the entrypoint, make dnf-json able to
bind on the unixsocket by itself.
2021-12-15 09:41:32 +01:00
Thomas Lavocat
0877ae3ac0 dnf-json: Avoid leaking memory on the Cpp side
To avoid dnf leaking memory, dnf-json as a service calls fork() on each
request. This allows memory to be freed automatically when the process
handling the request is destroyed.
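
A POSIX-only sketch of the fork-per-request idea. The helper handle_in_child and the pipe used to return the result are assumptions; the service described here answers over a socket, which is left out to keep the example self-contained.

```python
import os


def handle_in_child(handler, request):
    """Run handler(request) in a forked child and return its output bytes."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the work, send the result back, and exit immediately
        # so that all memory allocated for this request is released.
        os.close(read_fd)
        status = 0
        try:
            os.write(write_fd, handler(request))
        except Exception:
            status = 1
        finally:
            os._exit(status)
    # Parent: read the child's output until EOF, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return data
```

Whatever the handler leaks stays in the child's address space and disappears when the child exits; the long-running parent never loads the leaking library code paths for that request.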
2021-12-15 09:41:32 +01:00
Thomas Lavocat
f8281eee54 dnf-json: refactor
Prepare the multi-cache architecture by doing some refactoring.
Mainly this commit adds a solver class that embeds all the logic around
dnf. Responsibilities of communicating on the socket and depsolving are
separated.
2021-12-15 09:41:32 +01:00
Thomas Lavocat
ca126e9747 dnf-json: Change dnf-json to be a daemon
The service is started via systemd activation sockets.
The service serves http POST requests, the same json as before is
expected as the body of the request, and the same json as before is sent
as the response of the request.
2021-12-15 09:41:32 +01:00
Tom Gundersen
e76543d779 dnf-json: expire metadata by default
Never expiring metadata by default leads to surprising behavior,
especially for our long-running services. The overhead of expiration
is small but noticeable, so attempt a compromise.

This should all be revisited to make dnf-json handle caches better
and be more performant.
2021-10-04 16:02:31 +02:00
Ondřej Budai
4f8dc76ca7 dnf-json: disable zchunk
See the comment

Signed-off-by: Ondřej Budai <ondrej@budai.cz>
2021-10-01 15:23:53 +02:00
Lars Karlitski
b5bd00d739 dnf-json: don't initialize dnf plugins
acf91a4 enabled fastestmirror but also calls `base.init_plugins()` to
initialize dnf plugins. This is not necessary and not what we want
conceptually.

Not necessary, because `fastestmirror` is a dnf built-in (it was a
plugin during yum-times [1]). The same patch sets the `fastestmirror`
option as well. Thus, this patch does not revert functionality.

Not what we want, because we're using dnf more as a library, explicitly
passing all options. Plugins depend on additional host configuration,
which we'd like to avoid pulling in. In particular, the
subscription-manager plugin tries reading certificates in `/etc/pki`,
which are not readable by the `osbuild-composer` user. This leads to
these errors in the journal:

    [ERROR] dnf-json:54297:MainThread @logutil.py:194 -
      [Errno 13] Permission denied: '/var/log/rhsm/rhsm.log' -
      Further logging output will be written to stderr
    [ERROR] dnf-json:54297:MainThread @identity.py:156 -
      Reload of consumer identity cert /etc/pki/consumer/cert.pem
      raised an exception with msg:
      [Errno 13] Permission denied: '/etc/pki/consumer/key.pem'

These errors are not fatal, but could confuse people when inspecting
logs to find unrelated problems. This patch makes them disappear.

[1] https://fedoraproject.org/wiki/Yum_to_DNF_Cheatsheet
2020-08-23 16:08:25 +02:00
Major Hayden
84022a7889 dnf-json: flake8 cleanup
Signed-off-by: Major Hayden <major@redhat.com>
2020-07-10 12:20:02 -05:00
Major Hayden
acf91a4e54 🏃 Enable fastestmirror in dnf-json
The time it takes to depsolve a blueprint varies widely depending on
where the job is running and which mirrors are randomly chosen based on
the data returned in the metalink XML.

Use dnf's fastestmirror plugin to choose the fastest mirror for
downloading metadata. This returns consistent results in PSI + AWS and
every depsolve completed in under 60 seconds after 25 tests in each
cloud.

Fixes #845.

Signed-off-by: Major Hayden <major@redhat.com>
2020-07-10 12:20:02 -05:00
Martin Sehnoutka
607b4ed935 dnf-json: change confusing error message associated with dnf errors
The issue was introduced in 0d3c8329c0.
The patch correctly changed the base exception class, but it didn't
change the unfortunate use of hardcoded type name. This patch uses
Python's internal `__name__` attribute to get the type (exception) name.
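
As an illustration of the technique (the response keys "kind" and "reason" and the stand-in exception class are assumptions, used so the sketch runs without dnf installed):

```python
def error_response(exc):
    """Build a structured error whose kind follows the exception's class."""
    # type(exc).__name__ tracks the actual exception type, so the
    # message stays correct when the caught base class changes.
    return {"kind": type(exc).__name__, "reason": str(exc)}


class DepsolveError(Exception):
    """Stand-in for a dnf exception class."""
```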
2020-06-26 20:36:35 +02:00
Jacob Kozol
8750dc467b dnf-json: add ssl certs to repo
If a repo passed to dnf-json contains an sslcacert, sslclientkey, or
sslclientcert then dnf-json will include those values in that repo in
the dnf base.
2020-05-28 00:23:54 +02:00
Tom Gundersen
bb85acf36f dnf-json: set metadata_expire
We were using dnf's default of 48h, but that does not work for
updates repositories, as they depend on an expiration time of 6h.

Allow the metadata_expire value to be configured per repository.
If the value is unset, then never expire the metadata. Set the
value to 6h for all the fedora testing repos.

This fixes issue #476.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-04-11 19:14:02 +02:00
Lars Karlitski
0d3c8329c0 dnf-json: change base error type
Even though `dnf.exceptions.RepoError` is documented as the base error,
`dnf.exceptions.Error` is actually the base error (and also documented
as such).
2020-04-07 14:40:12 +02:00
Martin Sehnoutka
7e69299259 dnf-json: allow passing arch as an argument
dnf can do cross-arch depsolving, but in case repositories for multiple
arches are provided and it is running on a different architecture than
the image build requires, it can lead to errors.

This patch makes sure that we only include packages from the
architecture we want.

It uses substitutions as defined in dnf documentation:
https://dnf.readthedocs.io/en/latest/api_conf.html#dnf.conf.Conf.substitutions
Unfortunately the docs are very sparse on details and the Fedora docs
are not updated any more:
https://docs.fedoraproject.org/en-US/Fedora/26/html/System_Administrators_Guide/sec-Using_DNF_Variables.html
Also dnf team is migrating the code to libdnf:
https://github.com/rpm-software-management/libdnf
which does not yet have any documentation.
2020-03-24 20:45:30 +01:00
Brian C. Lane
8f8187061f dnf-json: Return an error when repo setup fails
2020-03-18 20:42:09 +01:00
Tom Gundersen
5d179428be rpmmd: drop the Name attribute from RepoConfig
This was never actually used anywhere, as passing it to dnf-json
was a noop.

We may want to reconsider the concept of a source/repo name and
how it differs from an ID, but for now drop the name.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-15 23:48:42 +01:00
Tom Gundersen
7ea74cd131 dnf-json: pass back the repo_id and the relativepath of each package
This will eventually replace the remote_location property. The latter
pins a specific location (a specific mirror), but the two former
can together be used to re-resolve to a more suitable mirror at the
time/place the package will actually be downloaded.

Rather than pinning mirrors in the osbuild manifests, we want to be
able to include the metalink and relative locations so each worker
can use mirrors closer to them.

This would be particularly important when pipelines are rebuilt in
the future, and the best mirrors may have changed.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-15 23:48:42 +01:00
Tom Gundersen
819430e659 rpmmd: no longer flush the caches on every call
When we used the dnf-based pipelines, we were relying on the fact
that the metadata was unlikely to have changed between the time we
generated the pipeline and the time we called osbuild. We achieved this
by always updating to the most recent metadata on every call to
rpmmd.Depsolve that would end up in a pipeline.

Refreshing the metadata is time-consuming, and something we want
to avoid if at all possible. Now that our pipelines no longer
rely on this property, we can drop the flushing.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-15 19:38:59 +01:00
Tom Gundersen
0b3d2be698 dnf-json: avoid randomizing package order
We want depsolving via dnf-json, followed by rpm installation to be
the same as installing directly with dnf. However, the `install_set()`
helper we used inserts the list of packages into a set internally
before returning it to us to iterate. Set iteration order is not
FIFO in Python, and because the order of package installation
in rpm is only a partial order, we ended up with different images
depending on whether we installed through dnf or directly via rpm.

To avoid the indirection via a set, open-code `install_set()` without
the intermediate allocation.
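
The difference can be sketched without dnf. Here transaction_items is a hypothetical list of (action, package) pairs standing in for a resolved transaction; the real code iterates dnf's transaction object instead.

```python
def install_order(transaction_items):
    """Return the packages to install, preserving transaction order."""
    # Building a list keeps the order in which items were resolved;
    # collecting into a set() first would not guarantee that order.
    return [pkg for action, pkg in transaction_items if action == "install"]
```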

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-02 17:44:36 +01:00
Tom Gundersen
1ce84a5eff dnf-json: mark as executable
Allow this helper to be used easily in other scripts.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-03-02 17:44:36 +01:00
Tom Gundersen
44c03cf61e dnf-json: make cachedir mandatory
Without passing in a cachedir, dnf would create a random one for every
invocation. This meant that caches were never reused, nor cleaned up
properly.

Let systemd create a cache directory for us in /var/cache/ and use
that via the environment variable systemd sets for us.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-20 15:26:54 +01:00
Tom Gundersen
cdd1912e78 dnf-json: make independent from the host
We must avoid depending on the host's state in any way. This achieves
isolation in the following ways:
 - rather than the default config file /dev/null is used
 - rather than sharing the host persistent state dir a temporary one
   is used and thrown away for each call
 - the module_platform_id is set explicitly per supported distro, rather
   than taken from /etc/os-release.

Optionally, the cache directory can be configured, as we may want to keep
this separate from the host, if for no other reason than accounting.
However, the cache appears to be well-behaved, so we can keep sharing
it between calls (or even with the host). This speeds up things
considerably, so this is definitely what we want.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-14 14:43:27 +01:00
Tom Gundersen
b6d9268810 dnf-json: support excluding packages
In our base distro definitions we exclude packages in addition to
including them. Extend dnf-json to support this, so we can depsolve
the base package set as well as the packages added in blueprints.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-14 14:43:27 +01:00
Tom Gundersen
b4bb73a195 dnf-json: expose each RPM location and content hash
In addition to the NEVRA, include the location and hash of the rpm
file. This allows us to separately fetch and verify that references
to RPMs are correct, as the NEVRA alone is not sufficient for fetching
or verifying.
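
Verification on the consumer side could look like this sketch. The "<algorithm>:<hexdigest>" checksum format and the function name are assumptions for illustration; the message does not specify how the hash is encoded.

```python
import hashlib


def verify_rpm(data, checksum):
    """Check downloaded bytes against an '<algorithm>:<hexdigest>' string."""
    algorithm, _, expected = checksum.partition(":")
    # hashlib.new raises ValueError for an unknown algorithm name.
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == expected
```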

This is a prerequisite for using the rpm rather than the dnf stage
in our osbuild pipelines.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2020-02-14 14:43:27 +01:00