The idea is that source can themselves spawn other modules, esp.
new secrets modules. For this they need to know the library dir,
aka 'libdir' throughout the osbuild source. Therefore change the
SourceServer to directly get the library directory instead of
just the sub-directory to the sources. Then pass the library
directory to via the JSON API to the source.
Adjust all usage of the SourceServer, including the tests.
This changes the sources module to explicitly cleanup event-loops.
Additionally, the implementation is protected against re-entrency which
we do not support (and do not need).
We did occasionally get the following exception when running
source-servers:
/usr/lib/python3.8/asyncio/base_events.py:654: ResourceWarning: unclosed event loop <_UnixSelectorEventLoop running=False closed=False debug=False>
_warn(f"unclosed event loop {self!r}", ResourceWarning, source=self)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
Exception ignored in: <function BaseEventLoop.__del__ at 0x7f92589d14c0>
Traceback (most recent call last):
File "/usr/lib/python3.8/asyncio/base_events.py", line 656, in __del__
self.close()
File "/usr/lib/python3.8/asyncio/unix_events.py", line 58, in close
super().close()
File "/usr/lib/python3.8/asyncio/selector_events.py", line 92, in close
self._close_self_pipe()
File "/usr/lib/python3.8/asyncio/selector_events.py", line 99, in _close_self_pipe
self._remove_reader(self._ssock.fileno())
File "/usr/lib/python3.8/asyncio/selector_events.py", line 274, in _remove_reader
key = self._selector.get_key(fd)
File "/usr/lib/python3.8/selectors.py", line 190, in get_key
return mapping[fileobj]
File "/usr/lib/python3.8/selectors.py", line 71, in __getitem__
fd = self._selector._fileobj_lookup(fileobj)
File "/usr/lib/python3.8/selectors.py", line 225, in _fileobj_lookup
return _fileobj_to_fd(fileobj)
File "/usr/lib/python3.8/selectors.py", line 42, in _fileobj_to_fd
raise ValueError("Invalid file descriptor: {}".format(fd))
ValueError: Invalid file descriptor: -1
This is triggered when an event-loop is not closed explicitly via
`event_loop.close()`. It then tries to cleanup explicitly. The problem
here is that python has no knowledge of in which order it should
collect GC'ed objects. This might end up more or less random. Therefore,
file-descriptors might be closed in arbitrary order, leading to the
event-loop being unable to unregister its internal objects.
I am not entirely sure whether this is the case here. However, the error
definitely triggers on the internal event-loop socketpair, which there
is no other external access to. Furthermore, this socketpair is only set
to -1 in its own __del__ function. So unless we have a memory
corruption, I see nothing else that could trigger this.
With this fix in place, I can run `test_sources.py` in a loop without
triggering the bug.
It is quite likely that our other `*Server` classes need the same fix. I
did not verify, yet.
This source adds support for downloaded files. The files are
indexed by their content hash, and the only option is their URL.
The main usecase for this will be downloading rpms. Allowing depsolving
to be done outside of osbuild, network access to be restricted and
downloaded rpms to be reused between runs.
Each source is now passed two additional arguments, a cache directory
and an output directory. Both are in the source's namespace, and
the source is responsible for managing them. Each directory may
contain contents from previous runs, but neither is ever guaranteed
to do so.
Downloaded contents may be saved to the cache and resued between
runs, and the requested content should be written to the output dir.
If secrets are used, the source must only ever write contents to
the output that corresponds to the available secrets (rather than
contents from the cache from previous runs).
Each stage is passed an additional argument, a sources directory.
The directory is read-only, and contains a subdirectory named after
each used source, which will contain the requseted contents when
the `Get()` call returns (if the source uses this functionality).
Based on a patch by Lars Karlitski.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Add a new command line option `--secrets`, which accepts a JSON file
that is structured similarly to a source file. It is should contain data
that is necessary to fetch content, but shouldn't appear in any logs.
This might (hopefully) fix a race in destructing the asyncio.EventLoop
that's used in all API classes, which leads to warnings about unhandled
exceptions on CI.
This also puts their creation closer to where the client-side sockets
are created.
Pipelines encode which source content they need in the form of
repository metadata checksums (or rpm checksums). In addition, they
encode where they fetch that source content from in the form of URLs.
This is overly specific and doesn't have to be in the pipeline's hash:
the checksum is enough to specify an image.
In practice, this precluded using alternative ways of getting at source
packages, such as local mirrors, which could speed up development.
Introduce a new osbuild API: sources. With it, a stage can query for a
way to fetch source content based on checksums.
The first such source is `org.osbuild.dnf`, which returns repository
configuration for a metadata checksum. Note that the dnf stage continues
to verify that the content it received matches the checksum it expects.
Sources are implemented as programs, living in a `sources` directory.
They are run on the host (i.e., uncontained) right now. Each source gets
passed options, which are taken from a new command line argument to
osbuild, and an array of checksums for which to return content.
This API is only available to stages right now.