The existing jsoncomm is a work of beautiy. For very big arguments
however the used `SOCK_SEQPACKET` hits the limitations of the
kernel network buffer size (see also [0]). This lead to various
workarounds in #824,#1331,#1836 where parts of the request are
encoded as part of the json method call and parts are done via
a side-channel via fd-passing.
This commit changes the code so that the fd channel is automatically
and transparently created and the workarounds are removed. A test
is added that ensures that very big messages can be passed.
[0] https://github.com/osbuild/osbuild/pull/1833
We recently hit the issue that `osbuild` crashed with:
```
Unable to decode response body "Traceback (most recent call last):
File \"/usr/bin/osbuild\", line 33, in <module>
sys.exit(load_entry_point('osbuild==124', 'console_scripts', 'osbuild')())
File \"/usr/lib/python3.9/site-packages/osbuild/main_cli.py\", line 181, in osbuild_cli
r = manifest.build(
File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 477, in build
res = pl.run(store, monitor, libdir, debug_break, stage_timeout)
File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 376, in run
results = self.build_stages(store,
File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 348, in build_stages
r = stage.run(tree,
File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 213, in run
data = ipmgr.map(ip, store)
File \"/usr/lib/python3.9/site-packages/osbuild/inputs.py\", line 94, in map
reply, _ = client.call_with_fds(\"map\", {}, fds)
File \"/usr/lib/python3.9/site-packages/osbuild/host.py\", line 373, in call_with_fds
kind, data = self.protocol.decode_message(ret)
File \"/usr/lib/python3.9/site-packages/osbuild/host.py\", line 83, in decode_message
raise ProtocolError(\"message empty\")
osbuild.host.ProtocolError: message empty
cannot run osbuild: exit status 1" into osbuild result: invalid character 'T' looking for beginning of value
...
input/packages (org.osbuild.files): Traceback (most recent call last):
input/packages (org.osbuild.files): File "/usr/lib/osbuild/inputs/org.osbuild.files", line 226, in <module>
input/packages (org.osbuild.files): main()
input/packages (org.osbuild.files): File "/usr/lib/osbuild/inputs/org.osbuild.files", line 222, in main
input/packages (org.osbuild.files): service.main()
input/packages (org.osbuild.files): File "/usr/lib/python3.11/site-packages/osbuild/host.py", line 250, in main
input/packages (org.osbuild.files): self.serve()
input/packages (org.osbuild.files): File "/usr/lib/python3.11/site-packages/osbuild/host.py", line 284, in serve
input/packages (org.osbuild.files): self.sock.send(reply, fds=reply_fds)
input/packages (org.osbuild.files): File "/usr/lib/python3.11/site-packages/osbuild/util/jsoncomm.py", line 407, in send
input/packages (org.osbuild.files): n = self._socket.sendmsg([serialized], cmsg, 0)
input/packages (org.osbuild.files): ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
input/packages (org.osbuild.files): OSError: [Errno 90] Message too long
```
The underlying issue is that the reply of the `map()` call is too
big for the buffer that `jsoncomm` uses. This problem existed before
for the args of map and was fixed by introducing a temporary file
in https://github.com/osbuild/osbuild/pull/1331 (and similarly
before in https://github.com/osbuild/osbuild/pull/824).
This commit writes the return values also into a file. This should
fix the crash above and make the function more symetrical as well.
Alternative/complementary version of
https://github.com/osbuild/osbuild/pull/1833
Closes: HMS-4537
This is a drive-by change after spending some quality time with the
mount code. The `id` field of `Mount` is calculated only once and
only when creating a `Mount`. This seems slightly dangerous as
any change to an attribute after creation will not update the
id. This means two options:
1. dynamically update the `id` on changes
2. forbid changes after the `id` is calculcated
I went with (2) but happy to discuss of course but it seems more
the spirit of the class.
It also does the same change for "devices.Device"
Prior this commit, the arguments for the input service were passed inline.
However, jsoncomm uses the SOCK_SEQPACKET socket type underneath that has
a fixed maximum packet size. On my system, it's 212960 bytes. Unfortunately,
that's not enough for big inputs (e.g. when building packages with a lot
of rpms).
This commit moves all arguments to a temporary file. Then, just a file
descriptor is sent. Thus, we are now able to send arbitrarily sized args
for inputs, making osbuild work even for large image builds.
Introduce a new class to manage inputs, `InputManger` and move the
code to map inputs from the `Input` here. The main insight of why
the logic should be place here is that certain information is needed
to map inputs, independently of specific type: the path to the input
directory, `root`, the store API, `storeapi` and the service manager
instance to start the actual service. Instead of passing all this
information again and again to the `Input` class, we now have a
specialized (service) manager class for inputs that has all the
needed information all the time.
Create a `InputService` class with an abstract method called `map`,
meant to be implemented by all inputs. An `unmap` method may be
optionally overridden by inputs to cleanup resources.
Instantiate a `host.ServiceManager` in the `Stage.run` section and
pass the to the host side input code so it can be used to spawn the
input services.
Convert all existing inputs to the new service framework.
Instead of bind-mounting each individual input into the container,
create a temporary directory that is used by all inputs and bind-
mount this to the well known location ("/run/osbuild/inputs"). The
temporary directory is then passed to the input so that it can
make the requested resources available relative to that directory.
This is enforced by the common input handling code.
Additionally, pass the well known input path via a new "paths" key
to the arguments dictionary passed to the stage.
This helper property is misleading since it is not the name of the
input in the context of the manifest, but actually "type". Name is
a left-over from the nomenclature of format v1, where the type of
stages and inputs was called `name`.
The `info` parameter for Input constructor is of type `PModuleInfo`,
which is located in `meta`. This in turn imports jsonschema. Ergo,
importing importing `inputs` will create a dependency on jsonschema.
At the same time the `osbuild` package globally imports `Pipeline`,
via `__init__.py`, and `osbuild.api` is used in the runners and
stages, which are run inside the buildroot. If one now wanted to
use `inputs` from `Pipeline`, it would lead to jsonschema being
imported (via `meta`) which might not be available on the build-
root and it is a rather random dependency to have.
On obvious solution would be to use a construct with `TYPE_CHECKING`,
a la:
if TYPE_CHECKING:
from .meta import ModuleInfo
Sadly, pylint will now complain about it. This could be fixed with:
if TYPE_CHECKING:
from .meta import ModuleInfo
else:
ModuleInfo = "osbuild.meta.ModuleInfo"
But this is just gross. So we will have to accept that Python is,
well, Python und omit the type information for the `info` param.
Currently all options for inputs are totally opaque to osbuild
itself. This is neat from a seperation of concerns point of view
but has one major downside: osbuild can not verify the integrity
of the pipeline graph, i.e. if all inputs that need pipelines or
sources do indeed exists. Therefore intrdouce two generic fields
for inputs: `origin` and `references`. The former can either be
a source or a pipeline. The latter is an array of identifiers or
a dictionary where the keys are the identifiers and the values
are additional options for that id. The identifiers then refer
to either resources obtained via a source or a pipeline that has
already been built.
Add an `id` property that, like `Stage.id`, can be used to uniquely
identify an input based on its name and options. Two stages with
the same name and options will have the same `id`.
A pipeline input provides data in various forms to a `Stage`, like
files, OSTree commits or trees. The content can either be obtained
via a `Source` or have been built by a `Pipeline`. Thus an `Input`
is the bridge between various types of content that originate from
different types of sources.
The acceptable origin of the data is determined by the `Input`
itself. What types of input are allowed and required is determined
by the `Stage`.
To osbuild itself this is all transparent. The only data visible to
osbuild is the path. The input options are just passed to the
`Input` as is and the result is forwarded to the `Stage`.