Commit graph

16 commits

Author SHA1 Message Date
Michael Vogt
0abdfb9041 jsoncomm: transparently handle huge messages via fds
The existing jsoncomm is a work of beautiy. For very big arguments
however the used `SOCK_SEQPACKET` hits the limitations of the
kernel network buffer size (see also [0]). This lead to various
workarounds in #824,#1331,#1836 where parts of the request are
encoded as part of the json method call and parts are done via
a side-channel via fd-passing.

This commit changes the code so that the fd channel is automatically
and transparently created and the workarounds are removed. A test
is added that ensures that very big messages can be passed.

[0] https://github.com/osbuild/osbuild/pull/1833
2024-09-17 19:27:03 +02:00
Michael Vogt
88c35ea306 osbuild: make inputs map() function use fd for reply as well
We recently hit the issue that `osbuild` crashed with:
```
Unable to decode response body "Traceback (most recent call last):
  File \"/usr/bin/osbuild\", line 33, in <module>
    sys.exit(load_entry_point('osbuild==124', 'console_scripts', 'osbuild')())
  File \"/usr/lib/python3.9/site-packages/osbuild/main_cli.py\", line 181, in osbuild_cli
    r = manifest.build(
  File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 477, in build
    res = pl.run(store, monitor, libdir, debug_break, stage_timeout)
  File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 376, in run
    results = self.build_stages(store,
  File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 348, in build_stages
    r = stage.run(tree,
  File \"/usr/lib/python3.9/site-packages/osbuild/pipeline.py\", line 213, in run
    data = ipmgr.map(ip, store)
  File \"/usr/lib/python3.9/site-packages/osbuild/inputs.py\", line 94, in map
    reply, _ = client.call_with_fds(\"map\", {}, fds)
  File \"/usr/lib/python3.9/site-packages/osbuild/host.py\", line 373, in call_with_fds
    kind, data = self.protocol.decode_message(ret)
  File \"/usr/lib/python3.9/site-packages/osbuild/host.py\", line 83, in decode_message
    raise ProtocolError(\"message empty\")
osbuild.host.ProtocolError: message empty
cannot run osbuild: exit status 1" into osbuild result: invalid character 'T' looking for beginning of value
...
input/packages (org.osbuild.files): Traceback (most recent call last):
input/packages (org.osbuild.files):   File "/usr/lib/osbuild/inputs/org.osbuild.files", line 226, in <module>
input/packages (org.osbuild.files):     main()
input/packages (org.osbuild.files):   File "/usr/lib/osbuild/inputs/org.osbuild.files", line 222, in main
input/packages (org.osbuild.files):     service.main()
input/packages (org.osbuild.files):   File "/usr/lib/python3.11/site-packages/osbuild/host.py", line 250, in main
input/packages (org.osbuild.files):     self.serve()
input/packages (org.osbuild.files):   File "/usr/lib/python3.11/site-packages/osbuild/host.py", line 284, in serve
input/packages (org.osbuild.files):     self.sock.send(reply, fds=reply_fds)
input/packages (org.osbuild.files):   File "/usr/lib/python3.11/site-packages/osbuild/util/jsoncomm.py", line 407, in send
input/packages (org.osbuild.files):     n = self._socket.sendmsg([serialized], cmsg, 0)
input/packages (org.osbuild.files):         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
input/packages (org.osbuild.files): OSError: [Errno 90] Message too long
```

The underlying issue is that the reply of the `map()` call is too
big for the buffer that `jsoncomm` uses. This problem existed before
for the args of map and was fixed by introducing a temporary file
in https://github.com/osbuild/osbuild/pull/1331 (and similarly
before in https://github.com/osbuild/osbuild/pull/824).

This commit writes the return values also into a file. This should
fix the crash above and make the function more symetrical as well.

Alternative/complementary version of
https://github.com/osbuild/osbuild/pull/1833

Closes: HMS-4537
2024-08-13 13:13:24 +02:00
Michael Vogt
f5d6d11f1d osbuild: error when {Device,Mount} is modified after creation
This is a drive-by change after spending some quality time with the
mount code. The `id` field of `Mount` is calculated only once and
only when creating a `Mount`. This seems slightly dangerous as
any change to an attribute after creation will not update the
id. This means two options:
1. dynamically update the `id` on changes
2. forbid changes after the `id` is calculcated

I went with (2) but happy to discuss of course but it seems more
the spirit of the class.

It also does the same change for "devices.Device"
2024-01-19 02:54:26 +01:00
Ondřej Budai
c90b587dcc inputs: Move arguments for InputService.map to a temporary file
Prior this commit, the arguments for the input service were passed inline.
However, jsoncomm uses the SOCK_SEQPACKET socket type underneath that has
a fixed maximum packet size. On my system, it's 212960 bytes. Unfortunately,
that's not enough for big inputs (e.g. when building packages with a lot
of rpms).

This commit moves all arguments to a temporary file. Then, just a file
descriptor is sent. Thus, we are now able to send arbitrarily sized args
for inputs, making osbuild work even for large image builds.
2023-06-27 10:56:10 +02:00
Simon de Vlieger
ea6085fae6 osbuild: run isort on all files 2022-09-12 13:32:51 +02:00
Simon de Vlieger
3fd864e5a9 osbuild: fix optional-types
Optional types were provided in places but were not always correct. Add
mypy checking and fix those that fail(ed).
2022-07-13 17:31:37 +02:00
Christian Kellner
7eb58ea348 inputs: introduce new input manager class
Introduce a new class to manage inputs, `InputManger` and move the
code to map inputs from the `Input` here. The main insight of why
the logic should be place here is that certain information is needed
to map inputs, independently of specific type: the path to the input
directory, `root`, the store API, `storeapi` and the service manager
instance to start the actual service. Instead of passing all this
information again and again to the `Input` class, we now have a
specialized (service) manager class for inputs that has all the
needed information all the time.
2022-06-25 02:21:17 +02:00
Christian Kellner
1ed85dc790 inputs: convert to host service
Create a `InputService` class with an abstract method called `map`,
meant to be implemented by all inputs. An `unmap` method may be
optionally overridden by inputs to cleanup resources.
Instantiate a `host.ServiceManager` in the `Stage.run` section and
pass the to the host side input code so it can be used to spawn the
input services.
Convert all existing inputs to the new service framework.
2021-06-09 18:37:47 +01:00
Christian Kellner
08bc9ab7d8 inputs: pre-defined input paths
Instead of bind-mounting each individual input into the container,
create a temporary directory that is used by all inputs and bind-
mount this to the well known location ("/run/osbuild/inputs"). The
temporary directory is then passed to the input so that it can
make the requested resources available relative to that directory.
This is enforced by the common input handling code.
Additionally, pass the well known input path via a new "paths" key
to the arguments dictionary passed to the stage.
2021-06-09 18:37:47 +01:00
Christian Kellner
ef5e9364bb inputs: make inputs aware of their names
The name of the input here refers to its id within the manifest. This
is unique per stage and thus identifies a input for a given stage.
2021-06-09 18:37:47 +01:00
Christian Kellner
8c1a0a2eeb inputs: remove info.name proxy property
This helper property is misleading since it is not the name of the
input in the context of the manifest, but actually "type". Name is
a left-over from the nomenclature of format v1, where the type of
stages and inputs was called `name`.
2021-06-09 18:37:47 +01:00
Christian Kellner
ad8baab819 inputs: remove type info for info param
The `info` parameter for Input constructor is of type `PModuleInfo`,
which is located in `meta`. This in turn imports jsonschema. Ergo,
importing importing `inputs` will create a dependency on jsonschema.
At the same time the `osbuild` package globally imports `Pipeline`,
via `__init__.py`, and `osbuild.api` is used in the runners and
stages, which are run inside the buildroot. If one now wanted to
use `inputs` from `Pipeline`, it would lead to jsonschema being
imported (via `meta`) which might not be available on the build-
root and it is a rather random dependency to have.
On obvious solution would be to use a construct with `TYPE_CHECKING`,
a la:
  if TYPE_CHECKING:
    from .meta import ModuleInfo
Sadly, pylint will now complain about it. This could be fixed with:
  if TYPE_CHECKING:
    from .meta import ModuleInfo
  else:
    ModuleInfo = "osbuild.meta.ModuleInfo"
But this is just gross. So we will have to accept that Python is,
well, Python und omit the type information for the `info` param.
2021-02-06 12:04:30 +01:00
Christian Kellner
eb1d17d8ac input: add references and origin
Currently all options for inputs are totally opaque to osbuild
itself. This is neat from a seperation of concerns point of view
but has one major downside: osbuild can not verify the integrity
of the pipeline graph, i.e. if all inputs that need pipelines or
sources do indeed exists. Therefore intrdouce two generic fields
for inputs: `origin` and `references`. The former can either be
a source or a pipeline. The latter is an array of identifiers or
a dictionary where the keys are the identifiers and the values
are additional options for that id. The identifiers then refer
to either resources obtained via a source or a pipeline that has
already been built.
2021-02-06 12:04:30 +01:00
Christian Kellner
684c408914 inputs: add an id property
Add an `id` property that, like `Stage.id`, can be used to uniquely
identify an input based on its name and options. Two stages with
the same name and options will have the same `id`.
2021-01-22 15:03:19 +01:00
Christian Kellner
9f4861e58f inputs: properly mark name as a property
It was meant to be a property not a function.
2021-01-22 15:03:19 +01:00
Christian Kellner
3bccc82ce9 inputs: introduce new input concept
A pipeline input provides data in various forms to a `Stage`, like
files, OSTree commits or trees. The content can either be obtained
via a `Source` or have been built by a `Pipeline`. Thus an `Input`
is the bridge between various types of content that originate from
different types of sources.

The acceptable origin of the data is determined by the `Input`
itself. What types of input are allowed and required is determined
by the `Stage`.

To osbuild itself this is all transparent. The only data visible to
osbuild is the path. The input options are just passed to the
`Input` as is and the result is forwarded to the `Stage`.
2021-01-18 17:44:46 +01:00