Commit graph

310 commits

Author SHA1 Message Date
Christian Kellner
d9ae219e19 api: transfer metadata context via fd
Metadata information can easily become very big, like in the case
of the package metadata of the org.osbuild.rpm stage, quite likely
exceeding the configured maximum package length of the underlying
socket. To avoid potential issues here, transfer the actual data
by writing it to a temporary file and sending a open fd over.
2020-10-22 22:47:22 +01:00
Christian Kellner
e919f66609 pipeline: use osrelease.DEFAULT_PATHS
Use the newly defined constant that contains the well known paths
for where to look for `os-release` file.
2020-10-21 11:13:28 +02:00
Christian Kellner
807090f4c8 pipeline: introduce detect_host_runner helper
Extract the existing code that creates the runner for the host
build container into a small helper method, so it can be re-used
in other places, like the tests.
2020-10-21 11:13:28 +02:00
Christian Kellner
3010d247ea util/osrelease: add default os-release paths
Add a new `DEFAULT_PATHS` constant, a list of all well known paths
where `os-release` can be found, as per os-release(5).
2020-10-21 11:13:28 +02:00
Christian Kellner
aaa51e22a6 api: properly serialize the exception's traceback
Use `traceback.print_tb()` to serialize the exceptions' backtrace.
The previously used expression `str(e.__traceback__)` will just
give `<traceback object at 0x…>`, which is not very helpful.
Add a test to check that the method name that raises the exception,
also called `exception`, is in the traceback.
2020-10-09 10:47:44 +02:00
Christian Kellner
f8de164413 api: properly encode exception type
When using `str(type(exception))` this ends up to be something like
`<class 'ValueError'>` for a `ValueError` exception. Get the vanilla
name of the exception type via `type(exception).__name__`.
Add a test to ensure that we encode this properly.
2020-10-09 10:47:44 +02:00
Christian Kellner
f5d00dd043 api: use more generic error member for exceptions
Rename the `API.exception` member to `API.error`, to make it more
generic, so it can also be used for other sort of errors in the
future. Also add a layer of additional structure with `type` and
`data` members so different types of errors apart. Currently only
`exception` is used.
Adapt the tests in test/mod/test_api.py to check for the new
structure and its content.
2020-10-09 10:47:44 +02:00
Christian Kellner
78d72eded9 api: whitespaces fixes
No semantic change, just more spaces between functions to make it
more PEP-8 compliant.
2020-10-09 10:47:44 +02:00
Christian Kellner
cbcb335b3e osbuild: fix spelling mistakes found by codespell
Run codespell on the source ('codespell -f -L msdos -S coverity
-S rpmbuild -S samples') and fix all uncovered mistakes.
2020-10-06 14:41:00 +02:00
Chloe Kaubisch
5dc5ddcf29 api: add exception endpoint
Create a new api endpoint called exception, that communicates
exception backtraces separately back to osbuild, as opposed to
dumping them into the normal log. Additionally, add a corresponding
test to check that a call to api.exception correctly sets
API.exception.
2020-10-02 17:49:45 +02:00
chloenayon
01aae91949 api: remove setup_stdio
API.setup_stdio was replaced in PRs 506 and 507,
remove setup_stdio functions and call sites.
2020-09-09 12:52:50 +02:00
chloenayon
b1229de56e pipeline: unify object exporting
Remove output.export and associated logic in pipeline.assemble.
Instead, return output or None, and export only once in pipeline.run.
2020-09-02 17:54:11 +02:00
Christian Kellner
499ae1654e osbuild: replace api.setup_stdio with BuildRoot
Now that the BuildRoot is capable of capturing the output of the
runner and modules (stages, assemblers), there is no need for
using `api.setup_stdio`. Therefore, drop it from all runners and
replace `api.output` with `BuildRoot.output`, which will contain
the output if `api.setup_stdio` is not called from the runners.
2020-08-31 15:06:36 +02:00
Christian Kellner
10579ee6f5 buildroot: return a new CompletedBuild with output
Create a new CompletedBuild object that wraps and is very similar
to the subprocess.CompletedProcess, i.e. it has a process member
but also has shortcuts for returncode. Additionally, the output
of the process is not only forwarded to the monitor, but also
captured and then handed to CompletedBuild, so its output member
will actually contain the full build output. To be compatible
with the previously returned CompletedProcess, `stderr`, `stdout`
members exist on CompletedBuild that also return `output`.
2020-08-31 15:06:36 +02:00
Christian Kellner
96a5499ed9 buildroot: log bubblewrap's output
In case that bubblewrap fails to, e.g. because it fails to execute
the runner, it will print an error message to stderr. Currently,
this output is not capture and thus not logged. To fix that, the
`BuildRoot.run` method now takes a monitor object and will stream
stdout/stderr to the log via the monitor.
2020-08-27 08:07:14 +02:00
chloenayon
3bf5d26c7a pipeline: replace objectstore logic with get call
In pipeline.run, replace calls to objectstore.contains
and objectstore.new with a call to objectore.get, which
has the same functionality.
2020-08-26 15:10:12 +02:00
chloenayon
35fa429965 objectstore: get returns object not path
Change objectstore.get to return an object or None instead of a path.
2020-08-26 15:10:12 +02:00
Christian Kellner
e273dd0084 api: add 'get-arguments' call and client method
Add a new `get-arguments` API call to fetch the input/arguments.
To avoid running into any limitings on maximum package size on
the socket, the actual data is written to a temp file and a fd
to that passed to the client - very much as in `setup_stdio`.

Additionally, new `arguments` method is provided as a client
counterpart for the new API call.
2020-08-25 18:51:55 +02:00
David Rheinsberg
803433fb62 api: prevent early output retrieval
Change the API endpoint to prevent retrieving monitor-output from a
running instance. Instead, we require the caller to exit the API context
before querying the monitor-output. This guarantees that the api-thread
was synchronously taken down and scheduled any outstanding events.

This fixes an issue where a side-channel notifies us of a buildroot
exit, but the api-thread has not yet returned from epoll, and thus might
not have dispatched pending I/O events, yet. If we instead wait for the
thread to exit, we have a synchronous shutdown and know that all
*ordered* kernel events must have been handled.

In particular, imagine a build-root program running (like `echo` in the
test_monitor unittest) which writes data to the stdout-pipe and then
immediately exits. The syscall-order guarantees that the data is written
to the pipe before the SIGCHLD is sent (or wait(2) returns). However, we
retrieve the SIGCHLD from our main-thread usually (p.join() in our test,
and BuildRoot() in our main code), while the pipe-reading is done from
an API thread. Therefore, we might end up handling the SIGCHLD first
(just imagine a single-threaded CPU that schedules the main task before
the thread). To avoid this race, we can simply synchronize with the
api-thread. Since we already have this synchronization as part of the
api-thread takedown, it is as simple as stopping the api-thread before
continuing with operations.

Lastly, if a write operation to a pipe was issued, we are guaranteed
that a SIGCHLD synchronization across processes is ordered correctly.
Furthermore, the python event-loop also guarantees that stopping an
event-loop will necessarily dispatch all outstanding events. A read is
guaranteed to be outstanding in our race-scenario, so the read will be
dispatched. The only possible problem is `_output_ready()` only
dispatching a maximum of 4096 bytes. This might need to be fixed
separately. A comment is left in place.
2020-08-13 14:02:27 +02:00
Christian Kellner
42b20638c0 pipeline: add metadata to the build result
Include metadata, optionally set by modules, in the build result.
2020-08-13 10:50:34 +02:00
Christian Kellner
fb3a0c5982 api: add support for metadata
Add support for setting metadata via `osbuild.API`. It is meant
to be used by modules (stages, assemblers) to pass additional data
that belong to the result back to osbuild. For this, a new api
method `set-metadata` can be used to set and update a metadata
dictionary on the `osbuild.API` class. A client side method
`metadata` is provided to do so.
2020-08-13 10:50:34 +02:00
Christian Kellner
41cf4bf2d3 buildroot: ensure /sys/fs/selinux is read-only
Make sure "/sys/fs/selinux" is read-only, otherwise libselinux and
tools will assume that SELinux is available and active and in turn
use /sys/fs/selinux to e.g. verify the file systems labels; this
will then prevent setting unknown labels via `setfiles`.
2020-08-12 16:52:27 +02:00
chloenayon
fdaa2e1a66 osbuild: require output_directory
Make the output_directory argument in Pipeline.assemble
and Assembler.run required. The qemu assembler assumes
it is passed in args and will crash without it. Making
it mandatory prevents this.
2020-08-07 20:39:14 +02:00
Major Hayden
c0d71c3fa1 monitor: add assembler/stage duration
Allow a user to see the duration for each step in the osbuild pipeline.

This allows a user to optimize the build system for the best performance
and identify performance bottlenecks.

Signed-off-by: Major Hayden <major@redhat.com>
2020-08-06 16:19:47 +02:00
chloenayon
1e3c0aea1b osbuild: unified libdir handling
Change the default of libdir to /usr/lib/osbuild and
remove redundant logic. Additionally, change how the
python package is detected.

Instead of checking if libdir is None, check if
/usr/lib/osbuild is empty - i.e. if the user has specified
a different directory than the default.
2020-08-04 09:02:22 +02:00
Christian Kellner
c5925fd185 buildroot: unshare the network
Run the container in a new network namespace, to isolate the host's
network from that of the container. Stages, assemblers and the tools
they execute are not supposed to assume network access is available
and this isolation will make sure of that.
2020-07-29 02:16:20 +01:00
Christian Kellner
785f843901 jsoncomm: remove destination from send
Now that jsoncomm.Socket is using a connection-oriented socket,
the destination in `socket.sendmsg` is ignored and thus can and
should be dropped from the `jsoncomm.Socket.send` method.
Adjust the tests accordingly.
2020-07-29 02:16:20 +01:00
Christian Kellner
8388a64ffa api: remove 'addr' param from message dispatcher
Now that jsoncomm is using a connection oriented protocol, the
`addr` parameter is not needed[*] and can thus be removed from
the `BaseAPI._message` message dispatcher. Adapt all usages
of it, including the tests.

[*] sendmsg ignores the destination parameter for connection
oriented sockets.
2020-07-29 02:16:20 +01:00
Christian Kellner
abbdf06ba5 jsoncomm: switch to use sequenced-packet sockets
Switch to use a connection oriented datagram based protocol, i.e.
`SOCK_SEQPACKET`, instead of `SOCK_DGRAM`. It sill preserves
message boundaries, but since it is connection oriented the client
nor the server do not need to specify the destination addresses
of the peer in sendmsg/recvmesg. Moreover, the host will be able
to send messages to the client, even if the latter is sandboxed
with a separate network namespace. In the `SOCK_DRAM` case the
auto-bound address of the client would not be visible to the host
and thus sending messages would to it would fail.

Adapt the jsoncomm tests as well as `BaseAPI`.
2020-07-29 02:16:20 +01:00
Christian Kellner
fcf3ec4502 jsoncomm: add connection oriented methods
Implement `accept` and `listen`, that call the equivalent methods
on the underlying socket; this prepares the move to a connection
oriented socket, i.e. `SOCK_SEQPACKET`.
2020-07-29 02:16:20 +01:00
Christian Kellner
2da98a57d7 jsoncomm: add blocking property to Socket
Add a new `blocking` property to get and set the blocking state
of the underlying socket. In Python this is tied to the timeout
setting of the `socket.Socket`, i.e. non-blocking means having
any timeout specified, including "0" for not waiting at all.
Blocking means having a timeout value of `None`.
The getter is emulating the logic of `Socket.getblocking`, which
was added in 3.7, and we need to stay compatible with 3.6.
The logic is implemented in `Modules/socketmodule.c` in Python.
2020-07-29 02:16:20 +01:00
Christian Kellner
ac03c257b9 jsoncomm: remove unnecessary brackets
FdSet does derive directly and only from `object`. Not specifying
any base classes is the same as specifying an empty list of base
classes; therefore get rid of the empty list.
2020-07-27 12:50:38 +01:00
Christian Kellner
630f73ba01 api: close file descriptor set in _dispatch
Make sure file descriptors are never leaked by closing them after
the `_message` method invocation. Clients that want to hold on to
fds past the scope of the method should use `FdSet.steal` to
extract those.
Adapt the `LoopServer`'s `_message` implementation accordingly.
2020-07-27 12:50:38 +01:00
Christian Kellner
0203dc4ccd buildroot: ensure rundir and vardir exist
The `BuildRoot` wants to create temporary directories in two
locations, `rundir` (supplied as `path`) and `vardir`. Make
sure these directories exist before trying to create temporary
directories in them.
2020-07-27 12:50:38 +01:00
Christian Kellner
b2d406f941 api: declare BaseAPI._message to be abstract
Now that all API providers are converted to use the high level
dispatcher, make the implementation of that mandatory by declaring
it an abstract method.
2020-07-27 12:50:38 +01:00
Christian Kellner
f8514e782c sources: use high level message dispatcher
Use the new `BaseAPI._message` high level message dispatcher that
is more convenient to use.
2020-07-27 12:50:38 +01:00
Christian Kellner
93c1be4c5f remoteloop: use high level message dispatcher
Use the new `BaseAPI._message` high level message dispatcher that
is more convenient to use.
2020-07-27 12:50:38 +01:00
Christian Kellner
ae27e0ccc6 api: use high level message dispatcher in API
Use the new `BaseAPI._message` high level message dispatcher that
is more convenient to use.
2020-07-27 12:50:38 +01:00
Christian Kellner
aa07c5ec82 api: add high level message dispatcher
Availability of new incoming data is indicated to clients, i.e.
deriving classes, by invoking the `_dispatch` method, with the
`jsoncomm.Socket` as argument. All clients then need to call
`Socket.recv` to actually receive the data.
Provide a new high-level message dispatcher class by providing
a standard implementation of `_dispatch` in `BaseAPI` that calls
`socket.revc` and then invokes the new high level `_message`
method, with the data (`msg`), file descriptors (`fds`, if passed)
the socket (`sock`) and the peer address `addr`.
2020-07-27 12:50:38 +01:00
Christian Kellner
aebff47908 buildroot: remove api temporary directory
Now that no API provider is using the temporary directory to place
its sockets in it anymore, this directory can be removed.
2020-07-27 12:50:38 +01:00
Christian Kellner
0c7284572e osbuild: auto-generate socket addresses for APIs
Rely on the ability of `BaseAPI` to auto-generate socket addresses
when no one was provided. The `BuildRoot` does not rely on the
sockets being created in the `BuildRoot.api` directory anymore and
will instead bind-mount each individual socket address to the well
known location via the `BaseAPI.endpoint` identifier.
Convert all API providers to take the `socket_address` as an
optional keyword argument.
2020-07-27 12:50:38 +01:00
Christian Kellner
6f26b49b9f api: ability to auto-generate socket addresses
Make the `socket_address` argument to `BaseAPI` optional, i.e.
allow it to be `None`. In that case, create a temporary directory
and place the socket, named with the value of `endpoint`, in that
directory. On context exit, the directory is cleaned up. As long
as the jsoncomm.Socket server is running, `socket_address` will
always be valid and indicating the address of the server.
2020-07-27 12:50:38 +01:00
Christian Kellner
28947f3bae util/jsoncomm: support PathLike
Add support for `util.types.PathLike` paths for socket addresses,
instead of just plain strings. Test it by using pathlib.Path to
create the address in the corresponding test.
2020-07-27 12:50:38 +01:00
Christian Kellner
0aa44c23bb objectstore: use types.PathLike
Use the new `types.PathLike`, which is exactly the type that this
module defined too.
2020-07-27 12:50:38 +01:00
Christian Kellner
2fd83ac90d util: add types module defining PathLike type
Add a simple new module meant to define types that are commonly
used throughout the code-base. For starters, define `PathLike`
meant to represent file system paths, i.e. strings, bytes, or
anything that provides the `os.PathLike` protocol, i.e. that
can be used with `os.fspath`.
2020-07-27 12:50:38 +01:00
Christian Kellner
21a60324bc buildroot: bind mount individual API endpoints
The current way API end points, i.e. sockets for API providers,
are provided to the sandbox is via a temporary directory that
is created by `BuildRoot` which later gets bind-mounted to a well
known path, i.e. /run/osbuild/api inside the sandbox. API providers
are expected to create their socket in that temporary directory.

Now that `BuildRoot` has a `regsiter_api` method and each API has
an `endpoint` property, the socket of each API provider, no matter
where it is located, will get bind-mounted individually inside
the sandbox at /run/osbuild/api using the `endpoint` identifier.

For backwards compatibility reasons the temporary api directory
will still be created by `BuildRoot`, but it is no longer bind
mounted inside the container. This paves the way to remove that
directory completely once all API providers are converted to not
use that directory anymore.
2020-07-27 12:50:38 +01:00
Christian Kellner
bc81e68727 api: each API defines its 'endpoint' name
Add a new abstract class property to `BaseAPI` called `endpoint`,
meant to be implemented by deriving classes in order to identify
the end point name for the API provider.
Implement the new property in all existing API providers.
2020-07-27 12:50:38 +01:00
Christian Kellner
144019a40c pipeline: use buildroot.regsiter_api
Register all API end point providers with the `BuildRoot` via the
new `BuildRoot.register_api` call. The context management is now
done via the `BuildRoot` itself.
2020-07-27 12:50:38 +01:00
Christian Kellner
03c5cfb37e buildroot: ability to register api endpoints
Add a new `register_api` method that is meant to be used by clients
to register API end point providers, i.e. instances of `api.BaseAPI`.
When the context of the `BuildRoot` is enter, all providers are
activated, i.e. their context is entered. In case `regsiter_api` is
called with an already active context, the provider will immediately
be activated. In both cases their lifetime is thus bound to the
context of the `BuildRoot`. This also means that they are cleaned-up
with the `BuildRoot`, i.e. when its context is exited.
2020-07-27 12:50:38 +01:00
Christian Kellner
58e9211d71 api: implement canonical setup_stdio method
The `api.API` provides a `setup-stdio` method, that is meant to
be used by clients to replace their stdio with the supplied fds
from the server. Provide a canonical `api.setup_stdio` method
that will do exactly that.
2020-07-27 12:50:38 +01:00