Commit graph

74 commits

Author SHA1 Message Date
Ondřej Budai
cf046fcaeb osbuild: fix stages caching
We never tried to reuse the first stage because the range in the for loop didn't include index zero. This commit fixes that.
2019-09-03 22:11:54 +02:00
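The off-by-one described in cf046fcaeb can be illustrated with a descending `range`: when the stop value is 0, index 0 is never visited, so the first stage is never considered for reuse. This is only a minimal sketch, not the actual osbuild loop; the helper name and the stage names are hypothetical.

```python
def candidate_indices(stage_count, include_first):
    """Return stage indices from last to first, as a caching loop might visit them."""
    # range() excludes its stop value: stopping at 0 skips index 0,
    # stopping at -1 includes it.
    stop = -1 if include_first else 0
    return list(range(stage_count - 1, stop, -1))


stages = ["stage-a", "stage-b", "stage-c"]  # hypothetical stage names
buggy = candidate_indices(len(stages), include_first=False)
fixed = candidate_indices(len(stages), include_first=True)
```

With three stages, the first variant visits only indices 2 and 1, while the second also reaches index 0.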
Tom Gundersen
ba6918f945 osbuild: allow an additional build-pipeline to be prepended
The best practice for creating a pipeline should be to include at least
one level of build-pipelines. This makes sure that the tools used to
generate the target image are well-defined.

In principle one could add several layers, though in practice, one would
hope that the environment used to build the buildroot does not affect the
final image (and as we anyway cannot recurse indefinitely, we fall back
to simply using the host system in this case).

This only makes sense if the contents of the host system truly do not
affect the generated image, and as such we do not include any information
about the host when computing the hash that identifies a pipeline.

In fact, any image could be used in its place, as long as the required
tools are present. This commit takes advantage of that fact. Rather than
run a pipeline with the host as the build root, take a second pipeline
to generate the buildroot, but do not include this when computing the
pipeline id (so it is different from simply editing the original JSON).

This is necessary so we can use the same pipelines on significantly
different host systems (run with different --build-pipeline arguments).
In particular, it allows our test pipelines that generate f30 images
to be run unmodified on Travis (which runs Ubuntu).

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-30 12:00:47 +02:00
Tom Gundersen
679b79c5e5 osbuild: split package into separate files
Import modules between files using the syntax `from . import foobar`,
renaming what used to be `FooBar` to `foobar.FooBar` when moved to a
separate file.

In __init__.py only import what is meant to be public API.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-21 09:56:50 +04:00
Tom Gundersen
9741087e92 Stage/init: fix the order of arguments
Make the order of arguments in line with how it is used (and also
how it is conceptually closer to the pipeline json document).

This makes no practical difference as the two arguments were both
just used for computing the hash.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-13 17:13:13 +02:00
Tom Gundersen
6d7cd1b93c Pipeline: drop the base concept
Each pipeline is now self-contained without references to another.
However, as the final stage in a pipeline is saved to the content
store, we are able to reuse it if one pipeline is the prefix of
another, as described in the previous commit. This makes the
concept of a base redundant.

The ObjectStore must take a directory as argument, never None, so
the conditional assertion for this in Pipeline.run() is ok to
remove.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-13 17:13:13 +02:00
Tom Gundersen
67da9e0cca Pipeline: reuse existing trees if they exist in the object store
Don't do this only for the base, but for any prefix of the current
pipeline.

Note that if two pipelines share a prefix, but one is not the prefix
of another, no sharing is possible. Only a proper prefix can be
reused by another pipeline, as only the result of the last pipeline
is saved to the object store (this restriction could be changed in
the future).

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-13 17:13:13 +02:00
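The prefix reuse described in 67da9e0cca can be sketched as a search for the longest prefix of the pipeline whose result is already stored. This is an illustration under assumptions: the function name, the id strings, and the use of a plain set in place of the object store are all hypothetical.

```python
def longest_cached_prefix(stage_ids, cached):
    """Return how many leading stages can be skipped.

    stage_ids: ids of the stages in pipeline order (hypothetical values).
    cached: set of ids whose resulting tree is already stored.
    """
    # Try the longest prefix first and work down to the first stage.
    for i in range(len(stage_ids) - 1, -1, -1):
        if stage_ids[i] in cached:
            return i + 1
    return 0


# Only the last stage of a previously run pipeline is stored, so "id-b"
# being cached means an earlier pipeline ended exactly at that stage.
skip = longest_cached_prefix(["id-a", "id-b", "id-c"], {"id-b"})
nothing = longest_cached_prefix(["id-a", "id-b", "id-c"], set())
```

Here the first two stages can be skipped and only the third has to run; with an empty store, every stage runs.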
Tom Gundersen
0a9223b6f2 Pipeline: drop the build setter
Take this as an argument to __init__ in the same way that `base`
is.

This avoids us having to deal with the case of someone setting a
stage before the build, which does not work as the stage id will
be wrong.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-13 17:13:13 +02:00
Lars Karlitski
68832d2aaf osbuild: check for right errno from os.rename()
Renaming a directory over an existing one is only an error if the
existing one is not empty, in which case ENOTEMPTY is raised.

Tested with:

    >>> import errno, os
    >>> os.mkdir("foo")
    >>> os.mkdir("bar")
    >>> os.rename("foo", "bar")
    # no error

    >>> os.mkdir("foo")
    >>> open("foo/a", "w").write("a")
    1
    >>> try: os.rename("bar", "foo")
    ... except OSError as e: e.errno == errno.ENOTEMPTY
    ...
    True
2019-08-12 13:06:18 +02:00
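The errno check from 68832d2aaf could be wrapped in a helper along the following lines. This is a sketch, not the code from the commit: the function name is hypothetical, and accepting EEXIST in addition to ENOTEMPTY is an assumption based on POSIX permitting either value.

```python
import errno
import os
import tempfile


def rename_unless_exists(src, dst):
    """Move directory src to dst; return False if dst already has content."""
    try:
        os.rename(src, dst)
    except OSError as e:
        # Linux reports ENOTEMPTY when dst is a non-empty directory
        # (POSIX also permits EEXIST). Anything else is a real error.
        if e.errno not in (errno.ENOTEMPTY, errno.EEXIST):
            raise
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.mkdir(src)
    os.mkdir(dst)
    with open(os.path.join(dst, "a"), "w") as f:
        f.write("a")
    moved = rename_unless_exists(src, dst)
    src_still_there = os.path.isdir(src)
```

Because `dst` is not empty, the rename is refused and `src` is left in place.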
Lars Karlitski
054fea3d83 osbuild: add description() methods
We already allow loading from a description. This adds the opposite
direction to export Pipelines, Stages, and Assemblers.
2019-08-07 10:01:17 +02:00
Tom Gundersen
9371eb9eaa ObjectStore/get_tree: make sure to clean up the context manager
Even if the yield raises an exception, we must always unmount to
clean up.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-02 01:05:47 +02:00
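The cleanup guarantee described in 9371eb9eaa corresponds to wrapping the `yield` of a generator-based context manager in `try`/`finally`. The sketch below records events in a list instead of mounting anything; the function name and the path are hypothetical.

```python
import contextlib


@contextlib.contextmanager
def mounted(path, log):
    """Hypothetical stand-in for a mount: records events instead of mounting."""
    log.append(("mount", path))
    try:
        yield path
    finally:
        # Runs on normal exit and also when the with-body raises.
        log.append(("umount", path))


events = []
try:
    with mounted("/tmp/tree", events):
        raise RuntimeError("stage failed")
except RuntimeError:
    pass
```

Without the `try`/`finally`, the exception thrown into the generator at the `yield` would skip the unmount step.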
Tom Gundersen
dcc9384ba8 Pipeline: add support for a build pipeline
The build pipeline is a sub-pipeline used to generate the build
tree to use rather than the current root directory. This can be
nested arbitrarily deep, but ultimately we will fall back to the
current logic when no build property is found.

Just like the tree after the last stage of a regular pipeline ends
up in the object store, so does currently each build tree (as the
build sub-pipeline really is just a regular pipeline in its own
right). We may want to avoid both these instances of the implicit
storing semantics, and rather make it something the caller opts in
to. However, for now that is left as a future optimization.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-02 00:57:28 +02:00
Tom Gundersen
7c7fcecd47 ObjectStore: add an object store class
This also changes the structure of the object store, though the
basic idea is the same.

The object store contains a directory of objects, which are content
addressable filesystem trees. Currently we only ever use their
content-hash internally, but the idea for this is basically Lars
Karlitski and Kay Sievers' `treesum()`. We may expose this in the
future.

Moreover, it contains a directory of refs, which are symlinks named
by the stage id they correspond to (as before), pointing to an object
generated from that stage-id.

The ObjectStore exposes three methods:
`has_tree()`: This checks if the content store contains the given tree.
If so, we can rely on the tree remaining there.
`get_tree()`: This is meant to be used with a `with` block and yields
the path to a read-only instance of the tree with the given id. If the
tree_id is passed in as None, an empty directory is given instead.
`new_tree()`: This is meant to be used with a `with` block and yields
the path to a directory in which the tree by the given id should be
created. If a base_id is passed in, the tree is initialized with the
tree with the given id. Only when the block is exited successfully
is the tree written to the content store, referenced by the id in
question.

Use this in Pipeline.run() to avoid regenerating trees unnecessarily.
In order to trigger a regeneration, the content store must currently
be manually flushed.

Update the travis test to run the noop pipeline twice, verifying that
the stage is only run the first time.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-01 22:39:52 +02:00
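The three methods described in 7c7fcecd47 can be sketched with plain directories. This is a toy model under stated assumptions, not the real ObjectStore: the class name is hypothetical, objects are named by tree id rather than by content hash, `get_tree()` hands out a temporary copy instead of a read-only instance, and an already existing id is not handled.

```python
import contextlib
import os
import shutil
import tempfile


class SketchStore:
    """Toy store: refs/<tree_id> is a symlink to a directory in objects/."""

    def __init__(self, store):
        self.objects = os.path.join(store, "objects")
        self.refs = os.path.join(store, "refs")
        os.makedirs(self.objects, exist_ok=True)
        os.makedirs(self.refs, exist_ok=True)

    def _resolve(self, tree_id):
        return os.path.realpath(os.path.join(self.refs, tree_id))

    def has_tree(self, tree_id):
        return tree_id is not None and os.path.exists(os.path.join(self.refs, tree_id))

    @contextlib.contextmanager
    def get_tree(self, tree_id):
        # Assumption: a throwaway copy stands in for the read-only instance.
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "tree")
            if tree_id is None:
                os.mkdir(path)  # None yields an empty directory
            else:
                shutil.copytree(self._resolve(tree_id), path, symlinks=True)
            yield path

    @contextlib.contextmanager
    def new_tree(self, tree_id, base_id=None):
        # Scratch space lives inside objects/ so the final rename
        # stays on one filesystem.
        tmp = tempfile.mkdtemp(dir=self.objects)
        try:
            tree = os.path.join(tmp, "tree")
            if base_id is not None:
                shutil.copytree(self._resolve(base_id), tree, symlinks=True)
            else:
                os.mkdir(tree)
            yield tree
            # Only reached when the with-body did not raise.
            os.rename(tree, os.path.join(self.objects, tree_id))
            os.symlink(os.path.join("..", "objects", tree_id),
                       os.path.join(self.refs, tree_id))
        finally:
            shutil.rmtree(tmp, ignore_errors=True)


with tempfile.TemporaryDirectory() as d:
    store = SketchStore(d)
    with store.new_tree("a") as tree:
        with open(os.path.join(tree, "hello"), "w") as f:
            f.write("hi")
    stored = store.has_tree("a")
    try:
        with store.new_tree("b", base_id="a") as tree:
            raise RuntimeError("stage failed")
    except RuntimeError:
        pass
    failed_stored = store.has_tree("b")
    with store.get_tree("a") as tree:
        with open(os.path.join(tree, "hello")) as f:
            content = f.read()
```

The tree "a" is committed because its block exits cleanly, while "b" never reaches the store because its block raises.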
Tom Gundersen
5eaa553563 BuildRoot: require the root directory to be passed in
Rather than hard-coding this to /, let the caller provide the
directory path to use.

In the past, we needed to give special treatment to /, as it had
to be bind-mounted before being used by nspawn, to work around a
check they had, refusing to use the host root in the container.

We no longer pass the directory directly to nspawn, but rather
mount the subdirs we want ourselves, so that no longer applies.

The callers pass in /, so the behavior is unchanged.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-08-01 22:39:52 +02:00
Tom Gundersen
659ce42c83 BuildRoot: don't use nspawn's --volatile mode
We want the same functionality, but we now implement it ourselves.

In addition to bind-mounting in /usr into the target container
(which is all nspawn does), we also add /bin, /sbin, /lib and
/lib64, if they exist and are not symlinks (presumably into
/usr).

This means we can work on distros that have not implemented the
usr-move, like Ubuntu Bionic (used by Travis).

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-31 01:34:31 +02:00
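The selection rule from 659ce42c83 (always `/usr`, plus `/bin`, `/sbin`, `/lib` and `/lib64` when they exist and are not symlinks) can be sketched as follows. The function name is hypothetical, and the sketch only decides which directories qualify; it performs no mounts.

```python
import os
import tempfile


def dirs_to_bind_mount(root):
    """Return the top-level directories of root that would get a bind mount."""
    result = ["usr"]
    for name in ("bin", "sbin", "lib", "lib64"):
        path = os.path.join(root, name)
        # Skip missing directories and usr-move symlinks such as lib -> usr/lib.
        if os.path.isdir(path) and not os.path.islink(path):
            result.append(name)
    return result


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "usr", "lib"))
    os.mkdir(os.path.join(root, "bin"))               # real directory
    os.symlink("usr/lib", os.path.join(root, "lib"))  # usr-move style symlink
    selected = dirs_to_bind_mount(root)
```

In this fake root, `bin` is a real directory and is selected, `lib` is a symlink into `usr` and is skipped, and `sbin` and `lib64` do not exist.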
Tom Gundersen
98ce5a7595 TmpFs: do not mount in __init__
The underlying filesystem was mounted in __init__ and unmounted in
__exit__/__del__. This meant that if the same object was reused in
several `with` clauses, only the first one would work as intended.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-25 23:55:43 +02:00
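The reuse problem described in 98ce5a7595 goes away when the resource is acquired in `__enter__` rather than in `__init__`. The class below is a hypothetical stand-in that flips a flag instead of mounting a tmpfs.

```python
class Resource:
    """Hypothetical stand-in for TmpFs: tracks state instead of mounting."""

    def __init__(self):
        self.active = False  # nothing is acquired at construction time
        self.uses = 0

    def __enter__(self):
        self.active = True   # acquire here, so every `with` gets a fresh mount
        self.uses += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.active = False  # release; returning None lets exceptions propagate


r = Resource()
states = []
for _ in range(2):
    with r:
        states.append(r.active)
```

Both `with` blocks see an active resource, which would not hold if the acquisition happened only once in `__init__`.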
Tom Gundersen
79b2f37cbc loop: add direct-io support
Support the LOOP_SET_DIRECT_IO ioctl, which allows us to control
whether or not a loopback device should perform its own buffering
or rely on the one done by the underlying backing file.

Enabling this should improve both throughput and memory consumption,
but it is not currently hooked up, as more testing would be required.
2019-07-25 23:55:43 +02:00
Lars Karlitski
5b50dec8c5 osbuild: add -l/--libdir parameter
Stop guessing if we're in the source directory by looking if a `stages`
subdirectory exists. Instead, assume that osbuild is installed on the
host.

If `--libdir` is given, mount the libdir into `/run/osbuild/lib` (alas,
we can't overwrite `/usr/libexec/osbuild`) and run osbuild from there.
Thus, running from source must now be done like this:

    # python3 -m osbuild --libdir . [other args]
2019-07-24 12:55:48 +02:00
Martin Sehnoutka
e23fdb2b45 move stages and assemblers into /usr/libexec/ 2019-07-23 20:21:53 +02:00
Lars Karlitski
1e92e56b49 osbuild: remove systemResourcesFromEtc
It is a kludge that doesn't fit into osbuild's model. It's also not
necessary for any hacks anymore.
2019-07-19 13:31:49 +02:00
Tom Gundersen
96ea4e5698 BuildRoot: do not register with systemd-machined
This really only makes sense if we are running systemd as PID1
inside the container, but we are not booting a system, just using
it as a glorified chroot.

This means entering the namespaces from the outside will be a bit
more cumbersome, but that was not used much and was never reliable
to begin with.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 01:27:17 +02:00
Tom Gundersen
670d51a746 BuildRoot: drop unused device permissions
We only need permissions for loop devices, not for loop-control
and not for partitions.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
7274847711 Assembler: no longer mount devtmpfs in the container
Move the only assembler that relied on this to use LoopClient instead.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
d855c6c35e Assembler: add the remoteloop API to the assembler build roots
Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
c124ab264b loop: add helpers to use IPC to create loop devices
loop.py is a simple wrapper around the kernel loop API. remoteloop.py
uses this to create a server/client pair that communicates over an
AF_UNIX/SOCK_DGRAM socket to allow the server to create loop devices
for the client.

The client passes a fd that should be bound to the resulting loop
device, and a dir-fd where the loop device node should be created.
The server returns the name of the device node to the client.

The idea is that the client is run from within a container without
access to devtmpfs (and hence /dev/loop-control), and the server
runs on the host. The client would typically pass its (fake) /dev
as the output directory.

For the client this will be similar to `losetup -f foo.img --show`.

[@larskarlitski: pylint: ignore the new LoopInfo class, because it
only has dynamic attributes. Also disable attribute-defined-outside-init,
which (among other problems) is not ignored for that class.]

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
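The descriptor passing that c124ab264b relies on can be sketched with `SCM_RIGHTS` ancillary data over an `AF_UNIX`/`SOCK_DGRAM` socket pair. This is not the remoteloop protocol: the helper names and the `create-loop` message are hypothetical, and the step where the server binds the descriptor to a loop device is left out because it needs root.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, msg, fd):
    """Send msg with one file descriptor attached as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    sock.sendmsg([msg], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock, bufsize=4096):
    """Receive a message and the single file descriptor attached to it."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(bufsize, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return msg, fds[0]


client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
with tempfile.TemporaryFile() as image:
    image.write(b"disk image bytes")
    image.flush()
    send_fd(client, b"create-loop", image.fileno())
    request, received = recv_fd(server)
    # The received descriptor refers to the same open file as the sender's.
    os.lseek(received, 0, os.SEEK_SET)
    data = os.read(received, 100)
    os.close(received)
client.close()
server.close()
```

The receiving side can read the file through the passed descriptor even though it never opened the file itself, which is what lets a server on the host act on an image owned by a client inside a container.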
Tom Gundersen
5d5766a98a BuildRoot: include the osbuild module in the containers
This way we have access to all our helper libraries from the stages
and assemblers.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
e2dc0c5081 BuildRoot: add support for an API sockets dir
Add a directory to each BuildRoot potentially containing a set of
sockets. Also add a helper to create a named bound socket in a given
BuildRoot.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
18ebce3016 Stage: don't bind-mount devtmpfs into the stage buildroot
We only use devices in Assemblers so this is not needed.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Tom Gundersen
8994e7f803 libdir: allow osbuild to be built from the tree
This was a regression from when osbuild was made into a module.
Restore the old behavior.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-19 00:51:12 +02:00
Martin Sehnoutka
b218430bfa add __main__ file for convenient CLI usage 2019-07-18 10:17:39 +02:00
Martin Sehnoutka
23680736bb create osbuild package 2019-07-18 10:17:39 +02:00
Lars Karlitski
781fc73176 osbuild.py: make StageFailed and AssemblerFailed consistent
This way they can both be used as ducks in an exception handler.
2019-07-17 10:36:13 +02:00
Lars Karlitski
baa5da6abe osbuild.py: clean up Pipeline API
Separate deserialization into a new `osbuild.load()` function.

Pass objects to `Pipeline.run()` instead of keeping it in an attribute
on the pipeline.
2019-07-17 10:36:13 +02:00
Martin Sehnoutka
a837cc5e82 make /run/osbuild in the pipeline class 2019-07-11 12:34:32 +02:00
Tom Gundersen
65151e22ff osbuild.py: assign ids to stages rather than to pipelines
Compute a hash based on the content of a stage, together with the
hash of its parent stage.

The output of a pipeline is saved by the id of the last stage.

This is largely equivalent to the current logic, where it is the
pipeline that contains the id, but this means that the ids are
independent of how pipelines are split: the only thing that matters
is the sequence of stages, not whether or not they are in one or
several interdependent pipelines.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-09 12:41:26 +02:00
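The chaining described in 65151e22ff (a stage id is a hash of the stage content together with the id of its parent stage) can be sketched as below. The choice of SHA-256, the JSON encoding, the function names, and the stage descriptions are assumptions for illustration only.

```python
import hashlib
import json


def stage_id(stage, parent_id):
    """Hash a stage description together with the id of its parent stage."""
    m = hashlib.sha256()
    m.update(json.dumps(stage, sort_keys=True).encode())
    m.update(json.dumps(parent_id).encode())  # None for the first stage
    return m.hexdigest()


def stage_ids(stages):
    """Return the id of every stage in a sequence, in order."""
    ids, parent = [], None
    for stage in stages:
        parent = stage_id(stage, parent)
        ids.append(parent)
    return ids


# Hypothetical stage descriptions.
a = {"name": "org.example.a", "options": {}}
b = {"name": "org.example.b", "options": {"x": 1}}
c = {"name": "org.example.c", "options": {}}

short = stage_ids([a, b])
long = stage_ids([a, b, c])
swapped = stage_ids([b, a])
```

The ids of `[a, b]` are the first two ids of `[a, b, c]`, so the ids depend only on the sequence of stages and not on how that sequence is split across pipelines.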
Tom Gundersen
cebed27cd9 osbuild: drop the concept of an input_dir
This removes the possibility of passing in arbitrary input data. We
now restrict ourselves to explicitly specified files/directories or
a base tree given by its pipeline id.

This drops the tar/tree stages/assemblers, as the tree/untree ones
are implicit in osbuild, and if we wish to also support compressed
trees, then we should add that to osbuild core as an option.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-03 13:11:37 +02:00
Tom Gundersen
e607053c32 osbuild.py/pipeline: add the concept of a content store
Whenever an assembler is not specified, the output tree is instead
saved to the content store, in a directory named after the pipeline
id.

This should render the io.weldr.tree assembler redundant.

In order to build the samples as before, specify the content store
as the input directory to build any pipeline that uses the
io.weldr.untree stage.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-03 13:11:37 +02:00
Tom Gundersen
f25cffa151 osbuild.py/pipeline: assign a unique id to every pipeline
This uniquely identifies a pipeline based on its content. Pipelines
are considered equal modulo whitespace and the order of object
elements.

The intention is that two runs of a pipeline with the same id
generate functionally equivalent ids. It is up to the writers
of stages and pipelines to ensure this property holds.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-03 13:11:37 +02:00
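An id that treats pipelines as equal modulo whitespace and key order, as described in f25cffa151, can be sketched by hashing a canonical JSON serialization. SHA-256, the function name, and the sample documents are assumptions; the commit message does not say how the id is computed.

```python
import hashlib
import json


def pipeline_id(text):
    """Hash a JSON pipeline description independent of whitespace and key order."""
    # Parsing drops insignificant whitespace; sort_keys fixes the key order.
    canonical = json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


# Hypothetical pipeline documents.
one = '{"stages": [{"name": "a", "options": {"x": 1, "y": 2}}]}'
two = '{\n  "stages": [ {"options": {"y": 2, "x": 1}, "name": "a"} ]\n}'
three = '{"stages": [{"name": "a", "options": {"x": 1, "y": 3}}]}'

same = pipeline_id(one) == pipeline_id(two)
different = pipeline_id(one) != pipeline_id(three)
```

The first two documents differ only in layout and key order and therefore share an id, while changing an option value produces a different id. The order of list elements still matters, since the stage sequence is significant.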
Tom Gundersen
1219f1dc55 osbuild.py: introduce a Pipeline class
This is not a functional change, it is just a wrapper class around the
pipeline state.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-07-03 13:11:37 +02:00
Lars Karlitski
767b249b2d osbuild: move run() into osbuild.py
This allows for running a pipeline from python and for non-interactive
mode, in which all output is captured.
2019-07-01 17:01:26 +02:00
Martin Sehnoutka
1d7f2461a0 make named parameters mandatory 2019-06-21 15:44:40 +02:00
Tom Gundersen
28fd21ba40 osbuild: allow empty output dir
We wanted to force an empty output dir to avoid assembly stages using
previous output when creating their new one, and hence creating
dependencies between osbuild runs. We may still do that, but for now
let's remove the restriction as it seems rather arbitrary to protect
people from themselves to this extent.

Signed-off-by: Tom Gundersen <teg@jklm.no>
2019-06-19 16:07:43 +02:00
Lars Karlitski
2d487fe685 osbuild.py: add systemResourcesFromEtc key
Some programs have resources in /etc (for example, /etc/pki or grub's
config scripts). Give stages a way to access these.
2019-06-14 20:29:14 +02:00
Lars Karlitski
bd87038210 osbuild.py: separate tree from build root
There's no reason to conflate the two. This allows us to build on
something other than a tmpfs.
2019-06-14 19:54:42 +02:00
Lars Karlitski
ce0b01e93d osbuild: remove --sit
It's not really useful because it's at the wrong place, after a stage
has torn down all mounts. It also makes the code more complex for too
little benefit.
2019-06-14 18:38:13 +02:00
Lars Karlitski
2dbd177b0f osbuild.py: add BuildRoot.run_assembler()
This is the canonical way to run an assembler.

Also improve error handling by introducing a StageFailed exception.
2019-06-13 21:07:23 +02:00
Lars Karlitski
b36c8135ae osbuild: split BuildRoot into a reusable module 2019-06-13 20:01:53 +02:00
Lars Karlitski
abca9d7b03 osbuild: mount build root and tree in /run/osbuild 2019-06-13 19:30:57 +02:00
Lars Karlitski
e7b8f757d4 osbuild: introduce libdir
Run stages and the runner from a libdir, which is either $prefix/lib or
the current directory.
2019-06-13 19:30:19 +02:00
Lars Karlitski
48f8a7fc2a osbuild: pass arguments to main() explicitly 2019-06-13 16:16:33 +02:00
Lars Karlitski
0178cce4ee osbuild: make --input and --output absolute paths 2019-06-13 16:04:42 +02:00