Change the assembler-commit to be conditional on checkpoints, just like
we already do for stages. This means, assembler output is not
automatically committed, but only if you requested so via a checkpoint.
With this in place we can start sharing caches in osbuild-composer. The
only thing in the cache will be sources as well as checkpointed stages.
We can start checkpointing known pipelines and thus make use of the
cache. Furthermore, we can cache sources, as long as we do not fetch an
unbound set of RPMs. However, our RPM set is currently static, so this
should not be an issue. Nevertheless, it is up to Composer to decide
when to enable the cache.
We no longer need the `tree_id` in the osbuild output. All callers have
been converted to use other means. Drop the ID from the output and
avoid exposing our internals.
Now that no caller requires the "output_id" anymore, drop it from our
results-dictionary. Instead, pass the output-directory through and copy
outputs where we produce / fetch them.
This still uses `objectstore.resolve_ref()`, since we do not have the
outputs pinned at the places where we want to copy. This needs a little
bit more rework, but we might just delay that until we have the cache
rework landed.
This already simplifies the output-directory path and drops the slight
hack which checked very late for produced outputs.
Note that we must be careful not to copy things too early, because we
do not want remnants in the output-directory if we return failure.
Hence, keep the copy-operation close to the commit-operation on the
store.
All callsites of `Pipeline.assemble()` already check early whether the
output-object exists in the store and then return it. Checking again in
`assemble()` will never catch anything (unless another stage would
happen to produce the same ID as the assembler as a side-effect).
It does seem useful to keep the shortcuts in `assemble()`, so other
callers would get the shortcut as well. However, this does not really
work well right now, since you want to skip the stage-compilation as
well, and `assemble()` is really just the last step of the job. Hence,
it really is the job of the pipeline-executor to check early.
With that in mind, lets drop this fast-path which has no effect in the
current setup.
We want to get rid of `tree_id` and `output_id` because the they
are now considered internals of the store and clients should not
use them directly. NB: they are still there indirectly as the id
of the last stage and the assembler.
Also, the `output_id` was never correct here, because it was the
`tree_id` as well. Ups.
Make sure to verify that the pipeline actually produced any output
before attempting to copy it out. This fixes osbuild running with
`--output-directory` but without assembler.
The idea is that source can themselves spawn other modules, esp.
new secrets modules. For this they need to know the library dir,
aka 'libdir' throughout the osbuild source. Therefore change the
SourceServer to directly get the library directory instead of
just the sub-directory to the sources. Then pass the library
directory to via the JSON API to the source.
Adjust all usage of the SourceServer, including the tests.
Drop the `kwargs` forwarding from buildroot.run() to subprocess.run().
We do not use it other than for `stdin=subprocess.DEVNULL`. Set that
option directly instead.
Doing the kwargs forwarding mixes the argument namespaces and is very
hard to read. It is not clear from the call-site which argument goes to
buildroot.run() and which to subprocess.run().
Lastly, it requires us to manually fetch `check` just to make pylint
happy. Lets just drop this dance and make the API explicit.
Drop the --build-env command-line argument. It is not used by anything.
Furthermore, our manifests now allow embedding build-environments, so
there is little reason to continue supporting this.
Add a option to all description methods to include the respective
ids in the description. Defaults to False to preserve the original
output which is used in the tests.
We want to run stages and other scripts inside of the nspawn containers
we use to build pipelines. Since our pipelines are meant to be
self-contained, this should imply that the build-root must have osbuild
installed. However, this has not been the case so far for several
reasons including:
1. OSBuild is not packaged for all the build-roots we want to support
and thus we have the chicken-and-egg problem.
2. During testing and development, we want to support using a local
`libdir`.
3. We already provide an API to the container. Importing scripts from
the outside just makes this API bigger, but does not change the
fact that build-roots are not self-contained. Same is true for the
running kernel, and probably much more..
With all this in mind, our strategy probably still is to eventually
package osbuild for the build-root. This would significantly reduce our
API exposure, points-of-failure, and host-reliance. However, this switch
might still be some weeks out.
With this in mind, though, we can expect the ideal setup to have a full
osbuild available in the build-root. Hence, any script we import so far
should be able to access the entire `libdir`. This commit unifies the
libdir handling by installing the symlinks into `libdir` and providing
a single bind-mount of the module-path into `libdir`.
We can always decide to scratch that in the future when we scratch the
libdir-import from the host-root. Until then, I believe this commit
nicely unifies the way we import the module both in a local checkout as
well as in the container.
Add a new output-directory argument which specifies where to store
result objects. For now, this is purely optional and simply copies from
the old `output_id` into the specified directory. This allows a
backwards compatible transition towards removing any external access to
the osbuild cache.
Note that this has still lots of room for improvements:
* We only support assembler-output for now, but we could also easily
support entire trees as output, in case no assembler was selected.
Alternatively, we could introduce a "copy" assembler, that just
outputs the input tree.
* This parameter is optional, but should really be mandatory. There
is little reason to have the default behavior just dropping any
generated content. This would be a breaking change, though.
* We could move data out of a temporary object-store entry, rather
than copy it. But again, for backwards-compatibility, we leave the
latest store-object intact and do not move things out of it.
* We could now transition towards never committing anything to the
store, not even output IDs, unless explicitly checkpointed.
Causes a problem with ostree-osbuild on CI (travis) otherwise:
Traceback (most recent call last):
File "osbuild-ostree", line 345, in <module>
sys.exit(main())
File "osbuild-ostree", line 337, in main
return build(args)
File "osbuild-ostree", line 257, in build
output_id, commit_id = build_commit(builddir, args)
File "osbuild-ostree", line 162, in build_commit
r = pipeline.run(store.store,
File "/home/travis/build/gicmo/ostree-osbuild-demo/osbuild/osbuild/pipeline.py", line 358, in run
r = self.assemble(object_store,
File "/home/travis/build/gicmo/ostree-osbuild-demo/osbuild/osbuild/pipeline.py", line 314, in assemble
r = self.assembler.run(input_dir,
File "/home/travis/build/gicmo/ostree-osbuild-demo/osbuild/osbuild/pipeline.py", line 148, in run
osbuild_module_path = os.path.dirname(importlib.util.find_
Move the whole result handling of the assembler outside the context
manager; this includes the cleanup of the object in the error case
which would conflict with the ongoing write operation inside the
context manager and thus lead to a crash:
Traceback (most recent call last):
File "/usr/bin/osbuild", line 11, in <module>;
load_entry_point('osbuild==10', 'console_scripts', 'osbuild')()
File "/usr/lib/python3.7/site-packages/osbuild/__main__.py", line 99, in main
secrets=secrets
File "/usr/lib/python3.7/site-packages/osbuild/pipeline.py", line 362, in run
libdir)
File "/usr/lib/python3.7/site-packages/osbuild/pipeline.py", line 324, in assemble
output.cleanup()
File "/usr/lib/python3.7/site-packages/osbuild/objectstore.py", line 160, in cleanup
self._check_writer()
File "/usr/lib/python3.7/site-packages/osbuild/objectstore.py", line 178, in _check_writer
raise ValueError("Write operation is ongoing")
ValueError: Write operation is ongoing
If there is a build pipeline specified, always build it, even if
there are no accompanying stages. If we short-circuit earlier and
ignore the build pipeline section, errors in the build pipeline
would not be caught at all.
The `build_stages` method short-circuits and returns early in case
any of the stages fail to build and returns None for the tree, and
build tree, therefore both of those can immediately cleaned up at
that point.
For this add a small helper `cleanup` that will call the cleanup
method for all supplied arguments, after filtering out None values.
Delay the cleanup of the build tree of the build pipeline, and
first check the result and only cleanup the tree when the build
did not fail, because in that case both returned trees will be
None and trying to cleanup them up will result in an exception.
Therefore, also don't clean up `tree` in the error case.
If the final object, image, artifact, already exists in the store,
short-circuit and return directly from `Pipeline.run`. Otherwise
the situation might arise that the final result is in the store,
but the tree (and build trees) are not and thus the tree would be
built, just to be thrown away when the assembler phase detects
that the final output already exists.
Extract the code that assembles the tree into its own method as
it was previously done for the stages. This should make the new
method as well as `Pipeline.run` method easier to read.
Refactor the building of stages and the build tree so that no auto
commit is done at the end of the build pipeline anymore, i.e. the
respective build tree(s) are not commit to the store unless that
was explicitly enabled via a checkpoint.
NB: `objectstore.Object`s are used not via a context manager
anymore, because they are returned from the `build_stages` method
to make the code easier to use and read. Cleanup of Objects during
a KeyboardInterrupt exception (Ctrl-C) are handled by using the
ObjectStore with a context manager, which on exit of the context
will cleanup all objects. Due to a big in python[1] this is indeed
more robust than using `with object_store.new() as tree` because
that is translated[2] to something like:
1: mgr = (EXPR)
2: exit = type(mgr).__exit__
3: value = type(mgr).__enter__(mgr)
-> 4: # NOTE: KeyboardInterrupt here will "leak" value
5: try:
6: [...]
7: finally:
8: if exc:
9: exit(mgr, None, None, None)
Which can leave the tree initialized but not cleaned up if the
KeyboardInterrupt happens exactly line 4.
[1] https://bugs.python.org/issue29988
[2] https://www.python.org/dev/peps/pep-0343/
Only commit checkpoints to the object store if the run of the
stage or assembler was successful. Otherwise we commit a empty,
corrupted or old tree to the store. Any subsequent run might
then pick up that bogus tree as a starting point.
Rename the function to `describe_os()`. We do no actual detection, nor
verification here. That is, the return value of this function is in no
way guaranteed to be a valid runner. That is, error-handling needs to
be done in the caller. Make this clear by renaming the function.
Note: Currently, in case no runner exists for the OS, we end up with:
execv(...) failed: No such file or directory
This needs to be fixed in the future.
Reading from an `Object` via `read` already uses a context manager
to manage the read-only bind mount and also maintain a count of
currently active readers. With this an attempt to start a new
`write` operation while readers were active can be detected and
an exception is throw. Since `write` was not introducing a context
the inverted situation, i.e. reads while a write is ongoing, was
not possible to detect.
This commit therefore introduces a context also for `.write` so
that we can enforce the policy to have either many readers but no
writers, or just one writer and no readers.
A bind mount is also used for write (in read-write mode) to hide
the internal path of the tree.
The exception that was thrown by {stage.run, assembler.run} was a
necessary ingredient that in combination with the context manager
around `Objectstore.new` made sure that tree the object was only
auto-committed to the store when there was no error during the
executing of any of the `.run` methods.
Now that the auto-commit feature got removed and committing of any
object to the store is explicitly done via `objectstore.commit`,
the whole exception throwing and handling can be removed. Status
reporting was already done in `BuildResult.success` and the new
code will use that to exit the function early on stage/asm errors.
Refactor `get_buildtree` to do input/output via `Object`, i.e. by
creating a new `Object`, setting its base accordingly and then
use its `read` and `write` methods. This is what `ObjectStore.get`
does as well.
In the case that there is no build pipeline, use the mount helpers
of `objectstore` instead of the custom mount calls.
Do not automatically commit the last stage of the pipeline to the
store. The last stage is most likely not what should be cached,
because it will contain all the individual customization and thus
be very likely different for different users. Instead, the dnf or
rpm stages have a higher chance of being the same and thus are
better candidates for caching.
Technically this change is done via two big changes that build
upon new features introduces in the previous commits, most notably
the copy on write semantics of Object and that input/output is
being done via `objectstore.Object` instead of plain paths. The
first of the two big changes is to create one new `Object` at
the beginning of `pipeline.run` and use that, in write mode via
`Object.write` across invocations of `stage.run` calls, with
checkpoints being created after each stage on demand.
The very same `Object` is then used in read mode via `Object.read`
as the input tree for the Assembler. After the assembler is done
the resulting image/tree is manually committed to the store.
The other big change is to remove the `ObjectStore.commit` call
from the `ObjectStore.new` method and thus the automatic commit
after the last stage is gone.
NB: since the build tree is being retrieved in `get_buildtree`
from the store, a checkpoint for the last stage of the build
pipeline is forced for now. Future commits will refactor will
do away with that forced commit as well.
Change osbuildtest.TestCase to always create a checkpoint at
the final tree (the last stage of the pipeline), since tests
need it to check the tree contents.
As a result of the previous commits that implement copy on write
semantics, `commit` can now be used to create snapshots. Whenever
an Object is committed, its tree is moved to the store and it is
being reset, i.e. a new clean workdir is created and the old one
discarded. The moved tree is then set as the base of the reset
Object. On the next call to `write` the moved tree will be copied
over and forms the basis of the Object again. Should nobody want
to write to Object after the snapshot, i.e. the `commit`, no copy
will be made.
NB: snapshots/commits will act now act as synchronization points:
if a object with the same treesum, i.e. the very same content
already exists, the move (i.e. `store_tree`) will gracefully fail
and the existing content will be set as the base for Object.
Since Object knows its base now, the initialization of the tree
with the content of its base can be delayed until the moment
someone wants to actually modify the tree, thus implementing
copy on write semantics. For this a new `write` method is added
that will initialize the base and return the writable tree. It
should be used instead of `path` whenever the a client wants to
write to the tree of the Object.
Adapt the pipeline and the tests to use the new `write` method
in all the appropriate places.
NB: since the intention can not be inferred when using `path`
directly, the Object is still being initialized there.
Refactor the `ObjectStore.snapshot` method to take an `Object` not
a plain filesystem tree, so the latter is more encapsulated from
the ObjectStore user (e.g. the pipeline) and prepares a unified
code-path for `snapshot` and `commit` in the future.
Instead of just returning the path of the temporary object that is
created in .new() the actual instance of the new `Object` is being
returned, which can then provide a richer interface for clients
than a plain directory path.
Detect the host dynamically from os-release(5) instead of relying on the
`org.osbuild.host` symlink.
It is awkward to install a symlink that tells osbuild which distro is is
running on, when there is a standard way to detect this.
This makes it easier to run osbuild from sources and removes the need to
include every host in the spec file. The latter became hard to do,
because there's no obvious way to distinguish RHEL minor releases.
Make the sources options a static property of the pipeline, in
particular of each stage, rather than being passed in on `run()`.
This more closely matches the intended semantics of sources and
pipeline having similar lifetimes and being fairly coupled together.
The difference between the pipeline and the sources is that the
sources do not contribute to identifying the pipeline (they are not
part of the hash for the pipeline id), and they could be swapped
out without changing the output image (as long as they are valid).
However, a pipeline without A sources object would not be useful,
and typically the pipeline and the sources are generated, passed
around and used together.
This is different from the build environment and the secrets object,
which both are specific to either the host or the caller, unlike
the pipeline which should be universal.
This changes the `load()` function to take a `manifest`, which is
a map containing both the pipeline and the sources.
Note that the semantics of the build-env parameter remains unchanged:
It shares the sources with the rest of the pipeline. We may want to
reconsider this in future commits, as the build-env is specific to
the host, whereas the regular pipeline is not.
Signed-off-by: Tom Gundersen <teg@jklm.no>
This source adds support for downloaded files. The files are
indexed by their content hash, and the only option is their URL.
The main usecase for this will be downloading rpms. Allowing depsolving
to be done outside of osbuild, network access to be restricted and
downloaded rpms to be reused between runs.
Each source is now passed two additional arguments, a cache directory
and an output directory. Both are in the source's namespace, and
the source is responsible for managing them. Each directory may
contain contents from previous runs, but neither is ever guaranteed
to do so.
Downloaded contents may be saved to the cache and resued between
runs, and the requested content should be written to the output dir.
If secrets are used, the source must only ever write contents to
the output that corresponds to the available secrets (rather than
contents from the cache from previous runs).
Each stage is passed an additional argument, a sources directory.
The directory is read-only, and contains a subdirectory named after
each used source, which will contain the requseted contents when
the `Get()` call returns (if the source uses this functionality).
Based on a patch by Lars Karlitski.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Add a new `--checkpoint` option, which can be provided multiple
times, that indicate after which stages a the current stage of
the tree should be committed to the object store; the tree id
will be the treesum of the tree at that point and a reference
is created with the id of the stage at the point.
The argument to `--checkpoint` is the id of the stage. If not
all the given checkpoints can be found the execution will be
aborted.
This makes sure all disk access is backed by the same disk. We may
want this for performance reasons (avoiding moving across disks), but
also to experiment with different backing stores for all disk access.
Signed-off-by: Tom Gundersen <teg@jklm.no>
The dnf stage wants to import `osbuild.sources` but currently the
osbuild module is not available in the stages. Apply the same hack
done in the Assembler also in for the stages, i.e. bind mount the
osbuild module to the stages/osbuild.
Add a new command line option `--secrets`, which accepts a JSON file
that is structured similarly to a source file. It is should contain data
that is necessary to fetch content, but shouldn't appear in any logs.
This might (hopefully) fix a race in destructing the asyncio.EventLoop
that's used in all API classes, which leads to warnings about unhandled
exceptions on CI.
This also puts their creation closer to where the client-side sockets
are created.
Pipelines encode which source content they need in the form of
repository metadata checksums (or rpm checksums). In addition, they
encode where they fetch that source content from in the form of URLs.
This is overly specific and doesn't have to be in the pipeline's hash:
the checksum is enough to specify an image.
In practice, this precluded using alternative ways of getting at source
packages, such as local mirrors, which could speed up development.
Introduce a new osbuild API: sources. With it, a stage can query for a
way to fetch source content based on checksums.
The first such source is `org.osbuild.dnf`, which returns repository
configuration for a metadata checksum. Note that the dnf stage continues
to verify that the content it received matches the checksum it expects.
Sources are implemented as programs, living in a `sources` directory.
They are run on the host (i.e., uncontained) right now. Each source gets
passed options, which are taken from a new command line argument to
osbuild, and an array of checksums for which to return content.
This API is only available to stages right now.
The recent changes removed the {Assembler,Stage}Failed exceptions,
which includes them being thrown from Stage.run and Assembler.run.
Instead result dictionaries are returned even on errors. But the
object store, used as a context manager, relies on exceptions to
detect the error case and thus needs them to cleanup the temporary
objects. Without those exceptions the temporary objects end up in
the store even when the sage or assembler failed.
Restore the old behavior by throwing a generic BuildError exception
from the Stage and Assembler, which will be caught directly in the
pipeline and converted to a result dict.
Commit 82a2be53d introduced a new return type from `Pipeline.run()`. It
changed the caller in `__main__.py`, but missed that the build pipeline
uses the same function.
A pipeline run only returned logs in the `StageFailed` and
`AssemblerFailed` exceptions. Remove those and always return structured
data instead.
It only returns data for stages that actually ran (i.e., didn't come
from the cache). This is similar to the output in interactive mode.
Also change osbuildtest to be able to deal with output that is larger
than the pipe buffer by using subprocess.communicate().
osbuild currently throws an error when not passing a build environment
on the command line, because the runner is unset. This is annoying on
hosts which only need a runner set, but no build pipeline.
To simplify running osbuild in this common case, introduce
`org.osbuild.host`, which is a runner that is defined to work on the
host that osbuild is installed on. Use this runner by default and
include a symlink to the right runner in the Fedora and RHEL packages.
Also add `runners/org.osbuild.host` to `.gitignore`, so that developers
can set the symlink when running osbuild from the source directory.
Fixes#171
We've been using a generic `osbuild-run`, which sets up the build
environment (and works around bugs) for all build roots. It is already
getting unwieldy, because it tries to detect the OS for some things it
configures. It's also about to cause problems for RHEL, which doesn't
currently support a python3 shebang without having /etc around.
This patch changes the `build` key in a pipeline to not be a pipeline
itself, but an object with `runner` and `pipeline` keys. `pipeline` is
the build pipeline, as before. `runner` is the name of the runner to
use. Runners are programs in the `runners` subdirectory.
Three runners are included in this patch. They're copies of osbuild-run
for now (except some additions for rhel82). The idea is that each of
them only contains the minimal setup code necessary for an OS, and that
we can review what's needed when updating a build root.
Also modify the `--build-pipeline` command line switch to accept such a
build object (instead of a pipeline) and rename it accordingly, to
`--build-env`.
Correspondingly, `OSBUILD_TEST_BUILD_PIPELINE` → `OSBUILD_TEST_BUILD_ENV`.
`osbuild-run` sets up the build root so that programs can be run
correctly in it. It should be run for all programs, not just stages and
assemblers (even though they're the only consumers right now).
Also, conceptually, `osbuild-run` belongs to the build root. We'll
change its implementation based on the build root in a future commit.
The buildroot already sets up `/run/osbuild/api`. It makes sense to have
it manage libdir as well.
A nice side benefit of this is a simplification of the Stage and
Assembler classes, which grew quite complex and contained duplicate
code.