debian-forge

Author	SHA1	Message	Date
Christian Kellner	5d0f6aa981	pipeline: short-circuit if final object exists If the final object, image, artifact, already exists in the store, short-circuit and return directly from `Pipeline.run`. Otherwise the situation might arise that the final result is in the store, but the tree (and build trees) are not and thus the tree would be built, just to be thrown away when the assembler phase detects that the final output already exists.	2020-03-07 17:13:21 +01:00
Christian Kellner	b755e69bca	pipeline: extract assembler code into method Extract the code that assembles the tree into its own method as it was previously done for the stages. This should make the new method as well as `Pipeline.run` method easier to read.	2020-03-07 17:13:21 +01:00
Christian Kellner	170ccd4722	pipeline: no auto-commit for build stages Refactor the building of stages and the build tree so that no auto commit is done at the end of the build pipeline anymore, i.e. the respective build tree(s) are not committed to the store unless that was explicitly enabled via a checkpoint. NB: `objectstore.Object`s are not used via a context manager anymore, because they are returned from the `build_stages` method to make the code easier to use and read. Cleanup of Objects during a KeyboardInterrupt exception (Ctrl-C) is handled by using the ObjectStore with a context manager, which on exit of the context will clean up all objects. Due to a bug in python[1] this is indeed more robust than using `with object_store.new() as tree` because that is translated[2] to something like: 1: mgr = (EXPR) 2: exit = type(mgr).__exit__ 3: value = type(mgr).__enter__(mgr) -> 4: # NOTE: KeyboardInterrupt here will "leak" value 5: try: 6: [...] 7: finally: 8: if exc: 9: exit(mgr, None, None, None) Which can leave the tree initialized but not cleaned up if the KeyboardInterrupt happens exactly at line 4. [1] https://bugs.python.org/issue29988 [2] https://www.python.org/dev/peps/pep-0343/	2020-03-07 17:13:21 +01:00
Christian Kellner	fecc62f5c8	pipeline: don't commit checkpoints on error Only commit checkpoints to the object store if the run of the stage or assembler was successful. Otherwise we commit an empty, corrupted or old tree to the store. Any subsequent run might then pick up that bogus tree as a starting point.	2020-03-03 13:38:45 +01:00
David Rheinsberg	53415a3cbc	pipeline: detect_os() -> describe_os() Rename the function to `describe_os()`. We do no actual detection, nor verification here. That is, the return value of this function is in no way guaranteed to be a valid runner. Consequently, error-handling needs to be done in the caller. Make this clear by renaming the function. Note: Currently, in case no runner exists for the OS, we end up with: execv(...) failed: No such file or directory This needs to be fixed in the future.	2020-02-29 12:35:19 +01:00
David Rheinsberg	cd07d588fc	pipeline: fix detect_os() default values The keys in `/etc/os-release` are not mandatory. Make sure we use their default values (defined in the man-page) if missing.	2020-02-29 12:35:19 +01:00
Christian Kellner	4b790ac284	objectstore: use a context also for Object.write Reading from an `Object` via `read` already uses a context manager to manage the read-only bind mount and also maintain a count of currently active readers. With this, an attempt to start a new `write` operation while readers are active can be detected and an exception is thrown. Since `write` was not introducing a context, the inverted situation, i.e. reads while a write is ongoing, was not possible to detect. This commit therefore introduces a context also for `.write` so that we can enforce the policy to have either many readers but no writers, or just one writer and no readers. A bind mount is also used for write (in read-write mode) to hide the internal path of the tree.	2020-02-29 01:14:24 +01:00
Christian Kellner	2266d3fada	pipeline: plain results for stages, assembler The exception that was thrown by {stage.run, assembler.run} was a necessary ingredient that, in combination with the context manager around `Objectstore.new`, made sure that the tree object was only auto-committed to the store when there was no error during the execution of any of the `.run` methods. Now that the auto-commit feature got removed and committing of any object to the store is explicitly done via `objectstore.commit`, the whole exception throwing and handling can be removed. Status reporting was already done in `BuildResult.success` and the new code will use that to exit the function early on stage/asm errors.	2020-02-29 01:14:24 +01:00
Christian Kellner	29397efcec	pipeline: implement get_buildtree like store.get Refactor `get_buildtree` to do input/output via `Object`, i.e. by creating a new `Object`, setting its base accordingly and then use its `read` and `write` methods. This is what `ObjectStore.get` does as well. In the case that there is no build pipeline, use the mount helpers of `objectstore` instead of the custom mount calls.	2020-02-28 16:11:49 +01:00
Christian Kellner	42a365d12f	osbuild: no auto commit of the last stage Do not automatically commit the last stage of the pipeline to the store. The last stage is most likely not what should be cached, because it will contain all the individual customization and thus be very likely different for different users. Instead, the dnf or rpm stages have a higher chance of being the same and thus are better candidates for caching. Technically this change is done via two big changes that build upon new features introduced in the previous commits, most notably the copy on write semantics of Object and that input/output is being done via `objectstore.Object` instead of plain paths. The first of the two big changes is to create one new `Object` at the beginning of `pipeline.run` and use that, in write mode via `Object.write`, across invocations of `stage.run` calls, with checkpoints being created after each stage on demand. The very same `Object` is then used in read mode via `Object.read` as the input tree for the Assembler. After the assembler is done the resulting image/tree is manually committed to the store. The other big change is to remove the `ObjectStore.commit` call from the `ObjectStore.new` method and thus the automatic commit after the last stage is gone. NB: since the build tree is being retrieved in `get_buildtree` from the store, a checkpoint for the last stage of the build pipeline is forced for now. Future commits will do away with that forced commit as well. Change osbuildtest.TestCase to always create a checkpoint at the final tree (the last stage of the pipeline), since tests need it to check the tree contents.	2020-02-28 16:11:49 +01:00
Christian Kellner	6a2a7d99f7	objectstore: unify commit and snapshot code paths As a result of the previous commits that implement copy on write semantics, `commit` can now be used to create snapshots. Whenever an Object is committed, its tree is moved to the store and it is being reset, i.e. a new clean workdir is created and the old one discarded. The moved tree is then set as the base of the reset Object. On the next call to `write` the moved tree will be copied over and forms the basis of the Object again. Should nobody want to write to Object after the snapshot, i.e. the `commit`, no copy will be made. NB: snapshots/commits will now act as synchronization points: if an object with the same treesum, i.e. the very same content, already exists, the move (i.e. `store_tree`) will gracefully fail and the existing content will be set as the base for Object.	2020-02-28 16:11:49 +01:00
Christian Kellner	39213b7f44	objectstore: copy on write semantics for Object Since Object knows its base now, the initialization of the tree with the content of its base can be delayed until the moment someone wants to actually modify the tree, thus implementing copy on write semantics. For this a new `write` method is added that will initialize the base and return the writable tree. It should be used instead of `path` whenever a client wants to write to the tree of the Object. Adapt the pipeline and the tests to use the new `write` method in all the appropriate places. NB: since the intention can not be inferred when using `path` directly, the Object is still being initialized there.	2020-02-28 16:11:49 +01:00
Christian Kellner	25b3807a5b	objectstore: snapshot takes Object not path Refactor the `ObjectStore.snapshot` method to take an `Object` not a plain filesystem tree, so the latter is more encapsulated from the ObjectStore user (e.g. the pipeline) and prepares a unified code-path for `snapshot` and `commit` in the future.	2020-02-28 16:11:49 +01:00
Christian Kellner	d10537da42	objectstore: yield Object not path from .new() Instead of just returning the path of the temporary object that is created in .new() the actual instance of the new `Object` is being returned, which can then provide a richer interface for clients than a plain directory path.	2020-02-28 16:11:49 +01:00
Lars Karlitski	a578a2b7e7	pipeline: detect host instead of using org.osbuild.host Detect the host dynamically from os-release(5) instead of relying on the `org.osbuild.host` symlink. It is awkward to install a symlink that tells osbuild which distro it is running on, when there is a standard way to detect this. This makes it easier to run osbuild from sources and removes the need to include every host in the spec file. The latter became hard to do, because there's no obvious way to distinguish RHEL minor releases.	2020-02-28 16:06:30 +01:00
Tom Gundersen	481213a8dd	pipeline: pin the sources options in the pipeline object Make the sources options a static property of the pipeline, in particular of each stage, rather than being passed in on `run()`. This more closely matches the intended semantics of sources and pipeline having similar lifetimes and being fairly coupled together. The difference between the pipeline and the sources is that the sources do not contribute to identifying the pipeline (they are not part of the hash for the pipeline id), and they could be swapped out without changing the output image (as long as they are valid). However, a pipeline without a sources object would not be useful, and typically the pipeline and the sources are generated, passed around and used together. This is different from the build environment and the secrets object, which both are specific to either the host or the caller, unlike the pipeline which should be universal. This changes the `load()` function to take a `manifest`, which is a map containing both the pipeline and the sources. Note that the semantics of the build-env parameter remains unchanged: It shares the sources with the rest of the pipeline. We may want to reconsider this in future commits, as the build-env is specific to the host, whereas the regular pipeline is not. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-19 15:59:11 +01:00
Tom Gundersen	7817ae5e8b	sources: add org.osbuild.files source This source adds support for downloaded files. The files are indexed by their content hash, and the only option is their URL. The main use case for this will be downloading rpms, allowing depsolving to be done outside of osbuild, network access to be restricted and downloaded rpms to be reused between runs. Each source is now passed two additional arguments, a cache directory and an output directory. Both are in the source's namespace, and the source is responsible for managing them. Each directory may contain contents from previous runs, but neither is ever guaranteed to do so. Downloaded contents may be saved to the cache and reused between runs, and the requested content should be written to the output dir. If secrets are used, the source must only ever write contents to the output that corresponds to the available secrets (rather than contents from the cache from previous runs). Each stage is passed an additional argument, a sources directory. The directory is read-only, and contains a subdirectory named after each used source, which will contain the requested contents when the `Get()` call returns (if the source uses this functionality). Based on a patch by Lars Karlitski. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-06 19:01:12 +01:00
Christian Kellner	6f4d286ff4	osbuild: support for checkpoints during build Add a new `--checkpoint` option, which can be provided multiple times, that indicates after which stages the current stage of the tree should be committed to the object store; the tree id will be the treesum of the tree at that point and a reference is created with the id of the stage at the point. The argument to `--checkpoint` is the id of the stage. If not all the given checkpoints can be found the execution will be aborted.	2020-02-06 16:10:35 +01:00
Tom Gundersen	ee86b57392	pipeline: back var by the store This makes sure all disk access is backed by the same disk. We may want this for performance reasons (avoiding moving across disks), but also to experiment with different backing stores for all disk access. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-01-27 15:51:47 +01:00
Christian Kellner	cf9c9946e0	pipeline: bind mount the osbuild module for the stages The dnf stage wants to import `osbuild.sources` but currently the osbuild module is not available in the stages. Apply the same hack done in the Assembler also for the stages, i.e. bind mount the osbuild module to the stages/osbuild.	2020-01-23 00:49:11 +01:00
Lars Karlitski	e123715bc6	osbuild: introduce secrets Add a new command line option `--secrets`, which accepts a JSON file that is structured similarly to a source file. It should contain data that is necessary to fetch content, but shouldn't appear in any logs.	2020-01-09 23:55:43 +01:00
Lars Karlitski	b9b2f99123	osbuild: create API sockets in the thread they're used in This might (hopefully) fix a race in destructing the asyncio.EventLoop that's used in all API classes, which leads to warnings about unhandled exceptions on CI. This also puts their creation closer to where the client-side sockets are created.	2019-12-25 17:48:26 +01:00
Lars Karlitski	510e2b1e94	osbuild: introduce sources Pipelines encode which source content they need in the form of repository metadata checksums (or rpm checksums). In addition, they encode where they fetch that source content from in the form of URLs. This is overly specific and doesn't have to be in the pipeline's hash: the checksum is enough to specify an image. In practice, this precluded using alternative ways of getting at source packages, such as local mirrors, which could speed up development. Introduce a new osbuild API: sources. With it, a stage can query for a way to fetch source content based on checksums. The first such source is `org.osbuild.dnf`, which returns repository configuration for a metadata checksum. Note that the dnf stage continues to verify that the content it received matches the checksum it expects. Sources are implemented as programs, living in a `sources` directory. They are run on the host (i.e., uncontained) right now. Each source gets passed options, which are taken from a new command line argument to osbuild, and an array of checksums for which to return content. This API is only available to stages right now.	2019-12-23 01:12:38 +01:00
Christian Kellner	ede3f6baeb	pipeline: proper object cleanup on errors The recent changes removed the {Assembler,Stage}Failed exceptions, which includes them being thrown from Stage.run and Assembler.run. Instead result dictionaries are returned even on errors. But the object store, used as a context manager, relies on exceptions to detect the error case and thus needs them to clean up the temporary objects. Without those exceptions the temporary objects end up in the store even when the stage or assembler failed. Restore the old behavior by throwing a generic BuildError exception from the Stage and Assembler, which will be caught directly in the pipeline and converted to a result dict.	2019-12-18 12:45:59 +01:00
Lars Karlitski	61e32ff3ef	pipeline: return new-style result from build pipeline Commit `82a2be53d` introduced a new return type from `Pipeline.run()`. It changed the caller in `__main__.py`, but missed that the build pipeline uses the same function.	2019-12-15 12:03:43 +01:00
Lars Karlitski	82a2be53d4	pipeline: return logs in --json mode A pipeline run only returned logs in the `StageFailed` and `AssemblerFailed` exceptions. Remove those and always return structured data instead. It only returns data for stages that actually ran (i.e., didn't come from the cache). This is similar to the output in interactive mode. Also change osbuildtest to be able to deal with output that is larger than the pipe buffer by using subprocess.communicate().	2019-12-14 13:49:24 +01:00
Lars Karlitski	f0a7b2261e	pipeline: introduce host runner osbuild currently throws an error when not passing a build environment on the command line, because the runner is unset. This is annoying on hosts which only need a runner set, but no build pipeline. To simplify running osbuild in this common case, introduce `org.osbuild.host`, which is a runner that is defined to work on the host that osbuild is installed on. Use this runner by default and include a symlink to the right runner in the Fedora and RHEL packages. Also add `runners/org.osbuild.host` to `.gitignore`, so that developers can set the symlink when running osbuild from the source directory. Fixes #171	2019-12-02 13:45:48 +01:00
Lars Karlitski	64713449ce	Introduce runners We've been using a generic `osbuild-run`, which sets up the build environment (and works around bugs) for all build roots. It is already getting unwieldy, because it tries to detect the OS for some things it configures. It's also about to cause problems for RHEL, which doesn't currently support a python3 shebang without having /etc around. This patch changes the `build` key in a pipeline to not be a pipeline itself, but an object with `runner` and `pipeline` keys. `pipeline` is the build pipeline, as before. `runner` is the name of the runner to use. Runners are programs in the `runners` subdirectory. Three runners are included in this patch. They're copies of osbuild-run for now (except some additions for rhel82). The idea is that each of them only contains the minimal setup code necessary for an OS, and that we can review what's needed when updating a build root. Also modify the `--build-pipeline` command line switch to accept such a build object (instead of a pipeline) and rename it accordingly, to `--build-env`. Correspondingly, `OSBUILD_TEST_BUILD_PIPELINE` → `OSBUILD_TEST_BUILD_ENV`.	2019-11-25 13:05:22 +01:00
Lars Karlitski	616e1ecbba	buildroot: run everything with osbuild-run `osbuild-run` sets up the build root so that programs can be run correctly in it. It should be run for all programs, not just stages and assemblers (even though they're the only consumers right now). Also, conceptually, `osbuild-run` belongs to the build root. We'll change its implementation based on the build root in a future commit. The buildroot already sets up `/run/osbuild/api`. It makes sense to have it manage libdir as well. A nice side benefit of this is a simplification of the Stage and Assembler classes, which grew quite complex and contained duplicate code.	2019-11-25 13:05:22 +01:00
Christian Kellner	6e5b838892	pipeline: use API to setup stdio inside the container Use the new osbuild API to set up the standard input/output inside the container, i.e. replace stdin, stdout, and stderr with sockets provided by the host.	2019-10-30 18:44:55 +01:00
Martin Sehnoutka	27cf84edd5	bind osbuild module from dynamically discovered path	2019-10-21 15:20:31 +02:00
Martin Sehnoutka	831459e9e9	fix execv /usr/lib/osbuild/osbuild-run does not exist In case osbuild is invoked without libdir parameter, the osbuild files are not propagated into the buildroot container and therefore all pipelines containing buildroot fail. Example: ``` $ sudo osbuild --store /var/osbuild/ qcow2-pipeline.json ... execv(/usr/lib/osbuild/osbuild-run) failed: No such file or directory ``` Unfortunately this is only the first error. Once you fix it, you realize that also the symlink from "assemblers" directory is missing and therefore you cannot import osbuild because it is not available anywhere in the path. This is why I had to bind the osbuild module from host to the build container.	2019-10-21 15:20:31 +02:00
Martin Sehnoutka	fa8de2f6d8	move files from /usr/libexec to /usr/lib There is no real difference in these two directories. Composer already uses /usr/lib, so OSBuild should use the same as well.	2019-10-02 15:01:01 +02:00
Ondřej Budai	adf5989de2	osbuild/pipeline: Fix crashes when running multiple builds at once Storytime! I tried to run multiple osbuilds at once. It failed when unmounting the buildtree. Weird. It turned out the buildtree was not there anymore when osbuild tried to unmount it. But who unmounted it? We need to deep dive into mount-types. Nowadays, the / directory is shared-mounted by systemd. See: https://serverfault.com/questions/868682/implications-of-mount-make-private This has interesting implications, see the following example: we start osbuild1 with /var/tmp/os1 as its store osbuild1 creates /var/tmp/os1/tmp osbuild1 bind-mounts / onto /var/tmp/os1/tmp we start osbuild2 with /var/tmp/os2 as its store osbuild2 creates /var/tmp/os2/tmp osbuild2 bind-mounts / onto /var/tmp/os2/tmp Now, the shared-mounting goes into effect: The second mount-event gets propagated into the first mount, where it creates another mount, so we get something like this: /var/tmp/os1/tmp/var/tmp/os2/tmp But this is just a start! Imagine running three osbuilds at once. The event would get propagated to those 3 mounts created by two osbuilds, creating 3 extra mounts, 7 in total. It turns out this mounting strategy creates an exponential number of mounts. Crazy, right? This commit mounts the root inside build root using private bind, which doesn't propagate bind-events. This solves the problem with the exponential growth. But the original problem was different, mount points were disappearing. So how does this fix solve the problem? Honestly, I don't know. Something with mount-event propagation is probably responsible, but I cannot imagine how it is actually affecting the unbinding.	2019-10-02 06:20:05 +02:00
Lars Karlitski	83475cc9f4	osbuild: store outputs in objectstore Treat outputs like we treat trees: store them in the object store. This simplifies using osbuild and allows returning a cached version if one is available. This makes the `--output` parameter redundant. Remove it.	2019-09-25 23:50:50 +02:00
Lars Karlitski	cb173f7d3c	objectstore: refer to objects, not trees Also simplify method names with redundant words: has_tree → contains get_tree → get new_tree → new	2019-09-25 23:50:50 +02:00
Lars Karlitski	635b041d84	pipeline: simplify return value of Pipeline.run() The current implementation was broken, because it didn't return results from the cached stages. Simply return a boolean now, True for success.	2019-09-25 23:50:50 +02:00
Lars Karlitski	fd37a5d646	pipeline: introduce output id Introduce an output id, which is the checksum over a full pipeline, including all stages and the assembler. The id of a pipeline did not include assemblers before. To be less confusing, rename the existing id to "tree id".	2019-09-25 23:50:50 +02:00
Ondřej Budai	cf046fcaeb	osbuild: fix stages caching We have never tried to reuse the first stage due to the fact that the range in the for loop didn't include the zero index. This commit fixes it.	2019-09-03 22:11:54 +02:00
Tom Gundersen	ba6918f945	osbuild: allow an additional build-pipeline to be prepended The best practice for creating a pipeline should be to include at least one level of build-pipelines. This makes sure that the tools used to generate the target image are well-defined. In principle one could add several layers, though in practice, one would hope that the environment used to build the buildroot does not affect the final image (and as we anyway cannot recurse indefinitely, we fall back to simply using the host system in this case). This only makes sense if the contents of the host system truly do not affect the generated image, and as such we do not include any information about the host when computing the hash that identifies a pipeline. In fact, any image could be used in its place, as long as the required tools are present. This commit takes advantage of that fact. Rather than run a pipeline with the host as the build root, take a second pipeline to generate the buildroot, but do not include this when computing the pipeline id (so it is different from simply editing the original JSON). This is necessary so we can use the same pipelines on significantly different host systems (run with different --build-pipeline arguments). In particular, it allows our test pipelines that generate f30 images to be run unmodified on Travis (which runs Ubuntu). Signed-off-by: Tom Gundersen <teg@jklm.no>	2019-08-30 12:00:47 +02:00
Tom Gundersen	679b79c5e5	osbuild: split package into separate files Import modules between files using the syntax `from . import foobar`, renaming what used to be `FooBar` to `foobar.FooBar` when moved to a separate file. In __init__.py only import what is meant to be public API. Signed-off-by: Tom Gundersen <teg@jklm.no>	2019-08-21 09:56:50 +04:00
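
Commits cd07d588fc and 53415a3cbc above describe reading `/etc/os-release` and falling back to default values when keys are missing. A minimal sketch of that idea follows. The names `parse_os_release` and `describe_os_sketch` are hypothetical and not taken from the repository, the sketch works on text rather than on a file, and it assumes the result is simply the ID followed by the VERSION_ID. The only default used is `ID=linux`, as documented in os-release(5).

```python
import shlex


def parse_os_release(text):
    """Parse os-release(5) style KEY=value lines into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Values may be shell-quoted; shlex removes the quotes.
        parts = shlex.split(value)
        values[key] = parts[0] if parts else ""
    return values


def describe_os_sketch(text):
    """Return a name built from ID and VERSION_ID (hypothetical format)."""
    values = parse_os_release(text)
    os_id = values.get("ID", "linux")       # default from os-release(5)
    version = values.get("VERSION_ID", "")  # optional key, assumed empty if missing
    return os_id + version
```

As the rename to `describe_os()` stresses, a value produced this way is only a description: nothing guarantees that a runner of that name exists, so the caller still has to handle that case.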

41 commits
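
Commit 4b790ac284 above describes a policy of either many readers and no writer, or exactly one writer and no readers, enforced through context managers on `read` and `write`. The following is a minimal sketch of that policy only. The class name `TreeObject`, the exception type and the yielded strings are assumptions made for illustration; the bind mounts mentioned in the commit message are left out entirely.

```python
import contextlib


class TreeObject:
    """Hypothetical stand-in that tracks active readers and one writer."""

    def __init__(self):
        self._readers = 0
        self._writer = False

    @contextlib.contextmanager
    def read(self):
        # Reading is refused while a write is in progress.
        if self._writer:
            raise ValueError("write operation is ongoing")
        self._readers += 1
        try:
            yield "read-only view"
        finally:
            self._readers -= 1

    @contextlib.contextmanager
    def write(self):
        # Writing needs exclusive access: no writer and no readers.
        if self._writer:
            raise ValueError("write operation is already ongoing")
        if self._readers:
            raise ValueError("read operations are ongoing")
        self._writer = True
        try:
            yield "writable view"
        finally:
            self._writer = False
```

Because both methods are context managers, the counters are reset on exit even if the body raises, which is what makes the check in the opposite direction (a read during a write) possible.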