debian-forge

Author	SHA1	Message	Date
David Rheinsberg	53415a3cbc	pipeline: detect_os() -> describe_os() Rename the function to `describe_os()`. We do no actual detection, nor verification here. That is, the return value of this function is in no way guaranteed to be a valid runner. That is, error-handling needs to be done in the caller. Make this clear by renaming the function. Note: Currently, in case no runner exists for the OS, we end up with: execv(...) failed: No such file or directory This needs to be fixed in the future.	2020-02-29 12:35:19 +01:00
David Rheinsberg	cd07d588fc	pipeline: fix detect_os() default values The keys in `/etc/os-release` are not mandatory. Make sure we use their default values (defined in the man-page) if missing.	2020-02-29 12:35:19 +01:00
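The default handling described in the commit above can be sketched as follows. This is a minimal illustration rather than the code from the commit: `parse_os_release` and `OS_RELEASE_DEFAULTS` are hypothetical names, and the defaults shown are the ones os-release(5) documents for `NAME`, `ID` and `PRETTY_NAME`.

```python
import shlex

# Defaults documented in os-release(5) for keys that may be absent.
OS_RELEASE_DEFAULTS = {"NAME": "Linux", "ID": "linux", "PRETTY_NAME": "Linux"}


def parse_os_release(text):
    """Parse os-release style KEY=VALUE text, filling in documented defaults.

    Hypothetical helper for illustration; not taken from the repository.
    """
    values = dict(OS_RELEASE_DEFAULTS)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values use shell-style quoting; shlex.split strips the quotes.
        parts = shlex.split(raw)
        values[key] = parts[0] if parts else ""
    return values


# ID is missing from the sample, so the documented default is used.
sample = 'NAME="Fedora"\nVERSION_ID=31\n'
info = parse_os_release(sample)
print(info["ID"], info["VERSION_ID"])
```

Keys without a documented default, such as `VERSION_ID`, are simply absent from the result when the file does not set them, so a caller still has to handle that case.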
Christian Kellner	4b790ac284	objectstore: use a context also for Object.write Reading from an `Object` via `read` already uses a context manager to manage the read-only bind mount and also maintain a count of currently active readers. With this an attempt to start a new `write` operation while readers were active can be detected and an exception is thrown. Since `write` was not introducing a context the inverted situation, i.e. reads while a write is ongoing, was not possible to detect. This commit therefore introduces a context also for `.write` so that we can enforce the policy to have either many readers but no writers, or just one writer and no readers. A bind mount is also used for write (in read-write mode) to hide the internal path of the tree.	2020-02-29 01:14:24 +01:00
Christian Kellner	2266d3fada	pipeline: plain results for stages, assembler The exception that was thrown by {stage.run, assembler.run} was a necessary ingredient that in combination with the context manager around `Objectstore.new` made sure that the object's tree was only auto-committed to the store when there was no error during the execution of any of the `.run` methods. Now that the auto-commit feature got removed and committing of any object to the store is explicitly done via `objectstore.commit`, the whole exception throwing and handling can be removed. Status reporting was already done in `BuildResult.success` and the new code will use that to exit the function early on stage/asm errors.	2020-02-29 01:14:24 +01:00
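The "many readers or one writer" policy from the commit above can be sketched with two context managers. This is a simplified, single-threaded illustration: `GuardedTree` is a hypothetical class, and the bind mounts the commit mentions are replaced by placeholder strings.

```python
import contextlib


class GuardedTree:
    """Hypothetical stand-in for an object that allows many readers or one writer."""

    def __init__(self):
        self._readers = 0
        self._writer = False

    @contextlib.contextmanager
    def read(self):
        if self._writer:
            raise RuntimeError("read requested while a write is active")
        self._readers += 1
        try:
            yield "read-only view"  # placeholder for a read-only bind mount
        finally:
            self._readers -= 1

    @contextlib.contextmanager
    def write(self):
        if self._writer or self._readers:
            raise RuntimeError("write requested while readers or a writer are active")
        self._writer = True
        try:
            yield "writable view"  # placeholder for a read-write bind mount
        finally:
            self._writer = False


tree = GuardedTree()
with tree.read(), tree.read():
    pass  # several concurrent readers are allowed

with tree.write():
    try:
        with tree.read():
            pass
    except RuntimeError as error:
        print(error)
```

Because both operations are contexts, the bookkeeping is undone in `finally` even when the body raises, which is what makes the policy enforceable in both directions.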
Christian Kellner	29397efcec	pipeline: implement get_buildtree like store.get Refactor `get_buildtree` to do input/output via `Object`, i.e. by creating a new `Object`, setting its base accordingly and then use its `read` and `write` methods. This is what `ObjectStore.get` does as well. In the case that there is no build pipeline, use the mount helpers of `objectstore` instead of the custom mount calls.	2020-02-28 16:11:49 +01:00
Christian Kellner	42a365d12f	osbuild: no auto commit of the last stage Do not automatically commit the last stage of the pipeline to the store. The last stage is most likely not what should be cached, because it will contain all the individual customization and thus be very likely different for different users. Instead, the dnf or rpm stages have a higher chance of being the same and thus are better candidates for caching. Technically this change is done via two big changes that build upon new features introduced in the previous commits, most notably the copy on write semantics of Object and that input/output is being done via `objectstore.Object` instead of plain paths. The first of the two big changes is to create one new `Object` at the beginning of `pipeline.run` and use that, in write mode via `Object.write` across invocations of `stage.run` calls, with checkpoints being created after each stage on demand. The very same `Object` is then used in read mode via `Object.read` as the input tree for the Assembler. After the assembler is done the resulting image/tree is manually committed to the store. The other big change is to remove the `ObjectStore.commit` call from the `ObjectStore.new` method and thus the automatic commit after the last stage is gone. NB: since the build tree is being retrieved in `get_buildtree` from the store, a checkpoint for the last stage of the build pipeline is forced for now. Future commits will do away with that forced commit as well. Change osbuildtest.TestCase to always create a checkpoint at the final tree (the last stage of the pipeline), since tests need it to check the tree contents.	2020-02-28 16:11:49 +01:00
Christian Kellner	7720b5508d	objectstore: refactor .get() to use Object Instead of using custom bind-mount based logic in ObjectStore.get, use a combination of Object + `Object.read` with the supplied base (that can be None), which will lead to exactly the same outcome.	2020-02-28 16:11:49 +01:00
Christian Kellner	be8aafbb90	objectstore: Object.read() for read only access Provide a way to read the current contents of the object, in a way that follows the copy-on-write semantics: If `base` is set but the object has not yet been written to, the `base` content will be exposed. If no base is set or the object has been written to, the current (temporary) tree will be exposed. In either way it is done via a bind mount so it is assured that the contents indeed can only be read from, but not written to. The code also currently makes sure that there is no write operation started as long as there is at least one reader. Additionally, also introduce checks that the object is intact, i.e. not cleaned up, for all operations that require such a state.	2020-02-28 16:11:49 +01:00
Christian Kellner	c73a28613b	objectstore: fix Object._open exception handling Move the call to os.open outside of the try block so that if an exception occurs it will be properly propagated to the callers.	2020-02-28 16:11:49 +01:00
Christian Kellner	007488488e	objectstore: extract mount code to small helpers Extract the mount code into small little helpers, intended to be reused from different places. Adapt ObjectStore to use those.	2020-02-28 16:11:49 +01:00
Christian Kellner	e610fa9659	objectstore: use private bind mounts for get() Use `--make-private` for the bind mount in `ObjectStore.get`.	2020-02-28 16:11:49 +01:00
Christian Kellner	9b61f50792	objectstore: make Object.open a private method Analogous to `_path`, it is not possible to identify the intended mode of the i/o operation from using `open` (whether it is a read or a write operation) and thus make it an internal method and only use it for read operations.	2020-02-28 16:11:49 +01:00
Christian Kellner	3258bb62d4	objectstore: make Object.path a private property Since it is hard to infer the intended modus of the i/o operation, i.e. whether it is going to be a read or a write from accessing the `path` property make it an internal method. Do not initialize the method on property access but return the writable tree, if Object is initialized, the path to its base tree otherwise. Adapt all the usage internally: Use `path` for read operations and initialize the object and then directly use `_tree` for write ops.	2020-02-28 16:11:49 +01:00
Christian Kellner	0ef5de3c94	objectstore: Object stores its base id, not path Instead of storing the base path, store the object id of its base and resolve it to the path via the ObjectStore whenever needed.	2020-02-28 16:11:49 +01:00
Christian Kellner	6a2a7d99f7	objectstore: unify commit and snapshot code paths As a result of the previous commits that implement copy on write semantics, `commit` can now be used to create snapshots. Whenever an Object is committed, its tree is moved to the store and it is being reset, i.e. a new clean workdir is created and the old one discarded. The moved tree is then set as the base of the reset Object. On the next call to `write` the moved tree will be copied over and forms the basis of the Object again. Should nobody want to write to Object after the snapshot, i.e. the `commit`, no copy will be made. NB: snapshots/commits will now act as synchronization points: if an object with the same treesum, i.e. the very same content already exists, the move (i.e. `store_tree`) will gracefully fail and the existing content will be set as the base for Object.	2020-02-28 16:11:49 +01:00
Christian Kellner	39213b7f44	objectstore: copy on write semantics for Object Since Object knows its base now, the initialization of the tree with the content of its base can be delayed until the moment someone wants to actually modify the tree, thus implementing copy on write semantics. For this a new `write` method is added that will initialize the base and return the writable tree. It should be used instead of `path` whenever a client wants to write to the tree of the Object. Adapt the pipeline and the tests to use the new `write` method in all the appropriate places. NB: since the intention can not be inferred when using `path` directly, the Object is still being initialized there.	2020-02-28 16:11:49 +01:00
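The delayed initialization described in the commit above could look roughly like this. It is a sketch under assumptions: `CowTree` and `read_path` are hypothetical names, the base is given as a plain path rather than an object id, and a plain `shutil.copytree` stands in for whatever copy mechanism the store actually uses.

```python
import os
import shutil


class CowTree:
    """Hypothetical object whose tree is copied from its base only on first write."""

    def __init__(self, workdir, base=None):
        self.base = base  # path of an existing tree, or None
        self._tree = os.path.join(workdir, "tree")
        # Without a base there is nothing to copy later, so start with an empty tree.
        self._initialized = base is None
        if self._initialized:
            os.makedirs(self._tree)

    def read_path(self):
        # Until something is written, readers are served straight from the base.
        return self._tree if self._initialized else self.base

    def write(self):
        # The copy happens here, on demand, and only once.
        if not self._initialized:
            shutil.copytree(self.base, self._tree, symlinks=True)
            self._initialized = True
        return self._tree
```

An object that is only ever read never pays for the copy, which is the point of deferring it to `write`.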
Christian Kellner	0874b80734	objectstore: Object knows its base When a new Object is created it can have a `base`, i.e. another object that is already committed to the store, which is then used to initialize the tree of the new object. That is, the contents of the new Object will be based on the contents of the existing. The initialization of an Object with its base (if any) was done by the ObjectStore. Move all of that logic inside `Object`: The Object will store its base, which `Object.init` will use to initialize itself. Additionally, if `Object.path` is accessed `init` is being called as well to make sure it is properly initialized, i.e. the tree initialized with the base content.	2020-02-28 16:11:49 +01:00
Christian Kellner	25b3807a5b	objectstore: snapshot takes Object not path Refactor the `ObjectStore.snapshot` method to take an `Object` not a plain filesystem tree, so the latter is more encapsulated from the ObjectStore user (e.g. the pipeline) and prepares a unified code-path for `snapshot` and `commit` in the future.	2020-02-28 16:11:49 +01:00
Christian Kellner	5deb1be514	objectstore: change Object.move to .store_tree Now that Object manages its work directory itself, re-create the latter when its tree is moved, i.e. when the object is being committed to the store. This means that after the object has been written to the store it is in the same state as if it were new and can be used in the very same way. If the move itself fails (the rename(2) fails), the tree and its contents are cleaned up with the reset of the work directory. Rename the `move` method to `store_tree` to better reflect how the method should be used, i.e. to store the tree corresponding to the Object instance.	2020-02-28 16:11:49 +01:00
Christian Kellner	6d14dee9a2	objectstore: object manages its work dir When a new object is being created, a corresponding temporary directory is created within the file system tree of the object store, which shall be called the "work dir". Within that dir a well-known directory ('tree') is created that then is the root of the filesystem tree[1] that clients use to store the tree or the resulting image in. Previously, the work dir was managed, i.e. created and cleaned up (via a context manager) by the ObjectStore. Now the Object itself manages the tree and thus the lifetime of the work dir is more directly integrated and controlled by it. As a result the Object itself is now a context manager. On exit of the context the work dir is cleaned up. [1] For the assembler this is the output directory that will contain the final image.	2020-02-28 16:11:49 +01:00
Christian Kellner	399606528c	objectstore: helper to create temp dirs inside the store Create a small helper method that creates a new temporary directory of type tempfile.TemporaryDirectory within the store and returns it.	2020-02-28 16:11:49 +01:00
Christian Kellner	d10537da42	objectstore: yield Object not path from .new() Instead of just returning the path of the temporary object that is created in .new() the actual instance of the new `Object` is being returned, which can then provide a richer interface for clients than a plain directory path.	2020-02-28 16:11:49 +01:00
Christian Kellner	52736169f1	objectstore: Object keeps reference to store Keep a reference to the parent store, which this object is tied to. It is currently not yet used directly but is a preparation for a closer Object and ObjectStore integration that will happen in commits to follow.	2020-02-28 16:11:49 +01:00
Christian Kellner	19f49e5dc3	objectstore: rename TreeObject to Object As the name implies, the ObjectStore stores objects, which can be trees but also everything an Assembler can make of the input tree, like qcow2 images, tarballs and other non tree-like outputs. Therefore rename the TreeObject to Object to better reflect that it is representing any object, not only trees, in the store.	2020-02-28 16:11:49 +01:00
Christian Kellner	3b7c87d563	objectstore: helper to resolve references to paths Introduce a small helper function to resolve object_id references into their paths inside the object store and use that throughout the store.	2020-02-28 16:11:49 +01:00
Lars Karlitski	a578a2b7e7	pipeline: detect host instead of using org.osbuild.host Detect the host dynamically from os-release(5) instead of relying on the `org.osbuild.host` symlink. It is awkward to install a symlink that tells osbuild which distro it is running on, when there is a standard way to detect this. This makes it easier to run osbuild from sources and removes the need to include every host in the spec file. The latter became hard to do, because there's no obvious way to distinguish RHEL minor releases.	2020-02-28 16:06:30 +01:00
Tom Gundersen	e48c2f178c	osbuild: allow the sources to be passed in on stdin Currently stdin is taken to be the pipeline to be built; this allows it to be instead a map containing the sources and the pipeline. We would imagine passing around the sources and pipeline together, so this just makes the behavior of osbuild more closely match the intended use and semantics of the sources configuration. This keeps backwards compatibility for now, but that may be dropped as soon as osbuild-composer no longer relies on the old behavior. Disable too-many-{branches,statements} pylint warnings in __main__.py. These do not seem helpful, but could be reenabled if we drop some options in the future. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-19 15:59:11 +01:00
Tom Gundersen	481213a8dd	pipeline: pin the sources options in the pipeline object Make the sources options a static property of the pipeline, in particular of each stage, rather than being passed in on `run()`. This more closely matches the intended semantics of sources and pipeline having similar lifetimes and being fairly coupled together. The difference between the pipeline and the sources is that the sources do not contribute to identifying the pipeline (they are not part of the hash for the pipeline id), and they could be swapped out without changing the output image (as long as they are valid). However, a pipeline without a sources object would not be useful, and typically the pipeline and the sources are generated, passed around and used together. This is different from the build environment and the secrets object, which both are specific to either the host or the caller, unlike the pipeline which should be universal. This changes the `load()` function to take a `manifest`, which is a map containing both the pipeline and the sources. Note that the semantics of the build-env parameter remain unchanged: It shares the sources with the rest of the pipeline. We may want to reconsider this in future commits, as the build-env is specific to the host, whereas the regular pipeline is not. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-19 15:59:11 +01:00
Tom Gundersen	16dfd7eec1	remoteloop: drop O_DIRECT Apart from giving us a hard time on s390x, this feature did not seem to have a measurable effect. Moreover, O_DIRECT is not supported by tmpfs so without this patch we could not use tmpfs as backing store, which does speed up image generation considerably. Drop the flag and rather put the store on tmpfs in order to speed things up.	2020-02-06 19:01:12 +01:00
Tom Gundersen	7817ae5e8b	sources: add org.osbuild.files source This source adds support for downloaded files. The files are indexed by their content hash, and the only option is their URL. The main use case for this will be downloading rpms, allowing depsolving to be done outside of osbuild, network access to be restricted and downloaded rpms to be reused between runs. Each source is now passed two additional arguments, a cache directory and an output directory. Both are in the source's namespace, and the source is responsible for managing them. Each directory may contain contents from previous runs, but neither is ever guaranteed to do so. Downloaded contents may be saved to the cache and reused between runs, and the requested content should be written to the output dir. If secrets are used, the source must only ever write contents to the output that corresponds to the available secrets (rather than contents from the cache from previous runs). Each stage is passed an additional argument, a sources directory. The directory is read-only, and contains a subdirectory named after each used source, which will contain the requested contents when the `Get()` call returns (if the source uses this functionality). Based on a patch by Lars Karlitski. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-06 19:01:12 +01:00
Tom Gundersen	794ec97bf3	api: add barriers Ensure that the api sockets are created before entering the with clause. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-02-06 19:01:12 +01:00
Christian Kellner	5a61d8c869	objectstore: extract method to open a TreeObject Extract the opening of a TreeObject out of the treesum property so that the latter is easier to read.	2020-02-06 16:10:35 +01:00
Christian Kellner	6f4d286ff4	osbuild: support for checkpoints during build Add a new `--checkpoint` option, which can be provided multiple times, that indicates after which stages the current state of the tree should be committed to the object store; the tree id will be the treesum of the tree at that point and a reference is created with the id of the stage at the point. The argument to `--checkpoint` is the id of the stage. If not all the given checkpoints can be found the execution will be aborted.	2020-02-06 16:10:35 +01:00
Christian Kellner	ce5719a03f	objectstore: move tree-moving code into the tree The code to move a TreeObject more naturally belongs to the TreeObject itself and makes the ObjectStore.commit() method even easier to read.	2020-02-06 16:10:35 +01:00
Christian Kellner	b5b5e7be29	objectstore: also ignore EEXIST when committing When the tree is committed to the objects directory of the object store, it is done via rename(3). The two possible errors that can be raised in case a non-empty tree with the same name already exists are [EEXIST] or [ENOTEMPTY]. The latter was already ignored but the former was not. At least on btrfs the former will be raised: File "/home/gicmo/Code/src/osbuild/osbuild/objectstore.py", os.rename(tree.root, output_tree) FileExistsError: [Errno 17] File exists: 'store/tmpyyi3yvie/tree' -> 'store/objects/…'	2020-02-06 16:10:35 +01:00
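The "abort if a checkpoint cannot be found" rule from the commit above can be illustrated with a small validation step. This is only a sketch: `resolve_checkpoints` is a hypothetical function, and stage ids are treated as plain strings.

```python
def resolve_checkpoints(stage_ids, requested):
    """Return the requested checkpoint ids in pipeline order.

    Hypothetical helper: raises ValueError if any requested id is not
    a stage of the pipeline, mirroring the "abort" behavior described above.
    """
    missing = [checkpoint for checkpoint in requested if checkpoint not in stage_ids]
    if missing:
        raise ValueError(f"checkpoints not found in pipeline: {', '.join(missing)}")
    wanted = set(requested)
    return [stage for stage in stage_ids if stage in wanted]


# Ids given in any order come back in the order the stages run.
print(resolve_checkpoints(["a", "b", "c"], ["c", "a"]))
```

Validating all ids before the first stage runs means a mistyped id is reported immediately instead of after a long build.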
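The error handling described in the commit above amounts to treating both errno values as "the content is already there". A minimal sketch, assuming a hypothetical `store_tree` function that works on plain paths:

```python
import errno
import os


def store_tree(source, destination):
    """Move source to destination via rename.

    Returns True if the tree was moved, False if a non-empty tree already
    exists at the destination. Hypothetical helper for illustration.
    """
    try:
        os.rename(source, destination)
    except OSError as error:
        # Depending on the filesystem, an existing non-empty directory is
        # reported as either EEXIST or ENOTEMPTY; both mean "already stored".
        if error.errno not in (errno.EEXIST, errno.ENOTEMPTY):
            raise
        return False
    return True
```

Any other error, such as a missing source or a permission problem, is still propagated to the caller.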
Christian Kellner	3a40d31bee	objectstore: introduce tree snapshot support Add a new method to the ObjectStore that takes a path to a file system tree, which is currently being built, and commits it to the store and references it via a given object_id. The tree is copied to a temporary location (co-located in the store to enable fast copying via reflinks) and then atomically moved into the ObjectStore's objects path via rename(3).	2020-02-06 16:10:35 +01:00
Christian Kellner	db8618f192	objectstore: extract logic to commit a tree Extract the code from ObjectStore.new that will commit the filled tree to the store into its own method so it can be used from a future method to snapshot trees at random points in time.	2020-02-06 16:10:35 +01:00
Christian Kellner	4831927e84	objectstore: introduce TreeObject Introduce a small `TreeObject` class that is the representation of a tree during its construction. It supports calculating its treesum as well as initializing the new tree with an existing one.	2020-02-06 16:10:35 +01:00
Tom Gundersen	ee86b57392	pipeline: back var by the store This makes sure all disk access is backed by the same disk. We may want this for performance reasons (avoiding moving across disks), but also to experiment with different backing stores for all disk access. Signed-off-by: Tom Gundersen <teg@jklm.no>	2020-01-27 15:51:47 +01:00
Tom Gundersen	2837604bf8	buildroot: allow customizing the backing store for /var Currently /var was always backed by /var/tmp, but we may want to control exactly what it is backed by. The default is the same, so this is not a behavioral change.	2020-01-27 15:51:47 +01:00
Christian Kellner	cf9c9946e0	pipeline: bind mount the osbuild module for the stages The dnf stage wants to import `osbuild.sources` but currently the osbuild module is not available in the stages. Apply the same hack done in the Assembler also for the stages, i.e. bind mount the osbuild module to the stages/osbuild.	2020-01-23 00:49:11 +01:00
Lars Karlitski	7bb06d2334	loop: handle set_status returning EBUSY This happens rarely when the same loop device is used in rapid succession. The kernel flushes the page cache asynchronously, which means that it might not be cleared yet when a new file is bound. `set_status` checks if the cache is clear (`set_fd` doesn't). Handle this by trying a different device when `set_status` returns `EBUSY`. Fixes #177	2020-01-19 22:19:25 +01:00
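The retry strategy from the commit above, moving on to a different device when one reports `EBUSY`, can be sketched generically. Nothing here uses the real loop device API: `bind_first_free` is a hypothetical helper and `bind` is any callable supplied by the caller that raises `OSError` on failure.

```python
import errno


def bind_first_free(candidates, bind):
    """Call bind() on each candidate in turn and return the first that succeeds.

    Candidates that fail with EBUSY are skipped; any other error is raised.
    Hypothetical helper; `bind` stands in for the real device setup.
    """
    for candidate in candidates:
        try:
            bind(candidate)
        except OSError as error:
            if error.errno != errno.EBUSY:
                raise
            continue  # this device is still busy, try the next one
        return candidate
    raise OSError(errno.EBUSY, "all candidate devices were busy")
```

Only `EBUSY` is treated as retryable, so unrelated failures are not hidden by the retry loop.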
Lars Karlitski	b487126bb8	loop: explicitly close fds to loop devices Don't wait until python's garbage collector closes the file descriptors to loop devices. Close them when the `LoopServer` context manager exits, after an assembler has finished running.	2020-01-19 22:19:25 +01:00
Lars Karlitski	47dc1b5b92	loop: don't leak open fd to /dev Close the file descriptor to `/dev` when we opened it.	2020-01-19 22:19:25 +01:00
Lars Karlitski	977f0a465b	loop: fix typo in LoopInfo member	2020-01-19 22:19:25 +01:00
Christian Kellner	64addbe2d2	buildroot: allow creating device nodes on s390x The z Initial Program Loader (zipl) when creating the bootmap in bootmap_create (src/zipl/bootmap.c) wants to create a device node via misc_temp_dev (bootmap_create:1141) for the device that it is installing the bootloader to[1]. Currently access to loopback devices is allowed from within the container (it is used to mount the image), but only read/write access. On s390x also allow the creation of device nodes, so zipl can do its work and install the bootloader stages on the "disk". [1] zipl source at commit dcce14923c3e9615df53773d1d8a3a22cbb23b96	2020-01-13 20:05:10 +01:00
Christian Kellner	bf41326ac6	remoteloop: don't use O_DIRECT on s390x Using O_DIRECT to open the image partition and then using that fd for the backing of the loopback device will break the mounting of the formatted partition, i.e. mount will fail with: mount: /tmp/looptest-6qrtkp5e/mountpoint-root: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. Reproducible with the following small-ish python script, executed via 'env PYTHONPATH=$(pwd) python3 looptest.py':
---- 8< ---- 8< ---- [ looptest.py ] ---- 8< ---- 8< ----
import contextlib
import json
import os
import subprocess
import stat
import tempfile

from osbuild import loop


@contextlib.contextmanager
def mount(source, dest):
    subprocess.run(["mount", source, dest], check=True)
    try:
        yield dest
    finally:
        subprocess.run(["umount", "-R", dest], check=True)


@contextlib.contextmanager
def os_open(path, flags):
    fd = os.open(path, flags)
    try:
        yield fd
    finally:
        os.close(fd)


def main():
    size = 512 * 1024 * 1024
    ptuuid = "0x14fc63d2"

    with contextlib.ExitStack() as cm:
        tmpdir = cm.enter_context(tempfile.TemporaryDirectory(prefix="looptest-"))
        print(f"Temporary directory at {tmpdir}")

        devdir = os.path.join(tmpdir, "dev")
        os.makedirs(devdir, exist_ok=True)
        dir_fd = cm.enter_context(os_open(devdir, os.O_DIRECTORY))

        image = os.path.join(tmpdir, "image")
        subprocess.run(["truncate", "--size", str(size), image], check=True)

        table = f"label: mbr\nlabel-id: {ptuuid}\nbootable, type=83"
        subprocess.run(["sfdisk", image], input=table, encoding='utf-8', check=True)

        # read it back
        r = subprocess.run(["sfdisk", "--json", image], stdout=subprocess.PIPE, encoding='utf-8', check=True)
        table = json.loads(r.stdout)["partitiontable"]
        partitions = table["partitions"]
        start = partitions[0]["start"] * 512
        size = partitions[0]["size"] * 512

        # fails here with os.O_DIRECT
        image_fd = cm.enter_context(os_open(image, os.O_RDWR | os.O_DIRECT))

        control = loop.LoopControl()
        minor = control.get_unbound()
        lo = loop.Loop(minor)
        lo.set_fd(image_fd)
        lo.set_status(offset=start, sizelimit=size, autoclear=True)
        lo.mknod(dir_fd)

        loopdev = f"/dev/loop{minor}"
        # loopdev = os.path.join(devdir, lo.devname)
        # os.chmod(loopdev, os.stat(loopdev).st_mode | stat.S_IRGRP)

        subprocess.run(["ls", "-la", f"{devdir}"], check=True)
        subprocess.run(["mkfs.ext4", loopdev], input="y", encoding='utf-8', check=True)
        subprocess.run(["blkid", loopdev], check=True)

        mountpoint = os.path.join(tmpdir, "mountpoint-root")
        os.makedirs(mountpoint, exist_ok=True)
        cm.enter_context(mount(loopdev, mountpoint))

        subprocess.run(["ls", "-la", tmpdir], check=True)
        subprocess.run(["ls", "-la", mountpoint], check=True)
        subprocess.run(["mount"], check=True)


if __name__ == '__main__':
    main()
	2020-01-13 20:05:10 +01:00
Lars Karlitski	12b5c6aaa4	sources: bump maximum message size to 64k These messages contain certificate data, which is quite large. We should probably use streaming sockets in the future.	2020-01-09 23:55:43 +01:00
Lars Karlitski	e123715bc6	osbuild: introduce secrets Add a new command line option `--secrets`, which accepts a JSON file that is structured similarly to a source file. It should contain data that is necessary to fetch content, but shouldn't appear in any logs.	2020-01-09 23:55:43 +01:00
Lars Karlitski	02ad4e3810	sources: fail gracefully when a source returns invalid JSON Include the actual output of the source to help debugging.	2020-01-09 23:55:43 +01:00


310 commits