When developing or frequently rebuilding manifests, it is common to want
to checkpoint everything to the store. It seems we all have small shell
scripts hanging around for this.
Let `--checkpoint` take a shell-like glob such as `--checkpoint="*"` to
checkpoint everything.
Note that there's a behavioral change here; previously `osbuild
--checkpoint=a` would error if that specific checkpoint wasn't found.
Now `osbuild` will only error if nothing was selected by the passed
globs.
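The selection rule described above can be sketched roughly as follows. This is only an illustration using `fnmatch` from the standard library; `select_checkpoints` and the example names are hypothetical and not the actual osbuild code.

```python
import fnmatch


def select_checkpoints(available, patterns):
    """Return the names from `available` that match any glob in `patterns`.

    Hypothetical helper: an error is raised only if nothing at all was
    selected, not when a single pattern fails to match.
    """
    selected = [
        name for name in available
        if any(fnmatch.fnmatchcase(name, pat) for pat in patterns)
    ]
    if not selected:
        raise ValueError(f"no checkpoint matched: {patterns}")
    return selected


names = ["build", "os", "image"]
print(select_checkpoints(names, ["*"]))
print(select_checkpoints(names, ["os", "missing"]))
```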
`tox` is a standard testing tool for Python projects. It allows you to
test locally against all of your installed Python versions. To run the
tests in parallel for all supported Python versions:
`tox -m test -p all`
To run linters or type analysis:
```
tox -m lint -p all
tox -m type -p all
```
This commit *also* disables the `import-error` warning from `pylint`,
because not all Python versions have the system-installed Python
libraries available and they can't be fetched from PyPI.
Some linters have been added and the general order linters run in has
been changed. This allows for quicker test failure when running
`tox -m lint`. As a consequence the `test_pylint` test has been removed
as its role can now be fulfilled by `tox`.
Other assorted linter fixes due to newer versions:
- use a str.join method (`consider-using-join`)
- fix various (newer) mypy and pylint issues
- comments starting with `#` and no space due to `autopep8`
This also changes our CI to use the new `tox` setup and on top of that
pins the versions of linters used. This might move into separate
requirements.txt files later on to allow for easier updating of those
dependencies.
Prior to this commit, the arguments for the input service were passed inline.
However, jsoncomm uses the SOCK_SEQPACKET socket type underneath that has
a fixed maximum packet size. On my system, it's 212960 bytes. Unfortunately,
that's not enough for big inputs (e.g. when building packages with a lot
of rpms).
This commit moves all arguments to a temporary file. Then, just a file
descriptor is sent. Thus, we are now able to send arbitrarily sized args
for inputs, making osbuild work even for large image builds.
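A minimal sketch of the idea, assuming Python 3.9+ on Linux for `socket.send_fds()` and `socket.recv_fds()`. The helper names `send_args` and `recv_args` are hypothetical and this is not the jsoncomm implementation; it only shows that the packet stays tiny while the payload can be larger than the packet limit.

```python
import json
import os
import socket
import tempfile


def send_args(sock, args):
    """Write `args` as JSON into an anonymous temp file and send only its fd."""
    with tempfile.TemporaryFile() as f:
        f.write(json.dumps(args).encode("utf-8"))
        f.flush()
        socket.send_fds(sock, [b"args"], [f.fileno()])


def recv_args(sock):
    """Receive the fd and read the complete JSON payload from the file."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    with os.fdopen(fds[0], "rb") as f:
        # The file offset is shared with the sender, so rewind first.
        f.seek(0)
        return json.loads(f.read().decode("utf-8"))


a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
payload = {"packages": [f"pkg-{i}" for i in range(100000)]}
send_args(a, payload)
received = recv_args(b)
a.close()
b.close()
```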
LOOP_CONFIGURE allows atomically configuring the device when opening
it. This avoids a race condition where, between set_fd and set_status,
some operations are already accepted by the loopback device.
See https://lwn.net/Articles/820408/
This feature was included in Linux kernel 5.8. However, it is safe to
not include any kind of fallback to the previous method, as @obudai
points out that:
LOOP_CONFIGURE was backported into RHEL 8 kernel in RHEL 8.4 as a part
of https://bugzilla.redhat.com/show_bug.cgi?id=1881760 (block layer:
update to upstream v5.8).
Since RHEL 8.4 is currently the oldest supported release that we support
running osbuild on, it might be just fine implementing this without the
fallback.
From a centos stream 8 container:
kernel-4.18.0-448.el8.x86_64
- loop: Fix missing discard support when using LOOP_CONFIGURE (Ming Lei) [1997338]
- [block] loop: Set correct device size when using LOOP_CONFIGURE (Ming Lei) [1881760]
- [block] loop: unset GENHD_FL_NO_PART_SCAN on LOOP_CONFIGURE (Ming Lei) [1881760]
- [block] loop: Add LOOP_CONFIGURE ioctl (Ming Lei) [1881760]
Fix the following errors:
```
osbuild/util/lvm2.py:117: error: Only instance methods can be decorated with @property
osbuild/api.py:50: error: Only instance methods can be decorated with @property
osbuild/sources.py:85: error: Only instance methods can be decorated with @property
```
Chaining of `@classmethod` and `@property` has been deprecated since
Python 3.11 with a note that chaining didn't work correctly in some
cases.
Relevant links:
https://github.com/python/mypy/issues/13746
https://docs.python.org/3.11/whatsnew/3.11.html#language-builtins
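For illustration, one possible way to avoid the chained decorators is to fall back to a plain `@classmethod`. The class and method names below are hypothetical, and this is not necessarily the change made in the files listed above.

```python
class Service:
    # Before (rejected by newer mypy, deprecated since Python 3.11):
    #
    #     @classmethod
    #     @property
    #     def endpoint(cls): ...

    @classmethod
    def endpoint(cls) -> str:
        """Plain classmethod; callers write `Service.endpoint()`."""
        return cls.__name__.lower()


class Remote(Service):
    pass


print(Service.endpoint(), Remote.endpoint())
```

The cost of this variant is that call-sites need parentheses; a plain class attribute is another option when the value does not need to be computed.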
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Offer the possibility to use the norecovery option for journaling file
systems (XFS, ext4, Btrfs). This avoids kernel panics if the kernel
attempts to recover the filesystem when it's mounted as readonly.
Previously, we could only ask OSBuild to mount a device as readonly. But
devices can have more mount options than this. Supporting more options
is necessary for the new version of image-info that is using OSBuild's
internals in order to mount the image it wants to work on. Otherwise,
for instance, some umasks aren't applied properly and we can get
differences in rpm-verify results, thus corrupting the DB.
Mount is now accepting:
* readonly
* uid
* gid
* umask
* shortname
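As a rough sketch, such options could be translated into arguments for `mount -o` like this. The function name, the dict layout, and the mapping of `readonly` to `ro` are assumptions for illustration, not the actual OSBuild code.

```python
def build_mount_options(options):
    """Translate an options dict into a list suitable for `mount -o`.

    Hypothetical helper: only the keys listed above are accepted.
    """
    allowed = ("readonly", "uid", "gid", "umask", "shortname")
    unknown = set(options) - set(allowed)
    if unknown:
        raise ValueError(f"unsupported mount options: {sorted(unknown)}")

    result = []
    if options.get("readonly"):
        result.append("ro")
    for key in ("uid", "gid", "umask", "shortname"):
        if key in options:
            result.append(f"{key}={options[key]}")
    return result


opts = build_mount_options({"readonly": True, "umask": "077", "shortname": "winnt"})
print(",".join(opts))
```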
The cachedir-tag specification defines how to mark directories as
cache-directories. This allows tools like `tar` to ignore those
directories if desired (e.g., see `tar --ignore-caches`). This is very
useful to avoid huge cache-directories in backups and remote
synchronizations.
The spec simply defines a file called `CACHEDIR.TAG` with the first 43
bytes to be: "Signature: 8a477f597d28d172789f06886806bc55" (which
happens to be the MD5-checksum of ".IsCacheDirectory"). Further content
is to be ignored. Any such file marks the directory in question as a
cache-directory.
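A minimal sketch of writing and detecting such a tag, assuming nothing beyond the standard library. The helper names are hypothetical; only the file name and the 43-byte signature come from the specification.

```python
import os

# The first 43 bytes required by the cachedir-tag specification.
CACHEDIR_SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def mark_cache_dir(path):
    """Create `CACHEDIR.TAG` in `path`; content after the signature is free-form."""
    with open(os.path.join(path, "CACHEDIR.TAG"), "wb") as f:
        f.write(CACHEDIR_SIGNATURE + b"\n# This directory is a cache.\n")


def is_cache_dir(path):
    """Return True if `path` contains a tag starting with the signature."""
    try:
        with open(os.path.join(path, "CACHEDIR.TAG"), "rb") as f:
            return f.read(len(CACHEDIR_SIGNATURE)) == CACHEDIR_SIGNATURE
    except FileNotFoundError:
        return False
```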
The cachedir-tag has been successfully deployed in tools like `cargo`
and `VLC`, and is currently discussed to be implemented in Firefox. More
information is available here: https://bford.info/cachedir/
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Add trace-hooks to the FsCache._atomic_open() helper, including a
primitive trace-infrastructure. They allow interrupting cache operation
and running arbitrary code.
The trace-hooks will be used by the test-suite to trigger the races we
want to protect against. During runtime, the traces should not be used
and thus will always be `None`.
This is a very primitive way to hook into the runtime execution and test
the atomicity of the operations. However, it is simple enough for our
tests and avoids pulling in huge tracing suites.
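To illustrate the pattern, here is a toy sketch with a hypothetical `TinyCache` class and hypothetical trace-point names. It is not the FsCache code, only the general shape of a hook that stays `None` at runtime and lets a test run code between two steps.

```python
class TinyCache:
    """Hypothetical stand-in for a cache with trace points."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self._tracer = None  # stays None outside of tests

    def _trace(self, point):
        # Call the hook only if a test installed one.
        if self._tracer is not None:
            self._tracer(point)

    def lookup(self, key):
        self._trace("lookup:before-check")
        present = key in self._entries
        self._trace("lookup:after-check")
        # A racing writer may have removed the entry after the check,
        # so use get() instead of indexing.
        return self._entries.get(key) if present else None


cache = TinyCache({"a": 1})
seen = []


def racing_tracer(point):
    """Test hook: record the trace point and remove the entry mid-lookup."""
    seen.append(point)
    if point == "lookup:after-check":
        cache._entries.pop("a", None)


first = cache.lookup("a")   # no tracer installed
cache._tracer = racing_tracer
second = cache.lookup("a")  # entry vanishes between check and read
```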
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
On NFS, we need to be careful with cached metadata. To make sure our
_atomic_open() can correctly catch races during open+lock, we must be
careful to catch `ESTALE` and `ENOENT` from `stat()` calls. Otherwise,
the lock-acquisition guarantees that data is coherent, even on NFS.
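The error handling could look roughly like the following sketch. The helper name `stat_or_none` is hypothetical; the real `_atomic_open()` does more than this (open, lock, and compare).

```python
import errno
import os


def stat_or_none(path):
    """Return `os.stat(path)`, or None if the entry vanished or went stale.

    On NFS, cached metadata can make `stat()` fail with ESTALE; both
    ESTALE and ENOENT are treated as "lost the race, try again".
    """
    try:
        return os.stat(path)
    except OSError as e:
        if e.errno in (errno.ENOENT, errno.ESTALE):
            return None
        raise
```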
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
We used to commit cache-entries with a rename+RENAME_NOREPLACE. This,
however, is not available on NFS. Change the code to use `os.rename()`
and rely on the _documented_ kernel behavior that non-empty target
directories cannot be replaced.
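A small sketch of that behavior on Linux. The helper name `commit_dir` is hypothetical; it relies only on `os.rename()` refusing to replace a non-empty directory.

```python
import errno
import os


def commit_dir(src, dst):
    """Move directory `src` to `dst`.

    Returns False if `dst` already exists as a non-empty directory, in
    which case `src` is left untouched.
    """
    try:
        os.rename(src, dst)
    except OSError as e:
        # rename(2) reports ENOTEMPTY or EEXIST for non-empty targets.
        if e.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return False
        raise
    return True
```

Note that an empty target directory would still be replaced, so this only protects entries that always contain at least one file.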
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
The `RENAME_NOREPLACE` option is not available on NFS. Avoid using it
in _atomic_file() to allow NFS backed storage.
If the caller allows replacing the destination entry, we simply use the
original `os.rename()` system call. This will unconditionally replace
the destination on all file-systems.
If the caller requests `no-replace`, we cannot use `os.rename()`.
Instead, we use `os.link()` to create a new hard-link on the
destination. This will always fail if the destination already exists.
We then rely on the cleanup-path to unlink the original temporary
entry.
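The two paths can be sketched like this. The function `commit_file` is a hypothetical simplification; in particular, the `finally` block stands in for the cleanup-path mentioned above.

```python
import os


def commit_file(tmp_path, dst_path, replace):
    """Commit `tmp_path` to `dst_path`, optionally refusing to replace."""
    if replace:
        # Unconditionally replaces the destination.
        os.rename(tmp_path, dst_path)
        return
    try:
        # Fails with FileExistsError if the destination already exists.
        os.link(tmp_path, dst_path)
    finally:
        # Cleanup: drop the temporary name whether or not the link succeeded.
        os.unlink(tmp_path)
```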
This will require adjustments in future maintenance tasks on the cache,
since they need to be aware that entries can be hardlinked temporarily.
However, we already consider `uuid-*` entries in the object-store to be
temporary and unaccounted for similar reasons, so this doesn't even
break our cache-maintenance ideas.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Add a helper that copies an entire directory tree including all metadata
into the cache. Use it in the ObjectStore to commit entries.
Unlike FsCache.store() this does not require entering the context from
the call-site. Instead, all data is directly passed to the cache and the
operation is under full control of the cache.
The ObjectStore is adjusted to make use of this. This requires exposing
the root-path (rather than the tree-path) to be accessible for
individual objects, hence a `path`-@property is added alongside the
`tree`-@property. Note that `__fspath__` still refers to the tree-path,
since this is the only path really required for outside access other
than from the object-manager itself.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
The default value for `get()` is `None`, so no reason to specify it
explicitly. Simplify the respective calls in FsCache.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
This will lead to all mtimes that are newer than the creation time
of `tree` being clamped to `source_epoch`, if that was specified
for the pipeline. Specifically it means that all files that were
created during the build will be clamped to it. This should make
builds more reproducible.
When we commit objects to the store and there is a `source_epoch`
set on the `Object`, clamp the mtime. This is needed because it
is possible that the object corresponds to the last stage of a
pipeline[1] and it could later directly be exported without going
through `finalize` again. Also, we are doing it on the object itself
and not the cloned path so that resuming and checkpointing will
behave identically.
[1] not even necessarily the pipeline we are currently building.
Add a new `source_epoch` attribute that if set, will lead to all
mtimes that are newer or equal to the creation date being clamped
to the specified `source_epoch` time when the object is finalized.
When a new Object is created, save the creation time in a new
metadata entry called `info`. A new property called `created`
is added to inspect the creation date.
New utility function to clamp all mtimes of a given path to a
certain timestamp. Clamp here means that any timestamp later
than the specified upper bound will be set to the upper bound.
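Such a utility might look roughly like the sketch below. The name `clamp_mtime` and its signature are assumptions here, the timestamp is taken as whole seconds, and `follow_symlinks=False` for `os.utime()` is assumed to be supported (as on Linux).

```python
import os


def clamp_mtime(root, upper_bound):
    """Set every mtime below `root` that is later than `upper_bound` to it.

    `upper_bound` is a timestamp in whole seconds. Access times are kept.
    """
    bound_ns = int(upper_bound) * 1_000_000_000

    def clamp(path):
        st = os.stat(path, follow_symlinks=False)
        if st.st_mtime_ns > bound_ns:
            os.utime(path, ns=(st.st_atime_ns, bound_ns), follow_symlinks=False)

    # os.walk() does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            clamp(os.path.join(dirpath, name))
    clamp(root)
```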
Add a new field to the cache-information called `version`, which is a
simple integer that is incremented on any backward-incompatible change.
The cache-implementation is modified to avoid any access to the cache
except for `<cache>/staging/`. This means, changes to the staging area
must be backwards compatible at all cost. Furthermore, it means we can
always successfully run osbuild even on possibly incompatible caches,
because we can always just ignore the cache and fully rely on the
staging area being accessible.
The `load()` method will always return cache-misses. The `store()`
method simply discards the entry instead of storing it. Note that
`store()` needs to provide a context to the caller, hence this
implementation simply creates another staging-context to provide to the
caller and then discard. This is non-optimal, but keeps the API simple
and avoids raising an exception to the caller (but this can be changed
if it turns out to be problematic or unwanted).
Lastly, the `cache.info` field behaves as usual, since this is also the
field used to read the cache-version. However, this file is never
written, which improves resiliency and allows blacklisting buggy
versions from the past.
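The fallback behavior can be illustrated with a toy sketch. `SUPPORTED_VERSION`, `is_compatible` and `SketchCache` are hypothetical names, the JSON encoding of the cache-information is an assumption, and the staging area is left out entirely.

```python
import json

SUPPORTED_VERSION = 1  # hypothetical value for this sketch


def is_compatible(info_text):
    """Return True only if the cache-information carries the known version."""
    try:
        info = json.loads(info_text)
    except ValueError:
        return False
    return isinstance(info, dict) and info.get("version") == SUPPORTED_VERSION


class SketchCache:
    def __init__(self, info_text):
        self._compatible = is_compatible(info_text)
        self._entries = {}

    def load(self, key):
        # On an incompatible cache every lookup is reported as a miss.
        if not self._compatible:
            return None
        return self._entries.get(key)

    def store(self, key, value):
        # On an incompatible cache the entry is silently discarded.
        if self._compatible:
            self._entries[key] = value
```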
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Integrate the recently added file system cache `FsCache` into our
object store `ObjectStore`. NB: This changes the semantics of it:
previously a call to `ObjectStore.commit` resulted in the object
being in the cache (i/o errors aside). But `FsCache.store`, which
is now the backing store for objects, will only commit objects if
there is enough space left. Thus we cannot rely on objects being
present for reading after a call to `FsCache.store`. To cope with
this we now always copy the object into the cache, even for cases
where we previously moved it: for the case where commit is called
with `object_id` matching `Object.id`, which is the case for when
`commit` is called for last stage in the pipeline. We could keep
this optimization but then we would have to special case it and
not call `commit` for these cases but only after we exported all
objects; or in other words, after we are sure we will never read
from any committed object again. The extra complexity seems not
worth it for the little gain of the optimization.
Convert all the tests for the new semantics and also remove a lot
of them that make no sense under this new paradigm.
Add a new command line option `--cache-max-size` which will set
the maximum size of the cache, if specified.
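A minimal `argparse` sketch of such an option. It assumes the size is given as a plain integer number of bytes, which may differ from the value format osbuild actually accepts; `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser():
    """Build a parser with an optional cache size limit (assumed: bytes)."""
    parser = argparse.ArgumentParser(prog="osbuild")
    parser.add_argument(
        "--cache-max-size",
        metavar="SIZE",
        type=int,
        default=None,
        help="maximum size of the cache in bytes (unset: keep current limit)",
    )
    return parser


args = build_parser().parse_args(["--cache-max-size", "1073741824"])
if args.cache_max_size is not None:
    print(f"cache limited to {args.cache_max_size} bytes")
```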