Commit graph

11 commits

Author SHA1 Message Date
Brian C. Lane
44c28c8c16 autopep8: Update with changes to make autopep8 -a -a -a happy 2023-08-10 13:04:14 +02:00
David Rheinsberg
18c69d2620 util/fscache: add cachedir-tag support
The cachedir-tag specification defines how to mark directories as
cache-directories. This allows tools like `tar` to ignore those
directories if desired (e.g., see `tar --ignore-caches`). This is very
useful to avoid huge cache-directories in backups and remote
synchronizations.

The spec simply defines a file called `CACHEDIR.TAG` with the first 43
bytes to be: "Signature: 8a477f597d28d172789f06886806bc55" (which
happens to be the MD5-checksum of ".IsCacheDirectory". Further content
is to be ignored. Any such files marks the directory in question as a
cache-directory.

The cachedir-tag has been successfully deployed in tools like `cargo`
and `VLC`, and is currently discussed to be implemented in Firefox. More
information is available here: https://bford.info/cachedir/

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:43 +01:00
David Rheinsberg
51d0f60843 util/fscache: add trace hooks
Add trace-hooks to the FsCache._atomic_open() helper, including a
primitive trace-infrastructure. They allow interrupting cache operation
and running arbitrary code.

The trace-hooks will be used by the test-suite to trigger the races we
want to protect against. During runtime, the traces should not be used
and thus will always be `None`.

This is a very primitive way to hook into the runtime execution and test
the atomicity of the operations. However, it is simple enough for our
tests and avoids pulling in huge tracing suites.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
290efe50fe util/fscache: make _atomic_open() NFS compatible
On NFS, we need to be careful with cached metadata. To make sure our
_atomic_open() can correctly catch races during open+lock, we must be
careful to catch `ESTALE` and `ENOENT` from `stat()` calls. Otherwise,
the lock-acquisition guarantees that data is coherent, even on NFS.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
144e0126a3 util/fscache: drop unused _libc
We no longer use the direct libc accessor, so drop it.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
2c18a54e4d util/fscache: avoid RENAME_NOREPLACE on commit
We used to commit cache-entries with a rename+RENAME_NOREPLACE. This,
however, is not available on NFS. Change the code to use `os.rename()`
and rely on the _documented_ kernel behavior that non-empty target
directories cannot be replaced.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
e6b77ac7df util/fscache: avoid RENAME_NOREPLACE in _atomic_file()
The `RENAME_NOREPLACE` option is not available on NFS. Avoid using it
in _atomic_file() to allow NFS backed storage.

If the caller allows replacing the destination entry, we simply use the
original `os.rename()` system call. This will unconditionally replace
the destination on all file-systems.

If the caller requests `no-replace`, we cannot use `os.rename()`.
Instead, we use `os.link()` to create a new hard-link on the
destination. This will always fail if the destination already exists.
We then rely on the cleanup-path to unlink the original temporary
entry.

This will require adjustments in future maintenance tasks on the cache,
since they need to be aware that entries can be hardlinked temporarily.
However, we already consider `uuid-*` entries in the object-store to be
temporary and unaccounted for similar reasons, so this doesn't even
break our cache-maintenance ideas.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
8a9efa89fc util/fscache: provide store_tree() helper
Add a helper that copies an entire directory tree including all metadata
into the cache. Use it in the ObjectStore to commit entries.

Unlike FsCache.store() this does not require entering the context from
the call-site. Instead, all data is directly passed to the cache and the
operation is under full control of the cache.

The ObjectStore is adjusted to make use of this. This requires exposing
the root-path (rather than the tree-path) to be accessible for
individual objects, hence a `path`-@property is added alongside the
`tree`-@property. Note that `__fspath__` still refers to the tree-path,
since this is the only path really required for outside access other
than from the object-manager itself.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
50f8f6ac47 util/fscache: simplify get(*, None)
The default value for `get()` is `None`, so no reason to specify it
explicitly. Simplify the respective calls in FsCache.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-20 16:56:32 +01:00
David Rheinsberg
ef20b40faa util/fscache: introduce versioning
Add a new field to the cache-information called `version`, which is a
simple integer that is incremented on any backward-incompatible change.

The cache-implementation is modified to avoid any access to the cache
except for `<cache>/staging/`. This means, changes to the staging area
must be backwards compatible at all cost. Furthermore, it means we can
always successfully run osbuild even on possibly incompatible caches,
because we can always just ignore the cache and fully rely on the
staging area being accessible.

The `load()` method will always return cache-misses. The `store()`
method simply discards the entry instead of storing it. Note that
`store()` needs to provide a context to the caller, hence this
implementation simply creates another staging-context to provide to the
caller and then discard. This is non-optimal, but keeps the API simple
and avoids raising an exception to the caller (but this can be changed
if it turns out to be problematic or unwanted).

Lastly, the `cache.info` field behaves as usual, since this is also the
field used to read the cache-version. However, this file is never
written to improve resiliency and allow blacklisting buggy versions from
the past.

Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
2022-12-15 08:55:39 +01:00
David Rheinsberg
4df05b8509 util: add file system cache
This commit introduces a new utility module called `fscache`. It
implements a cache module that stores data on the file system. It
supports parallel access and protects data with file-system locks. It
provides three basic functions:

    FsCache.load("<name>"):
        Loads the cache entry with the specified name, acquires a
        read-lock and yields control to the caller to use the entry.
        Once control returns, the entry is unlocked again.

        If the entry cannot be found, a cache miss is signalled via
        FsCache.MissError.

    FsCache.store("<name>"):
        Creates a new anonymous cache entry and yields control to the
        caller to fill in. Once control returns, the entry is renamed
        to the specified name, thus committing it to the object store.

    FsCache.stage():
        Create a new anonymous staging entry and yield control to the
        caller. Once control returns, the entry is completely
        discarded.

        This is primarily used to create a working directory for osbuild
        pipeline operations. The entries are volatile and automatic
        cleanup is provided.

        To commit a staging entry, you would eventually use
        FsCache.store() and rename the entire data directory into the
        non-volatile entry. If the staging area and store are on
        different file-systems, or if the data is to be retained for
        further operations, then the data directory needs to be copied.

Additionally, the cache maintains a size limit and discards any entries
if the limit is exceeded. Future extensions will implement cache pruning
if a configured watermark is reached, based on last-recently-used
logics.

Many more cache extensions are possible. This module introduces a first
draft of the most basic cache and hopefully lays ground for a new cache
infrastructure.

Lastly, note that this only introduces the utility helper. Further work
is required to hook it up with osbuild/objectstore.py.
2022-12-06 09:48:38 +01:00