This commit adds code that will remove the least recently used
entries when a store() operation does not succeeds because the
cache is full. To be more efficient it will try to free
twice the requested size (this can be configured in the code).
This helper can be used to implement a strategy to find the oldest
cache entries and evict them when the cache is full.
The implementation uses the `atime` of the per object `cache.lock`
file and ensures in `load()` that it's actually updated.
The cachedir-tag specification defines how to mark directories as
cache-directories. This allows tools like `tar` to ignore those
directories if desired (e.g., see `tar --ignore-caches`). This is very
useful to avoid huge cache-directories in backups and remote
synchronizations.
The spec simply defines a file called `CACHEDIR.TAG` with the first 43
bytes to be: "Signature: 8a477f597d28d172789f06886806bc55" (which
happens to be the MD5-checksum of ".IsCacheDirectory". Further content
is to be ignored. Any such files marks the directory in question as a
cache-directory.
The cachedir-tag has been successfully deployed in tools like `cargo`
and `VLC`, and is currently discussed to be implemented in Firefox. More
information is available here: https://bford.info/cachedir/
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Add a helper that copies an entire directory tree including all metadata
into the cache. Use it in the ObjectStore to commit entries.
Unlike FsCache.store() this does not require entering the context from
the call-site. Instead, all data is directly passed to the cache and the
operation is under full control of the cache.
The ObjectStore is adjusted to make use of this. This requires exposing
the root-path (rather than the tree-path) to be accessible for
individual objects, hence a `path`-@property is added alongside the
`tree`-@property. Note that `__fspath__` still refers to the tree-path,
since this is the only path really required for outside access other
than from the object-manager itself.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Add a new field to the cache-information called `version`, which is a
simple integer that is incremented on any backward-incompatible change.
The cache-implementation is modified to avoid any access to the cache
except for `<cache>/staging/`. This means, changes to the staging area
must be backwards compatible at all cost. Furthermore, it means we can
always successfully run osbuild even on possibly incompatible caches,
because we can always just ignore the cache and fully rely on the
staging area being accessible.
The `load()` method will always return cache-misses. The `store()`
method simply discards the entry instead of storing it. Note that
`store()` needs to provide a context to the caller, hence this
implementation simply creates another staging-context to provide to the
caller and then discard. This is non-optimal, but keeps the API simple
and avoids raising an exception to the caller (but this can be changed
if it turns out to be problematic or unwanted).
Lastly, the `cache.info` field behaves as usual, since this is also the
field used to read the cache-version. However, this file is never
written to improve resiliency and allow blacklisting buggy versions from
the past.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
This commit introduces a new utility module called `fscache`. It
implements a cache module that stores data on the file system. It
supports parallel access and protects data with file-system locks. It
provides three basic functions:
FsCache.load("<name>"):
Loads the cache entry with the specified name, acquires a
read-lock and yields control to the caller to use the entry.
Once control returns, the entry is unlocked again.
If the entry cannot be found, a cache miss is signalled via
FsCache.MissError.
FsCache.store("<name>"):
Creates a new anonymous cache entry and yields control to the
caller to fill in. Once control returns, the entry is renamed
to the specified name, thus committing it to the object store.
FsCache.stage():
Create a new anonymous staging entry and yield control to the
caller. Once control returns, the entry is completely
discarded.
This is primarily used to create a working directory for osbuild
pipeline operations. The entries are volatile and automatic
cleanup is provided.
To commit a staging entry, you would eventually use
FsCache.store() and rename the entire data directory into the
non-volatile entry. If the staging area and store are on
different file-systems, or if the data is to be retained for
further operations, then the data directory needs to be copied.
Additionally, the cache maintains a size limit and discards any entries
if the limit is exceeded. Future extensions will implement cache pruning
if a configured watermark is reached, based on last-recently-used
logics.
Many more cache extensions are possible. This module introduces a first
draft of the most basic cache and hopefully lays ground for a new cache
infrastructure.
Lastly, note that this only introduces the utility helper. Further work
is required to hook it up with osbuild/objectstore.py.