The `Object.{read,write}` methods were introduced to implement
copy-on-write support. Calling `write` would trigger the copy
if the object had a `base`. Additionally, a level of indirection
was introduced via bind mounts, which made it possible to hide
the actual path of the object in the store and to make sure that
`read` really returned a read-only path.
Support for copy-on-write was recently removed[1], and with it
the need for the `read` and `write` methods. We lose the benefits
of the indirection, but they are not really needed: the path to
the object is not really hidden, since one can always use the
`resolve_ref` method to obtain the actual store object path.
The read-only property of build trees is ensured via read-only
bind mounts in the build root.
Instead of `read` and `write`, `Object` now has a new `tree`
property that is the path to the object's tree. It also
implements `__fspath__`, so it behaves like an `os.PathLike`
object and can be used transparently in many places, e.g.
`os.path.join` or `pathlib.Path`.
[1] 5346025031
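The `tree` property and `__fspath__` behaviour described above can be
sketched as follows. This is a minimal illustration, not the actual
`objectstore.Object`: the class body, the constructor argument and the
example path are all assumptions.

```python
import os
import pathlib


class Object:
    """Hypothetical stand-in for a store object; not the real class."""

    def __init__(self, tree: str):
        self._tree = tree

    @property
    def tree(self) -> str:
        # Path to the object's file system tree
        return self._tree

    def __fspath__(self) -> str:
        # Lets instances be passed wherever an os.PathLike is accepted
        return self._tree


# Example path is made up for illustration
obj = Object("/var/cache/store/objects/abc/tree")

joined = os.path.join(obj, "etc", "os-release")
as_path = pathlib.Path(obj) / "etc"
```

Both `os.path.join` and `pathlib.Path` call `os.fspath` on their
arguments, so no explicit `obj.tree` is needed at the call site.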
If the object's id does not match the one supplied for the
commit, we create a clone. Otherwise we store the tree.
The code path is arranged so that we always go through
`Object.store_tree` and thus always call `Object.finalize`, in
preparation for a future where we might actually do something
meaningful in the finalizer, like resetting the *times or
counting the tree size.
Remove copy-on-write support from `objectstore.Object`. The main
reason for introducing copy-on-write was to save an additional
copy in the non-DAG pipeline model[1]. With the introduction of
the DAG pipeline model and the explicit `--export` option, we can
achieve the same result without the complexity of copy-on-write
semantics.
[1] See commit 39213b7, part of 3b7c87d5..42a365d1 changeset.
There is little use in sharing the store between tests, quite the
opposite: all tests expect a clean store and some currently set
that up themselves. Create a fresh store for each test.
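One way to get a fresh store per test is to create a temporary
directory in `setUp`. The sketch below assumes `unittest` and uses a
plain directory as a stand-in for the store; the class and test names
are hypothetical.

```python
import os
import tempfile
import unittest


class StoreTestCase(unittest.TestCase):
    """Hypothetical test case; a bare directory stands in for the store."""

    def setUp(self):
        # A new, empty directory for every test, removed again afterwards
        self._tmp = tempfile.TemporaryDirectory(prefix="store-test-")
        self.addCleanup(self._tmp.cleanup)
        self.store = self._tmp.name

    def test_store_starts_empty(self):
        self.assertEqual(os.listdir(self.store), [])

    def test_changes_stay_local(self):
        # Anything written here is invisible to the other tests
        with open(os.path.join(self.store, "marker"), "w") as f:
            f.write("x")
        self.assertEqual(os.listdir(self.store), ["marker"])
```

Because `setUp` runs before each test method, neither test can observe
the other's changes, regardless of execution order.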
The idea of this test case was to check that two identical trees are
only stored once, via their treesum in the object store; but this
functionality was removed in commit e97f6ef34 and random UUIDs are
now used instead of treesums. As a result there is no de-duplication
anymore, which was the subject of the test. So remove the test.
The treesum of a filesystem tree is the content hash of all its
files, its directory structure and file metadata.
By storing trees by their treesum we avoid storing duplicates of
identical trees, at the cost of computing the hashes for every
commit to the store.
This has limited benefit as the likelihood of two trees being
identical is slim, in particular when we already have the ability
to cache based on pipeline/stage ID (i.e., we can avoid rebuilding
trees if the pipelines that built them were the same).
Drop the concept of a treesum entirely, even though I very much
liked the idea in theory...
Signed-off-by: Tom Gundersen <teg@jklm.no>
Add the ability to read only a sub-tree of a tree via `Object.read_at`.
Expose the functionality via `Store{Server,Client}.read_tree_at`.
Extend the tests to check this new functionality.
Instead of using string interpolation and concatenation to build
file system paths, use `os.path.join` or directly the constructor
for `pathlib.Path`, which can take path segments.
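The difference can be shown with a small example. The store location
and object id below are made-up values, not paths used by the code
base.

```python
import os
import pathlib

# Hypothetical values for illustration
store = "/var/cache/store"
object_id = "abc123"

# Before: string interpolation and concatenation
by_hand = f"{store}/objects/{object_id}" + "/tree"

# After: let the library insert the separators
joined = os.path.join(store, "objects", object_id, "tree")
as_path = pathlib.Path(store, "objects", object_id, "tree")
```

On a POSIX system all three spell the same path, but the latter two do
not depend on getting the separators right by hand.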
Move `test_objectstore` into the module-level tests. This allows us to
run it as part of `make test-module`.
Make sure to properly guard it as a root-only module.
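A root-only guard could look like the sketch below. It assumes
`unittest` and a check on the effective user id; the class name and
the placeholder test are hypothetical, and the actual guard in the
test suite may differ.

```python
import os
import unittest


@unittest.skipUnless(os.geteuid() == 0, "root-only")
class TestObjectStore(unittest.TestCase):
    """Hypothetical test class; skipped entirely for non-root users."""

    def test_runs_as_root(self):
        # Only reached when the guard above let the class through
        self.assertEqual(os.geteuid(), 0)
```

When run as an unprivileged user, the tests are reported as skipped
rather than failed.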