Introduce a new class `SpdxLicenseExpressionCreator`, responsible for
converting license texts extracted from packages, into an SPDX-compliant
license expressions. If the `license_expression` Python package is
available on the system, it is used to determine the license text
extracted from a package is a valid SPDX license expression. If it is,
it's returned as is back to the caller. If it is not, or of the package
is not available on the system, the license text is wrapped in a
`ExtractedLicensingInfo` instance.
The `SpdxLicenseExpressionCreator` object keeps track of all generated
`ExtractedLicensingInfo` instances and de-duplicates them based on the
license text. This means that if two packages use the same
SPDX-non-compliant license text, they will be wrapped by an
`ExtractedLicensingInfo` instance with the same `LicenseRef-` ID.
The reason for fallback when `license_expression` package is not
available is that it is not available on RHEL and CentOS Stream. This
implementation allows us to ship the functionality in RHEL and
optionally enabling it by installing `license_expression` from a 3rd
party repository. In any case, the generated SBOM document will always
contain valid SPDX license expressions.
Extend unit tests to cover the newly added functionality.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
FIXUP: sbom/spdx: use compliant license expressions
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Add a new optional pytest CLI argument `--unsupported-fs` allowing to
specify file-systems which should be treated as unsupported in the
platform where running tests. Any test cases dependent on such
file-system support will be sipped.
This will allow to run unit tests and selectively skipping test cases
for unsupported file-systems.
Signed-off-by: Tomáš Hozza <thozza@redhat.com>
Integrate the recently added file system cache `FsCache` into our
object store `ObjectStore`. NB: This changes the semantics of it:
previously a call to `ObjectStore.commit` resulted in the object
being in the cache (i/o errors aside). But `FsCache.store`, which
is now the backing store for objects, will only commit objects if
there is enough space left. Thus we cannot rely that objects are
present for reading after a call to `FsCache.store`. To cope with
this we now always copy the object into the cache, even for cases
where we previously moved it: for the case where commit is called
with `object_id` matching `Object.id`, which is the case for when
`commit` is called for last stage in the pipeline. We could keep
this optimization but then we would have to special case it and
not call `commit` for these cases but only after we exported all
objects; or in other words, after we are sure we will never read
from any committed object again. The extra complexity seems not
worth it for the little gain of the optimization.
Convert all the tests for the new semantic and also remove a lot
of them that make no sense under this new paradigm.
Add a new command line option `--cache-max-size` which will set
the maximum size of the cache, if specified.
Require all the tests that compile a manifest to either specify
checkpoints or exports. Convert all the tests that were relying
on implicit exports with v1 manifests to use explicit exports.
This applies the default authconfig settings to the tree.
Note that the `/backups` directory is removed. The tool creaset
this, and by default it should not exist, so this should be a
noop. However, if you run this on a tree with existing backups,
they would be lost.
In the function `treeid_from_manifest`, use dynamic format detection
instead of hard-coding the use of the version 1 format. Additionally,
directly access the `tree` pipeline and its `id` instead of going
via the `get_ids` helper function, which is only present in v1.py.
Remove the dependency on unittest for the `OSBuild` class which
used the `unittest` instance only for `assertEqual`, which can
easily also be done via a plain `assert`.
Add a new `info` property that holds the `meta.ModuleInfo` info
for the stage. This gives each instance of a stage access to
meta (or class) information about it, i.e. its schema, docs but,
more importantly, also its name and path to the executable.
Thefore the `name` property is coverted into a transient property
which access the `name` member of `info`.
Change the `formats/v1` load mechanism to carry a new `index`
argument which is used to load the `ModuleInfo` for each stage.
Adapt all tests to load the info as well when creating stages.
Instead of having build pipelines nested within the pipeline it is
the build pipeline for, the nested structure is transferred into a
flat list of pipelines. As a result the recursion is gone and all
the pipelines and trees are build one after the other. This is now
possible since floating objects are kept alive by the store itself
and all trees that are being built are transparently via them.
The immediate result dictionary changed accordingly. To keep the
JSON output of osbuild the same, the result is now routed through
a format specific converter.
Additionally, the v1 format module gained a function to retrieve
the global tree_id and output_id. With the new models those global
ids will go away eventually and thus need to go through the format
specific code.
The 'Manifest' class represents what to build and the necessary
sources to do so. For now thus it is just a combination of the
pipeline the source options.
Instead of having the pipeline and the source option as separate
arguments, the load function now takes the full manifest, which
has those two items combined.
Instead of importing the load, load_build functions into the osbuild
namespace and using it via that, use the load function via the module
that provides them, i.e. the formats.v1 module.
Add two simple tests to check that the osbuild executable fails with the
right exit codes when passed an invalid manifest or checkpoint.
This reuses test.OSBuild, which is extended to raise CalledProcessError
if needed.
Add a new method that will copy the downloaded data for a specified
source to a directory, creating a directory structure so that the
target directory can then be used to initialize the osbuild cache,
i.e. the target directory be used for the `cache_from` parameter in
the `test.OSBuild` constructor.
The main reason why this is done per source, and not for all sources,
is that not all source can be copied via a plain `cp` operation or
easily added to, in case that the `target` directory already contains
data. The `org.osbuild.ostree` source is such an example.
Previously, the osbuild executor had its internal temporary directory that
served as the output directory. However, this approach gives no power to
the caller to control the lifetime of the produced artifacts. When more
images are built using one executor, the results will accumulate in one
place possibly leading to exhaustion of disk space.
This commit removes the executor's internal output directory. The output
directory can now be passed to osbuild.compile, so the caller can control
its lifetime. If no directory is passed in, the compile method will use
its own temporary directory - this is useful in cases when the caller
doesn't care about the built artifacts or the manifest doesn't have any
outputs.
The `capture_output` argument for subprocess.run was added in 3.7,
but want to support 3.6 as well. Change all the usages of it with
`stdout=subprocess.PIPE` that will have the same effect, at least
for stdout.
Move `test_objectstore` into the module-level tests. This allows us to
run it as part of `make test-module.
Make sure to properly guard it as root-only module.
We want to extend our base-class to support extensions to
unittest.TestCase, so make sure we inherit from it.
Adjust all callers to no longer inherit from TestCase, since this is now
done automatically by TestBase.
Move the `tree-diff` tool into ./tools, which is our new place for tools
used by the test-suite or during development.
The only hard-coded user is the TestBase, so fix its path to the tool
so the test-suite will continue to find it.
Using `[]` as default value for arguments makes `pylint` complain. The
reason is that it creates an array statically at the time the function
is parsed, rather than dynamically on invocation of the function. This
means, when you append to this array, you change the global instance and
every further invocation of that function works on this modified array.
While our use-cases are safe, this is indeed a common pitfall. Lets
avoid using this and resort to `None` instead.
This silences a lot of warnings from pylint about "dangerous use of []".
The `tree-diff` tool currently requires access to our local checkout,
since we do not install the tool. Provide accessors in `TestBase` so we
do not hard-code the path everywhere.
Add a new OSBuild class to `./test/test.py`. This class is an extension
of `./test/osbuildtest.py`, but no longer requires the `output_id` and
`tree_id` identifiers of osbuild.
Furthermore, this new executor uses context-managers to make sure any
temporary object is only accessed for a contained time-frame.
Add a new base class called `TestBase` to our test-suite. This allows
sharing common code between our tests without requiring them to import
each other. Furthermore, it paves the way towards executing all our
tests as part of the `unittest` framework, including pylint and others.
For now, this adds the following features to `TestBase`:
* Common test-guards that are shared between our tests, like
`can_modify_immutable()` or `have_rpm_ostree()`.
* Accessors to the test-checkout. This is `have_test_checkout()` to
check whether the running test has a repository checkout, and
`locate_test_checkout()` to get a path to the repository checkout.
This will allow us to put pylint and friends into the unittest
framework, guard them properly, and still allow running the tests
from a global install which might not have access to a checkout.
For now, we always assume we run from a checkout.
* Accessors to test-data. If we start installing tests as a module
into the system, we cannot bundle test-data together with code.
Therefore, two accessors `have_test_data()` and `locate_test_data()`
are implemented to guard access to test data. If a checkout is
available, it will be used to locate test-data.
In the future, we want to be able to pass a separate path to the
test-data, thus allowing us to install tests into a system.