Instead of always parsing the python stage to load meta information
allow the user of a new `{stage}-meta.json` file. This is a first
step towards allowing modules to be written in a different language
than python. It also has some practical advantages:
- slightly faster as it avoids calling python to output the schemas
- easier to write schemas as this can be done in a real json editor
now
- more extensible in a future where stages maybe binaries with
shlib dependencies that are only satisfied in the buildroot
but not on the host
Add comment why the `ModuleInfo.load()` code uses open()/ast.parse()
instead of just using `importlib`.
The reason is that while `importlib` is more convenient and much
shorter it would require that all python modules of the osbuild
modules are actually installed on the system just to inspect the
schema/documentation of the stage.
The way that runners were designed is the following: For each distro
we have a specific runner. In case a new version of the distro can
use the previous runner, we just create a symlink. In case a new
distro version needs adjustments, the runner is copied and adjusted.
This is a very clean and obvious design. There is one big drawback:
For each new distribution a symlink must be created before it can be
used. For Fedora that should ideally happen when it is branched; and
this will, ipso facto, always be a symlink since at the time of the
branching the new distro is the old distro. But at this very moment
osbuild will be broken since it does not contain the new runner; the
only way to prevent this is to create the corresponding new runner
before the distro is branched, where it then must be a symlink too.
This very much suggest that instead of the explicit symlink, which
does not /that/ much clarity, the existing "old" runner should just
work for the new distribution. This commit implements the logic to
do just that: all existing runners are parsed into a distro and
version tuple and then, given a specific requested distro, the best
matching one is return.
We need to initialize `schema` to `None`, otherwise it will be an access
to an uninitialized variable when looking up invalid schemata:
[...]
File "[...]/osbuild/meta.py", line 583, in get_schema
schema = Schema(schema, name or klass)
UnboundLocalError: local variable 'schema' referenced before assignment
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
The schema input of Schema.__init__ is a python-native representation
of a JSON object, so it can be any kind of dictionary. Furthermore, it
is optional.
Fix the type to be Optional[Dict].
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Show the stage name (if one is set) when failing the stage in the
validator. This closes#1007, example output:
```
€ python3 -m osbuild supakeen-os.json
supakeen-os.json has errors:
pipelines[0].stages[0]
could not find schema information for 'org.osbuild.rpmb'
.pipelines[0].stages[0].inputs.packages:
could not find schema information for 'org.osbuild.filesz'
```
Add new stage metadata `CAPABILITIES` where stages can request
additional capabilities that are not in the default set.
Currently this is not used by any stage since the default set
contains the sum of all needed capabilities.
If a stage has not itself defined the `mounts` property, allow any
mounts. This is in preparation to support specialized mounts, such
as bind mounts or ostree deployment mounts to transparently work
with any stage.
NB: devices are not allowed so this will not be applicable for the
current filesystem mounts.
Define the mount schema in the actual mounts at a higher level. This
is in preparation to give the modules more control over the `source`
and `target` properties.
Allows stages to access file systems provided by devices.
This makes mount handling transparent to the stages, i.e.
the individual stages do not need any code for different
file system types and the underlying devices.
A new host service that provides device functionality to stages.
Since stages run in a container and are restricted from creating
device nodes, all device handling is done in the main osbuild
process. Currently this is done with the help of APIs and RPC,
e.g. `LoopServer`. Device host services on the other hand allow
declaring devices in the manifest itself and then osbuild will
prepare all devices before running the stage. One desired effect
is that it makes device handling transparent to the stages, e.g.
they don't have to know about loopback devices, LVM or LUKS.
Another result is that specific device handling is now modular
like Inputs and Source are and thus moved out of osbuild itself.
For schema version 2 of modules, the `definitions` node, as defined in
the module itself, won't be at the `options` level but at the level of
the `properties` node. Look for a `definitions` at that `properties`
level and move it to the top, if found.
When parsing the module file, parse the JSON directly from the AST
node, because the AST node contains the line number of the schema
in the module and thus we can resolve the correct line number for
errors within the JSON. Convert the `JSONDecodeError` to a
`SyntaxError` which results in an overall better exception message:
Before:
Traceback (most recent call last):
File "/workspaces/osbuild/osbuild/meta.py", line 331, in get_schema
opts = self._make_options(version)
[...]
File "/usr/lib64/python3.9/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in
double quotes: line 2 column 1 (char 14)
After:
Traceback (most recent call last):
File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
[...]
raise SyntaxError(msg, detail) from None
File "stages/org.osbuild.ostree.init-fs", line 31
additionalProperties: False
^
SyntaxError: Invalid schema: Expecting property name enclosed in ...
Define the mapping of modules and their paths at the `ModuleInfo` class
level instead of having it inline in a function. This makes it possible
to use it from other places in the code.
Add a `version` keyword argument to `Index.get_schema` which
will in turn look for `osbuild<version>.json` in case of the
schema for the manifest is requested and otherwise forward
the version argument to the `get_schema` method for the
respective `ModuleInfo`.
When loading the schema information via the source code of a
module, look for a `SCHEMA_2` global variable, representing
the schema version 2. Extend the `get_schema` method so in
takes a `version` keyword argument. Rework the code so that
if version 2 for the format is specified but no dedicated
schema data is found, a fallback based on the version 1 is
provided. This makes it easy to use all existing stages
without explicitly duplicating all schema information.
NB: The code is not very pretty, the hope is that in the
future, the module, being an executable, could be called
with a command line switch, a la `--schema <version>` and
this would return the schema data. So that hackery code
we currently have will hopefully vanish soon. I am sorry
though for this mess.
Change the `ModuleInfo.schema` propertly into a `get_schema`
method call. This is in preparation to allow for different
schemata versions to be supported.
Introdcue a `FormatInfo` class that, very much like `ModuleInfo`
can be used to obtain meta information about a format. Methods
are added to `Index` to allow the enumeration of available formats,
getting the `FormatInfo` for a format given its name and to detect
a format via the manifest description data.
Change the top-level documentation to reflect the changes. Also
remove an outdated section about validation of the schema; this
was moved to the format specific code some time ago.
Inputs are modules like Stages, Assemblers and Sources. Add them
as a new module klass to the various functions. Include them in
the schema test, so the schema of all inputs is validated.
Also sort the module classes alphabetically in the class mapping
and class list.
Add the path to the executable for that module to the ModuleInfo.
This can then later be used to actually execute said module. The
information is already readily available since we used the path
to load the information from the file in the first place.
The validation of the manifest descritpion is eo ipso format
specific and thus belongs into the format specific module.
Adapt all usages throughout the codebase to directly use the
version 1 specific function.
Add support for querying information about sources: add the mapping
from name to directory and accept "Source" as a module name. Adapt
the ModuleInfo schema property to handle the different styles for
stage-like schemata as well as sources now.
For all currently supported modules, i.e. stages and assemblers,
convert the STAGE_DESC and STAGE_INFO into a proper doc-string.
Rename the STAGE_OPTS into SCHEMA.
Refactor meta.ModuleInfo loading accordingly.
The script to be used for the conversion is:
--- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< --- 8< ---
import os
import sys
import osbuild
import osbuild.meta
from osbuild.meta import ModuleInfo
def find_line(lines, start):
for i, l in enumerate(lines):
if l.startswith(start):
return i
return None
def del_block(lines, prefix):
start = find_line(lines, prefix)
end = find_line(lines[start:], '"""')
print(start, end)
del lines[start:start+end+1]
def main():
index = osbuild.meta.Index(os.curdir)
modules = []
for klass in ("Stage", "Assembler"):
mods = index.list_modules_for_class(klass)
modules += [(klass, module) for module in mods]
for m in modules:
print(m)
klass, name = m
info = ModuleInfo.load(os.curdir, klass, name)
module_path = ModuleInfo.module_class_to_directory(klass)
path = os.path.join(os.curdir, module_path, name)
with open(path, "r") as f:
data = list(f.readlines())
i = find_line(data, "STAGE_DESC")
print(i)
del data[i]
del_block(data, "STAGE_INFO")
i = find_line(data, "STAGE_OPTS")
data[i] = 'SCHEMA = """\n'
docstr = '"""\n' + info.desc + "\n" + info.info + '"""\n'
doclst = docstr.split("\n")
doclst = [l + "\n" for l in doclst]
data = [data[0]] + doclst + data[1:]
with open(path, "w") as f:
f.writelines(data)
if __name__ == "__main__":
main()
Make the mapping of module class to the corresponding directory
a method of the ModuleInfo class. This is so it can be re-used
by others in the future.
The are converging on a nomenclature where the sum of Stages,
Assemblers, Sources (and future entities like those) together
are called 'Modules'.
Thus rename StageInfo to ModuleInfo and the corresponding
variables and methods.
Using `[]` as default value for arguments makes `pylint` complain. The
reason is that it creates an array statically at the time the function
is parsed, rather than dynamically on invocation of the function. This
means, when you append to this array, you change the global instance and
every further invocation of that function works on this modified array.
While our use-cases are safe, this is indeed a common pitfall. Lets
avoid using this and resort to `None` instead.
This silences a lot of warnings from pylint about "dangerous use of []".
We currently don't seem to use anything that requires us to use
the draft 7 of the specification. The minimum version that we
need is draft 4, which is also supported by the python-jsonschema
version in RHEL 8.2 (which is 2.6.0).
The truthiness of the `Schema` object itself now contains the
schema validation as well, i.e. schema is only valid if schema
information is present and said information passes validation.
The _validator member of `Schema` is used as an indicator whether
the provided schema is valid. The `check` method will, in case
that _validator is not set attempt to validate the schema data,
if present and set the _validator member if schema data is set and
validation has passed. On failure, i.e. missing schema information
or invalid schema data, the ValidationResult will contain the
respective error.
This new module contains utilities that help to introspect parts
that constitute the inner parts of osbuild, i.e. its stages
and assembler (which is also considered a type of stage in
this context). It contains the `StageInfo` class that can that
contains meta-information about the individual stage, such as
a short information (`info`), a longer description (`desc`) and
its JSON schema. A new Schema class represents schema data and
has a `validation` method that can be used to validate that json
data conforms to said schema.
A `Index` class can be used to obtain `StageInfo` and `Schema`
for entities identified via `klass` and `name`.
A top level `validate` method is introduced that can validate
manifest data.
Internally it uses the `jsonschema` package so add that as a
requirement and Install this dependency in the CI.