116 lines
5.1 KiB
ReStructuredText
116 lines
5.1 KiB
ReStructuredText
=======================
|
|
Koji Content Generators
|
|
=======================
|
|
|
|
A Koji Content Generator is an external service that generates content
|
|
(jars, zips, tarballs, .npm, .wheel, .gem, etc) which is then passed to
|
|
Koji for management and delivery to other processes in the release
|
|
workflow. Content Generators can evolve independently of the Koji
|
|
codebase, enabling the build process to be more agile and flexible to
|
|
changing requirements and new technologies, while allowing Koji to
|
|
provide stable APIs and interfaces to other processes.
|
|
|
|
Along with the content to be managed by Koji, a Content Generator will
|
|
provide enough metadata to enable a reasonable level of auditing and
|
|
reproduceability. The exact data provided and the format used is being
|
|
discussed, but will include information like the upstream source URL,
|
|
build tools used, build environment contents, and any
|
|
container/virtualization technologies used.
|
|
|
|
The intention is that a team dedicated to managing a specific content
|
|
type will design and maintain their own Content Generator, in
|
|
coordination with the Koji developers. Once the Content Generator is
|
|
ready for production use it will be given permission to import content
|
|
and metadata it produces into Koji. Policies on the Koji hub will
|
|
validate imported content and metadata and ensure that it is complete
|
|
and consistent.
|
|
|
|
Requirements for writing a Content Generator
|
|
============================================
|
|
|
|
From an implementation perspective, content generators have wide
|
|
latitude in how they perform builds. To ensure sanity in the build
|
|
process, we strongly recommend that administrators of Koji systems set
|
|
policies about what content generators are allowed to do, and make sure
|
|
that those policies are followed before the content generator is granted
|
|
authorization in their Koji system.
|
|
|
|
Below are some examples of the sorts of policies that one might require.
|
|
Content Generators should be designed and implemented with the
|
|
requirements in mind. Please note that the list below is not complete.
|
|
|
|
Avoid Using the Host's Software
|
|
-------------------------------
|
|
|
|
During the building process, the code should avoid using the host's
|
|
installed software. The more reliance on installed software, the more
|
|
risk in the future that changes (such as upgrading a builder) will break
|
|
the build processes. Use mock chroots, VM guests, or containers wherever
|
|
possible to insulate against changes. Isolating the build environment
|
|
from the host environment makes reproducing work much easier and
|
|
predictable.
|
|
|
|
Source of build environment content
|
|
-----------------------------------
|
|
|
|
The build environment must come from somewhere. In a standard Koji
|
|
build, it comes from content already in Koji, or from configured
|
|
external repositories.
|
|
|
|
CG authors will likely want to pull content from sources outside of
|
|
Koji. Koji administrators should set a clear policy about which sources
|
|
are acceptable. The use of arbitrary sources can make it difficult or
|
|
impossible to reproduce build environments.
|
|
|
|
Binaries (or other compiled content) from Upstream May Not become included in output
|
|
------------------------------------------------------------------------------------
|
|
|
|
If tools or other content downloaded from external sources are used in
|
|
the build, they may not be included in CG build output, and may not be
|
|
imported into Koji. In other words, output must be built from sources in
|
|
the CG or Koji, not retrieved from the internet. Tools necessary to
|
|
build product content can be downloaded and cached in the CG.
|
|
|
|
Log all Transformations of Content
|
|
----------------------------------
|
|
|
|
When the content is building, as much should be logged as possible. In
|
|
addition to compilation, if the content goes through other
|
|
transformations, perhaps changing formats, that should be logged as
|
|
well. There can be no black-box transformations of the output. Imagine
|
|
having to figure out how a piece of content was built 5 years into the
|
|
future to understand the motivation behind this requirement. Details of
|
|
the build environment and tools used in the environment should be
|
|
recorded too.
|
|
|
|
Preserve All Inputs
|
|
-------------------
|
|
|
|
All inputs to a build task should be preserved either as logs, a
|
|
database, or as output of the build itself.
|
|
|
|
Preserve All Outputs
|
|
--------------------
|
|
|
|
Naturally the outputs of a build should be preserved too. Transient
|
|
artifacts are not strictly required, but if they're not onerous to
|
|
maintain, they should be included. It must not be necessary to further
|
|
transform the content to make it usable.
|
|
|
|
Do Not Use Caching Mechanisms
|
|
-----------------------------
|
|
|
|
Content Generators must build without caching mechanisms (in compilers
|
|
or DNF\ \|\ YUM) wherever possible. Caches make
|
|
reproducing results in the future more difficult, and also introduce
|
|
layers of indirection that can make debugging a build more difficult.
|
|
Consider the risk of re-shipping a security flaw that is compiled in
|
|
because an outdated library was cached in the Content Generator, this is
|
|
why we have this requirement.
|
|
|
|
Metadata
|
|
========
|
|
|
|
Metadata will be provided by the Content Generator as a JSON file. There
|
|
is a proposal of the :doc:`Content Generator
|
|
Metadata <content_generator_metadata>` format available for review.
|