Repo Generation
===============

Koji generates repositories based on tag content. For the most part, this means yum repos
from the rpm content, but Koji can also generate maven repos if configured to do so.

The *primary* purpose of these repos is to facilitate Koji's own build process.
Most builds utilize a buildroot generated by the ``mock`` tool, which needs a yum repository
to pull packages from.

Repositories can be triggered in different ways and with different parameters, but all
repositories represent the contents of a tag at a specific point in time (i.e. an event).


On demand generation
--------------------

When Koji needs a repo for a tag, it files a *request* via a hub call.
Typically this is done in a build process, but requests can also be triggered automatically
without a build if configured. They can also be triggered manually.

::

    repo.request(tag, min_event=None, at_event=None, opts=None, priority=None, force=False)
      description: Request a repo for a tag

        :param int|str taginfo: tag id or name
        :param int|str min_event: minimum event for the repo (optional)
        :param int at_event: specific event for the repo (optional)
        :param dict opts: custom repo options (optional)
        :param bool force: force request creation, even if a matching repo exists

        The special value min_event="last" uses the most recent event for the tag
        Otherwise min_event should be an integer

        use opts=None (the default) to get default options for the tag.
        If opts is given, it should be a dictionary of repo options. These will override
        the defaults.

Each repo request is for a single tag. The optional ``min_event`` parameter specifies how recent
the repo needs to be. If not given, Koji chooses a suitably recent event. The optional ``opts``
specifies options for creating the repo. If not given, Koji uses the default options based on
the tag.

When the hub responds to this call, it first checks to see if an existing repo satisfies the request.
If so, then information for that repo is returned and no further action is taken.
If there is no such repo yet, then Koji records the request and returns the request data.
If an identical active request already exists, then Koji will return that.


Build parameters
----------------

For some types of builds, the user can affect the parameters of the repo request.

For rpm builds, the ``--wait-repo`` option will cause the build to request a *current* repo.
That is, the ``min_event`` for the request will be the most recent event that affected the tag.
For example, if a previous build has just been tagged into the buildroot, then this option will
ensure that the new build gets a repo containing the previous one.

It's worth noting that rpm builds also accept ``--wait-build`` option(s) that will cause the build
to wait for specific NVRs to be present in the repo. This option is not actually handled by the
request mechanism. Instead, the build will wait for these NVRs to be tagged and then request a
current repo.


Repository Options
------------------

There are a few options that govern how the repo is generated. At present these are:

src
    whether to include srpms in the repos

debuginfo
    whether to include debuginfo rpms

separate_src
    whether to create a separate src repo

maven
    whether to also create a maven repo

These options are normally determined by the tag that the repo is based on.
Administrators can set ``repo.opts`` for a given tag to control these options.

Additionally, the following pattern-based hub options can be used:

SourceTags
    Tags matching these glob patterns will have the src option set

DebuginfoTags
    Tags matching these glob patterns will have the debuginfo option set

SeparateSourceTags
    Tags matching these glob patterns will have the separate_src option set

For historical reasons, the ``maven`` option can also be controlled by setting the
``maven_support`` field for the tag. E.g.
``koji edit-tag --maven-support MYTAG``

Note that the ``maven`` option is ignored if Maven support is disabled on the hub.

Manually requested repos can specify their own custom options.


Automatic generation
--------------------

Automatic generation can be configured by setting ``repo.auto=True`` for a given tag.
This requires administrative access.
The system regularly requests repos for such tags.


From Requests to Repos
----------------------

All repo requests go into a queue that Koji regularly checks.
As long as there is sufficient capacity, Koji will create ``newRepo`` tasks for these
requests.

The status of a request can be checked with the ``repo.checkRequest`` API call.

::

    repo.checkRequest(req_id)
      description: Report status of repo request

        :param int req_id the request id
        :return: status dictionary

        The return dictionary will include 'request' and 'repo' fields

If the return includes a non-None ``repo`` field, then that repo satisfies the request.
The ``request`` field will include ``task_id`` and ``task_state`` (may be None) to indicate
progress.


Repository Data
---------------

The hub stores key data about each repo in the database, and this can be reported in numerous ways.
One common way is the ``repoInfo`` call, which returns data about a single repository. E.g.

::

    $ koji call repoInfo 2398
    {'begin_event': 497152,
     'begin_ts': 1707888890.306149,
     'create_event': 497378,
     'create_ts': 1710216388.543129,
     'creation_time': '2024-03-12 00:06:28.541893-04:00',
     'creation_ts': 1710216388.541893,
     'custom_opts': None,
     'dist': False,
     'end_event': None,
     'end_ts': None,
     'id': 2398,
     'opts': {'debuginfo': False, 'separate_src': False, 'src': False},
     'state': 3,
     'state_time': '2024-03-17 17:03:49.820435-04:00',
     'state_ts': 1710709429.820435,
     'tag_id': 2,
     'tag_name': 'f24-build',
     'task_id': 13611,
     'task_state': 2}

Key fields

.. glossary::

    id
        The integer id of the repo itself

    tag_id
        The integer id of the tag the repo was created from

    tag_name
        The name of the tag the repo was created from

    state
        The (integer) state of the repo. Corresponds to ``koji.REPO_STATES`` values

    create_event
        The event id (moment in koji history) that the repo was created from. I.e. the contents
        of the repo come from the contents of the tag at this event.

    create_ts
        This is the timestamp for the create_event.

    creation_ts / creation_time
        This is the time that the repo was created, which may be quite different than the time
        of the repo's create_event. The ``creation_ts`` field is the numeric value and
        ``creation_time`` is a string representation of that.

    state_ts / state_time
        This is the time that the repo last changed state.

    begin_event / end_event
        These events define the *range of validity* for the repo. Individual events do not
        necessarily affect a given tag, so for each repo there is actually a range of events
        where it accurately represents the tag contents.
        The ``begin_event`` is the first event in the range. This will often be the same as
        the create_event, but might not be.
        The ``end_event`` is the first event after creation that changes the tag. This is often
        None when a repo is created. Koji will update this field as tags change.

    begin_ts / end_ts
        These are the numeric timestamps for the begin and end events.

    opts
        This is a dictionary of repo creation options

    custom_opts
        This dictionary indicates which options were overridden by the request

    task_id
        The numeric id of the task that created the repo

    dist
        A boolean flag. True for dist repos.


Repository Lifecycle
--------------------

Generally, the lifecycle looks like:

::

    INIT -> READY -> EXPIRED -> DELETED

Repositories begin in the ``INIT`` state when the ``newRepo`` task first initializes them.
Repos in this state are incomplete and not ready to be used.

When Koji finishes creating a repo, it is moved to the ``READY`` state. Such repos are ready to be used.
Their contents will remain unchanged until they are deleted.
Note that this state does not mean the repo is current for its tag.

When a repo is no longer relevant, Koji will move it to the ``EXPIRED`` state.
This means the repo is marked for deletion and should no longer be used.

Once a repo has been expired for a waiting period, Koji will move it to the ``DELETED`` state
and remove its files from disk. The database entry will remain.

In cases of unusual errors, a repo might be moved to the ``PROBLEM`` state.
Such repos should not be used and will eventually be deleted.


Hub Configuration
-----------------

There are several hub configuration options governing repo generation behavior:

MaxRepoTasks
    The maximum number of ``newRepo`` tasks to run at one time. Default: ``10``

MaxRepoTasksMaven
    The maximum number of ``newRepo`` tasks for maven tags to run at one time. Default: ``2``

RepoRetries
    The number of times to retry a failed ``newRepo`` task per request. Default: ``3``

RequestCleanTime
    The number of minutes to wait before clearing an inactive repo request. Default: ``1440``

AllowNewRepo
    Whether to allow the legacy ``newRepo`` call. Default: ``True``

RepoLag
    This affects the default ``min_event`` value for normal repo requests.
    An event roughly this many seconds in the past is used. Default: ``3600``

RepoAutoLag
    Same as RepoLag, but for automatic requests. Default: ``7200``

RepoLagWindow
    This affects the granularity of the ``RepoLag`` and ``RepoAutoLag`` settings. Default: ``600``

RepoQueueUser
    The user that should own the ``newRepo`` tasks generated by repo requests. Default: ``kojira``

SourceTags
    Tags matching these glob patterns will have the src option set. Default: ``''``

DebuginfoTags
    Tags matching these glob patterns will have the debuginfo option set. Default: ``''``

SeparateSourceTags
    Tags matching these glob patterns will have the separate_src option set. Default: ``''``


Repository Layout
-----------------

Koji's repositories live under ``/mnt/koji/repos``.
From there, they are indexed by tag name and repo id.
So, the full path to a given repository would look something like

::

    /mnt/koji/repos/f40-build/6178041/

This directory will contain:

* ``repo.json`` -- data about the repo itself
* ``groups`` -- a directory containing comps data
* ``<arch>`` -- a directory for each tag arch containing a yum repo

The full path to an actual yum repo would be something like:

::

    /mnt/koji/repos/f40-build/6178041/x86_64

This directory will contain:

* ``pkglist`` -- file listing the relative paths to the rpms for the repo
* ``blocklist`` -- file listing the blocked package names for the tag
* ``rpmlist.jsonl`` -- json data for the rpms in the repo
* ``toplink`` -- a relative symlink to the top of Koji's directory tree (i.e. up to /mnt/koji)
* ``repodata`` -- yum repo data

By default, source rpms are omitted. This can be controlled by repository options.
If the ``src`` option is True, then source rpms will be added to each arch repo separately,
similar to noarch rpms.
If the ``separate_src`` option is True, then a separate ``src`` repo is created.


Dist Repos
----------

Dist repos are managed by a separate process.
See :doc:`exporting_repositories` for more details.


Older Koji Versions
-------------------

Prior to Koji 1.35, the triggering of repo generation was quite different.
The kojira service monitored all build tags and triggered ``newRepo`` tasks
whenever the tag content changed.
The work queue was managed in kojira.
For large systems, this could lead to significant regeneration backlogs.
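
As a closing illustration of the current layout (see *Repository Layout* above), the following is a minimal sketch of how a repo's on-disk location follows from its tag name and repo id. The helper names (``repo_dir``, ``arch_repo_dir``, ``repo_dir_from_info``) are hypothetical and are not part of the Koji library; the ``/mnt/koji`` top directory is taken from the examples above and may differ on a given deployment.

```python
from pathlib import PurePosixPath

# Assumption: the top directory used in the examples above.
DEFAULT_TOPDIR = "/mnt/koji"


def repo_dir(tag_name, repo_id, topdir=DEFAULT_TOPDIR):
    """Return <topdir>/repos/<tag_name>/<repo_id> (hypothetical helper)."""
    return PurePosixPath(topdir) / "repos" / tag_name / str(repo_id)


def arch_repo_dir(tag_name, repo_id, arch, topdir=DEFAULT_TOPDIR):
    """Return the per-arch yum repo directory inside a repo (hypothetical helper)."""
    return repo_dir(tag_name, repo_id, topdir) / arch


def repo_dir_from_info(info, topdir=DEFAULT_TOPDIR):
    """Build the repo directory from a repoInfo-style dict.

    Only the 'tag_name' and 'id' fields are used.
    """
    return repo_dir(info["tag_name"], info["id"], topdir)


if __name__ == "__main__":
    print(arch_repo_dir("f40-build", 6178041, "x86_64"))
    # /mnt/koji/repos/f40-build/6178041/x86_64
```

Passing the dictionary shown under *Repository Data* to ``repo_dir_from_info`` would yield ``/mnt/koji/repos/f24-build/2398``, assuming the default top directory.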