deb-orchestrator/dev-architecture-docs/koji-overview.md
2025-08-18 23:45:01 -07:00

# Koji: A Comprehensive Analysis Report
## Executive Summary
**Koji** is Fedora's enterprise-grade RPM build system that provides a flexible, secure, and reproducible way to build software packages. It's a mature, production-ready system that has evolved over decades to handle large-scale package building with deep integration into Fedora's infrastructure.
This report provides a comprehensive analysis of Koji's architecture, design patterns, and implementation details based on source code examination and comparison with the deb-compose vision.
## What Koji Actually Does
### **Core Purpose**
Koji is fundamentally a **distributed build orchestration system** - it doesn't build packages directly, but rather coordinates the entire build process across multiple builder hosts. Think of it as the "air traffic controller" for package building, managing build requests, distributing work, and ensuring build consistency.
### **Primary Functions**
#### **1. Build Orchestration & Distribution**
- **Task Scheduling**: Distributes build tasks across available builder hosts
- **Build Environment Management**: Creates isolated buildroots for each build
- **Multi-Architecture Support**: Coordinates builds across different CPU architectures
- **Dependency Resolution**: Manages build dependencies and build order
#### **2. Build Infrastructure Management**
- **Builder Host Management**: Manages a pool of builder machines
- **Buildroot Creation**: Generates clean, reproducible build environments
- **Package Repository Integration**: Integrates with Yum/DNF repositories
- **Build Result Tracking**: Maintains complete audit trails of all builds
#### **3. Security & Access Control**
- **Authentication**: Supports multiple authentication methods (SSL, Kerberos, OIDC)
- **Authorization**: Granular permissions for different build operations
- **Build Isolation**: Each build runs in its own isolated environment
- **Audit Logging**: Complete logging of all build operations
## Technical Architecture
### **Multi-Tier Architecture**
Koji uses a **distributed client-server architecture** with clear separation of concerns:
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Koji CLI   │     │   Web UI    │     │     API     │
│   Client    │     │             │     │   Clients   │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                    ┌─────────────┐
                    │   KojiHub   │
                    │  (Server)   │
                    └─────────────┘
       ┌───────────────────┼───────────────────┐
       │                   │                   │
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Builder 1  │     │  Builder 2  │     │  Builder N  │
│   (kojid)   │     │   (kojid)   │     │   (kojid)   │
└─────────────┘     └─────────────┘     └─────────────┘
```
### **Core Components**
#### **1. KojiHub (`kojihub/`)**
The central server that orchestrates all build operations:
```python
# Core hub functionality in kojihub.py
class KojiHub:
    def __init__(self):
        self.db = DatabaseConnection()
        self.scheduler = TaskScheduler()
        self.auth = AuthenticationManager()
        self.plugins = PluginManager()

    def make_task(self, method, args, **opts):
        """Create and schedule a new build task"""
        # Validate task parameters
        # Check user permissions
        # Create task record
        # Schedule for execution
```
**Key Responsibilities**:
- **Task Management**: Creates, schedules, and tracks build tasks
- **Authentication**: Manages user sessions and permissions
- **Database Operations**: Maintains build state and metadata
- **Plugin System**: Extends functionality through plugins
#### **2. Builder Daemon (`builder/kojid`)**
The worker process that executes builds on builder hosts:
```python
class KojiBuilder:
    def __init__(self):
        self.buildroot_manager = BuildrootManager()
        self.task_processor = TaskProcessor()
        self.upload_manager = UploadManager()

    def process_task(self, task_id):
        """Process an assigned build task"""
        # Download task details
        # Create buildroot
        # Execute build
        # Upload results
        # Cleanup buildroot
```
**Key Responsibilities**:
- **Build Execution**: Runs actual build commands
- **Buildroot Management**: Creates and manages build environments
- **Result Upload**: Uploads build artifacts back to hub
- **Resource Management**: Manages local system resources
#### **3. CLI Client (`cli/koji`)**
The command-line interface for interacting with Koji:
```python
def main():
    parser = argparse.ArgumentParser()
    # ... argument definitions elided
    args = parser.parse_args()
    # Load configuration
    config = load_config()
    # Create session
    session = create_session(config)
    # Execute command
    result = execute_command(session, args)
```
**Key Responsibilities**:
- **Command Parsing**: Handles user commands and arguments
- **Session Management**: Manages authentication and sessions
- **Plugin Loading**: Loads and executes CLI plugins
- **Output Formatting**: Formats results for display
**Mock Integration Commands**:
```python
def handle_gen_mock_config(goptions, session, args):
    """Generate Mock configuration for Koji build target"""
    # `options` and `opts` come from option parsing, which is elided here
    # Parse build target and architecture
    name, arch = args[0], args[1]
    # Get build configuration
    buildcfg = session.getBuildConfig(name)
    # Generate Mock configuration
    output = koji.genMockConfig(name, arch, **opts)
    # Output to file or stdout
    if options.ofile:
        with open(options.ofile, 'wt') as fo:
            fo.write(output)
    else:
        print(output)
```
**Available Mock Commands**:
- **`koji gen-mock-config`**: Generate Mock configuration for a build target
- **`koji list-mock-configs`**: List available Mock configurations
- **`koji clean-mock-env`**: Clean up Mock environments
### **Data Flow Architecture**
#### **1. Task Creation & Scheduling**
```python
def make_task(method, args, **opts):
    """Create and schedule a new build task"""
    # Check user permissions before anything is written
    check_task_permissions(method, args)
    # Validate task parameters and create the task record
    task_id = create_task_record(method, args, opts)
    # Schedule task for execution
    schedule_task(task_id, opts)
    return task_id
```
#### **2. Task Distribution**
```python
def get_tasks_for_host(hostID, retry=True):
    """Get tasks assigned to a specific builder host"""
    query = QueryProcessor(
        columns=['task.id', 'task.state', 'task.method'],
        tables=['task'],
        clauses=['host_id = %(hostID)s', 'state=%(assigned)s'],
        values={'hostID': hostID, 'assigned': TASK_STATES['ASSIGNED']},
        opts={'order': 'priority,create_ts'},
    )
    return query.execute()
```
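The effect of the `priority,create_ts` ordering can be tried without a hub. The following is a minimal sketch using an in-memory SQLite table; the table layout, the `demo` helper, and the numeric state values are assumptions for illustration, not Koji's actual schema.

```python
import sqlite3

# Assumed numeric state values; only the filtering and ordering matter here.
TASK_STATES = {"FREE": 0, "OPEN": 1, "CLOSED": 2,
               "CANCELED": 3, "ASSIGNED": 4, "FAILED": 5}


def get_tasks_for_host(conn, host_id):
    """Return (id, method) rows for tasks assigned to host_id.

    Rows are ordered by ascending priority value, then by creation time.
    """
    rows = conn.execute(
        "SELECT id, method FROM task "
        "WHERE host_id = :host_id AND state = :assigned "
        "ORDER BY priority, create_ts",
        {"host_id": host_id, "assigned": TASK_STATES["ASSIGNED"]},
    )
    return rows.fetchall()


def demo():
    """Build a throwaway task table and query it for host 7."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task (id INTEGER PRIMARY KEY, method TEXT, "
        "state INTEGER, host_id INTEGER, priority INTEGER, create_ts REAL)"
    )
    conn.executemany(
        "INSERT INTO task VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "buildArch", 4, 7, 20, 100.0),
            (2, "newRepo", 4, 7, 15, 200.0),
            (3, "buildArch", 4, 8, 10, 50.0),  # assigned to another host
            (4, "buildArch", 0, 7, 5, 10.0),   # FREE, so not returned
            (5, "tagBuild", 4, 7, 15, 150.0),
        ],
    )
    return get_tasks_for_host(conn, 7)
```

Because the sort is ascending, tasks with a smaller priority value come first, and tasks with equal priority are served oldest first.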
#### **3. Build Execution**
```python
def execute_build_task(task_id, buildroot_path):
    """Execute a build task in the specified buildroot"""
    # Download source packages
    download_sources(task_id)
    # Install build dependencies
    install_build_deps(buildroot_path)
    # Execute build commands
    result = run_build_commands(buildroot_path)
    # Upload build results
    upload_results(task_id, result)
    return result
```
## Key Design Patterns & Philosophies
### **1. Task-Based Architecture**
Koji uses a **task-oriented design** where all operations are represented as tasks:
```python
class BaseTaskHandler:
    """Base class for all task handlers"""

    def __init__(self, task_id, method, params):
        self.task_id = task_id
        self.method = method
        self.params = params

    def run(self):
        """Execute the task"""
        raise NotImplementedError

    def cleanup(self):
        """Clean up after task execution"""
        pass
```
**Task Types**:
- **Build Tasks**: Package building operations
- **Image Tasks**: Image creation (Kiwi, OSBuild)
- **Repository Tasks**: Repository management
- **Admin Tasks**: System administration operations
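
How task types map onto handler classes can be sketched with a small registry. This is a simplified illustration, not Koji's dispatch code: the `methods` attribute, `build_registry`, `run_task`, and the two toy handlers are hypothetical names.

```python
class BaseTaskHandler:
    """Minimal stand-in for the base class shown above."""

    methods = []  # hypothetical: task method names this handler serves

    def __init__(self, task_id, method, params):
        self.task_id = task_id
        self.method = method
        self.params = params

    def run(self):
        raise NotImplementedError

    def cleanup(self):
        pass


class EchoTask(BaseTaskHandler):
    methods = ["echo"]

    def run(self):
        return list(self.params)


class SumTask(BaseTaskHandler):
    methods = ["sum"]

    def run(self):
        return sum(self.params)


def build_registry(*handler_classes):
    """Map every declared method name to its handler class."""
    registry = {}
    for cls in handler_classes:
        for name in cls.methods:
            registry[name] = cls
    return registry


def run_task(registry, task_id, method, params):
    """Instantiate the handler for `method`, run it, and always clean up."""
    if method not in registry:
        raise ValueError("no handler for task method: %s" % method)
    handler = registry[method](task_id, method, params)
    try:
        return handler.run()
    finally:
        handler.cleanup()
```

Adding a new task type then means adding one subclass and registering it; the dispatcher itself does not change.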
### **2. Plugin-Based Extensibility**
Koji implements a sophisticated plugin system:
```python
class PluginTracker:
    """Manages plugin loading and execution"""

    def __init__(self):
        self.plugins = {}
        self.handlers = {}

    def load(self, name, path=None, reload=False):
        """Load a plugin from the specified path"""
        if name in self.plugins and not reload:
            return self.plugins[name]
        # Load plugin module
        plugin = self._load_module(name, path)
        self.plugins[name] = plugin
        # Register handlers
        self._register_handlers(plugin)
        return plugin
```
**Plugin Categories**:
- **Hub Plugins**: Extend server functionality
- **Builder Plugins**: Extend build process
- **CLI Plugins**: Extend command-line interface
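
One common way to let plugins extend a core without modifying it is a callback registry. The sketch below shows the general pattern only; `CallbackRegistry`, the `post_build` event name, and both callbacks are hypothetical and do not mirror Koji's plugin API.

```python
from collections import defaultdict


class CallbackRegistry:
    """Collects plugin callbacks per event and runs them in order."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, event):
        """Decorator that attaches a function to the named event."""
        def decorator(func):
            self._callbacks[event].append(func)
            return func
        return decorator

    def run(self, event, **kwargs):
        """Call every callback for `event` and collect the return values."""
        return [func(**kwargs) for func in self._callbacks[event]]


registry = CallbackRegistry()


@registry.register("post_build")
def announce(build_id, **_):
    return "build %d finished" % build_id


@registry.register("post_build")
def tag_hint(build_id, tag=None, **_):
    return "tag %s with build %d" % (tag, build_id)
```

The core only calls `registry.run("post_build", ...)`; which callbacks respond depends entirely on which plugins were loaded.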
### **3. Database-Centric Design**
Koji uses PostgreSQL as its primary data store:
```python
class QueryProcessor:
    """Database query processor with SQL injection protection"""

    def __init__(self, columns, tables, clauses=None, values=None, opts=None):
        self.columns = columns
        self.tables = tables
        self.clauses = clauses or []
        self.values = values or {}
        self.opts = opts or {}

    def execute(self):
        """Execute the query and return results"""
        sql = self._build_sql()
        return self._execute_sql(sql, self.values)
```
**Database Schema**:
- **Task Management**: Build tasks and their states
- **Build Records**: Build metadata and results
- **User Management**: Users, permissions, and sessions
- **Host Management**: Builder hosts and their capabilities
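
The injection protection comes from keeping values out of the SQL text: the statement carries only named placeholders, and the values travel separately to the database driver. A minimal sketch of that assembly step follows; the `build_sql` method and its `limit` option are assumptions for illustration, not the hub's actual implementation.

```python
class QueryProcessor:
    """Assembles a SELECT statement; values are never interpolated."""

    def __init__(self, columns, tables, clauses=None, values=None, opts=None):
        self.columns = columns
        self.tables = tables
        self.clauses = clauses or []
        self.values = values or {}
        self.opts = opts or {}

    def build_sql(self):
        """Return SQL text that still contains %(name)s placeholders."""
        sql = "SELECT " + ", ".join(self.columns)
        sql += " FROM " + ", ".join(self.tables)
        if self.clauses:
            sql += " WHERE " + " AND ".join("(%s)" % c for c in self.clauses)
        order = self.opts.get("order")
        if order:
            sql += " ORDER BY " + order
        limit = self.opts.get("limit")
        if limit is not None:
            sql += " LIMIT %d" % int(limit)
        return sql


query = QueryProcessor(
    columns=["task.id", "task.state"],
    tables=["task"],
    clauses=["host_id = %(host_id)s", "state = %(state)s"],
    values={"host_id": 7, "state": 4},
    opts={"order": "priority,create_ts", "limit": 10},
)
```

A driver that accepts the `pyformat` parameter style (psycopg2 does) would then receive `query.build_sql()` and `query.values` as two separate arguments.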
### **4. Security-First Approach**
Koji implements comprehensive security measures:
```python
class Session:
    """Manages user authentication and authorization"""

    def __init__(self, args=None, hostip=None):
        self.logged_in = False
        self.id = None
        self.user_id = None
        self.perms = None

    def assertPerm(self, permission):
        """Assert that the user has a specific permission"""
        if not self.hasPerm(permission):
            raise koji.ActionNotAllowed(
                'permission denied: %s' % permission
            )
```
**Security Features**:
- **Session Management**: Secure session handling
- **Permission System**: Granular access control
- **Build Isolation**: Complete isolation between builds
- **Audit Logging**: Comprehensive operation logging
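
The permission check can be illustrated with a self-contained stand-in. In this sketch the constructor arguments, the local `ActionNotAllowed` class, and the rule that `admin` implies every other permission are assumptions made for the example.

```python
class ActionNotAllowed(Exception):
    """Local stand-in for the exception raised on a denied action."""


class Session:
    """Toy session holding a user id and a set of permission names."""

    def __init__(self, user_id=None, perms=()):
        self.user_id = user_id
        self.logged_in = user_id is not None
        self.perms = frozenset(perms)

    def hasPerm(self, name):
        # Assumption: anonymous sessions hold no permissions at all,
        # and "admin" grants everything.
        if not self.logged_in:
            return False
        return "admin" in self.perms or name in self.perms

    def assertPerm(self, name):
        if not self.hasPerm(name):
            raise ActionNotAllowed("permission denied: %s" % name)
```

Calling `assertPerm` at the top of every exported operation keeps the check in one place and makes a missing check easy to spot in review.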
## Advanced Features
### **1. Multi-Architecture Support**
Koji handles complex multi-arch scenarios:
```python
def parse_arches(arches, to_list=False, strict=True, allow_none=False):
    """Parse architecture specifications"""
    if arches is None:
        if allow_none:
            return [] if to_list else ""
        raise koji.GenericError("No architectures specified")
    if isinstance(arches, str):
        arches = arches.split()
    # Validate architectures
    for arch in arches:
        if arch not in koji.arch.arches:
            if strict:
                raise koji.GenericError("Unknown architecture: %s" % arch)
            else:
                logger.warning("Unknown architecture: %s" % arch)
    return arches if to_list else " ".join(arches)
```
### **2. Buildroot Management**
Koji creates isolated build environments:
```python
def create_buildroot(build_id, arch, target_info):
    """Create a new buildroot for a build"""
    # Create buildroot directory
    buildroot_path = os.path.join(BUILDROOT_DIR, str(build_id))
    os.makedirs(buildroot_path)
    # Initialize package manager
    package_manager = init_package_manager(arch)
    # Install base packages
    install_base_packages(package_manager, target_info)
    # Install build dependencies
    install_build_deps(package_manager, target_info)
    return buildroot_path
```
### **3. Task Scheduling**
Koji implements sophisticated task scheduling:
```python
class TaskScheduler:
    """Manages task scheduling and distribution"""

    def __init__(self):
        self.hosts = {}
        self.tasks = {}

    def schedule_tasks(self):
        """Schedule pending tasks to available hosts"""
        pending_tasks = self._get_pending_tasks()
        available_hosts = self._get_available_hosts()
        for task in pending_tasks:
            host = self._find_best_host(task, available_hosts)
            if host:
                self._assign_task(task, host)
                available_hosts[host]['capacity'] -= 1
```
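The host-selection step is left abstract above. A minimal sketch of one possible policy follows, assuming each host advertises a set of architectures, a set of channels, and an integer capacity; the function names and the "most free capacity wins" rule are illustrative choices, not Koji's scheduler.

```python
def find_best_host(task, hosts, free):
    """Pick the eligible host with the most free capacity, or None.

    Ties are broken by host name so the result is deterministic.
    """
    candidates = [
        name for name, info in hosts.items()
        if task["arch"] in info["arches"]
        and task["channel"] in info["channels"]
        and free[name] > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda name: (-free[name], name))


def schedule(tasks, hosts):
    """Assign tasks in (priority, id) order; return {task_id: host_name}."""
    free = {name: info["capacity"] for name, info in hosts.items()}
    assignments = {}
    for task in sorted(tasks, key=lambda t: (t["priority"], t["id"])):
        host = find_best_host(task, hosts, free)
        if host is not None:
            assignments[task["id"]] = host
            free[host] -= 1  # one slot on that host is now taken
    return assignments


hosts = {
    "a": {"arches": {"x86_64"}, "channels": {"default"}, "capacity": 1},
    "b": {"arches": {"x86_64", "aarch64"}, "channels": {"default"}, "capacity": 2},
}
tasks = [
    {"id": 1, "arch": "x86_64", "channel": "default", "priority": 20},
    {"id": 2, "arch": "aarch64", "channel": "default", "priority": 10},
    {"id": 3, "arch": "x86_64", "channel": "default", "priority": 20},
    {"id": 4, "arch": "ppc64le", "channel": "default", "priority": 20},
]
```

Tasks that no host can serve (task 4 here) simply stay unassigned and would be picked up on a later scheduling pass.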
### **4. Plugin System**
Koji's plugin architecture enables extensive customization:
```python
@export
def kiwiBuild(target, arches, desc_url, desc_path, **opts):
    """Kiwi image building plugin"""
    # Check permissions
    context.session.assertPerm('image')
    # Validate parameters
    validate_kiwi_params(desc_url, desc_path, opts)
    # Create build task
    task_id = kojihub.make_task('kiwiBuild',
                                [target, arches, desc_url, desc_path, opts],
                                channel='image')
    return task_id
```
## Integration Points
### **The Koji-Mock Workflow**
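Koji's builder daemon relies on Mock to create its buildroots: kojid writes a Mock configuration for each buildroot, and Mock sets up the chroot in which the build runs: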
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Koji       │     │      Mock       │     │  Build Process  │
│  Orchestrator   │     │   Environment   │     │    Execution    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         │ 1. Generate Config    │                       │
         │──────────────────────▶│                       │
         │                       │                       │
         │ 2. Track Buildroot    │                       │
         │◀──────────────────────│                       │
         │                       │                       │
         │                       │ 3. Execute Build      │
         │                       │──────────────────────▶│
         │                       │                       │
         │                       │ 4. Build Complete     │
         │                       │◀──────────────────────│
         │                       │                       │
         │ 5. Cleanup            │                       │
         │──────────────────────▶│                       │
```
**Workflow Stages**:
1. **Koji Generates Config**: Koji creates Mock configuration files in `/etc/mock/koji/`
2. **Mock Environment Setup**: Mock uses the config to create isolated chroot environments
3. **Build Execution**: Mock executes builds within the isolated environment
4. **Result Collection**: Mock provides build results back to Koji
5. **Environment Cleanup**: Koji manages the lifecycle of Mock-based buildroots
**Why This Integration Works**:
- **Isolation**: Mock gives each Koji buildroot its own chroot, so builds cannot see each other's files
- **Repository Integration**: Mock configurations automatically use Koji's package repositories
- **Lifecycle Management**: Koji tracks every Mock environment it creates and removes it when the buildroot expires
- **CLI Support**: Koji provides `koji gen-mock-config` command for easy Mock setup
### **1. Package Management Integration**
Koji integrates with RPM-based package management:
```python
def install_packages(packages, buildroot_path):
    """Install packages in the buildroot"""
    # Configure package manager
    dnf_config = create_dnf_config(buildroot_path)
    # Install packages
    cmd = ['dnf', '--config', dnf_config, 'install', '-y'] + packages
    # text=True decodes stderr so the error message below is readable
    result = subprocess.run(cmd, cwd=buildroot_path, capture_output=True, text=True)
    if result.returncode != 0:
        raise koji.GenericError("Package installation failed: %s" % result.stderr)
```
### **2. Build Tool Integration**
Koji integrates with various build tools:
```python
def execute_build_command(buildroot_path, build_spec):
    """Execute the actual build command"""
    if build_spec['type'] == 'rpm':
        return execute_rpm_build(buildroot_path, build_spec)
    elif build_spec['type'] == 'kiwi':
        return execute_kiwi_build(buildroot_path, build_spec)
    elif build_spec['type'] == 'osbuild':
        return execute_osbuild(buildroot_path, build_spec)
    else:
        raise koji.GenericError("Unknown build type: %s" % build_spec['type'])
```
### **3. Mock Integration**
Koji has built-in support for Mock-based build environments:
```python
def genMockConfig(name, arch, managed=False, repoid=None, tag_name=None, **opts):
    """Generate a mock config for Koji-managed buildroots

    Returns a string containing the config
    The generated config is compatible with mock >= 0.8.7
    """
    config_opts = {
        'root': name,
        'basedir': opts.get('mockdir', '/var/lib/mock'),
        'target_arch': opts.get('target_arch', arch),
        'chroothome': '/builddir',
        'chroot_setup_cmd': 'install @%s' % opts.get('install_group', 'build'),
        'rpmbuild_networking': opts.get('use_host_resolv', False),
        'rpmbuild_timeout': opts.get('rpmbuild_timeout', 86400),
    }
    # Generate repository URLs for the Mock config
    urls = []  # stays empty when no repo id and tag name are given
    if repoid and tag_name:
        pathinfo = PathInfo(topdir=opts.get('topdir', '/mnt/koji'))
        repodir = pathinfo.repo(repoid, tag_name)
        urls = ["file://%s/%s" % (repodir, arch)]
    return generate_mock_config(config_opts, urls)
```
**How Koji Uses Mock**:
- **Configuration Generation**: Koji generates Mock configuration files in `/etc/mock/koji/`
- **Buildroot Management**: Koji tracks Mock-based buildroots by parsing config files
- **Repository Integration**: Koji configures Mock to use Koji's package repositories
- **Lifecycle Management**: Koji manages the creation, monitoring, and cleanup of Mock environments
### **4. Repository Integration**
Koji manages package repositories:
```python
def create_repository(tag_info, build_target):
    """Create a package repository for a build target"""
    # Generate repository metadata
    metadata = generate_repo_metadata(tag_info)
    # Create repository structure
    repo_path = create_repo_structure(build_target)
    # Add packages to repository
    add_packages_to_repo(repo_path, tag_info['packages'])
    # Generate repository indexes
    generate_repo_indexes(repo_path)
    return repo_path
```
## Performance Characteristics
### **1. Scalability**
Koji is designed for large-scale operations:
- **Horizontal Scaling**: Can distribute builds across hundreds of builder hosts
- **Load Balancing**: Intelligent task distribution based on host capabilities
- **Parallel Execution**: Multiple builds can run simultaneously
- **Resource Management**: Efficient use of builder resources
### **2. Resource Usage**
Koji manages resources carefully:
- **Buildroot Isolation**: Each build runs in its own environment
- **Memory Management**: Controlled memory usage during builds
- **Disk Space**: Efficient use of disk space with cleanup procedures
- **Network Optimization**: Optimized file transfers and uploads
### **3. Monitoring & Observability**
Comprehensive monitoring capabilities:
```python
def log_build_metrics(build_id, metrics):
    """Log build performance metrics"""
    insert = InsertProcessor(
        'build_metrics',
        data={
            'build_id': build_id,
            'start_time': metrics['start_time'],
            'end_time': metrics['end_time'],
            'duration': metrics['duration'],
            'memory_peak': metrics['memory_peak'],
            'disk_usage': metrics['disk_usage']
        }
    )
    insert.execute()
```
## Comparison with deb-compose Vision
### **Similarities**
- **Distributed architecture**: Both use distributed systems for scalability
- **Task-based design**: Both organize work into discrete tasks
- **Plugin system**: Both support extensibility through plugins
- **Build isolation**: Both ensure builds run in isolated environments
### **Key Differences**
- **Package Management**: Koji uses RPM, deb-compose uses DEB
- **Build Focus**: Koji focuses on package building, deb-compose on image composition
- **Architecture**: Koji is client-server, deb-compose is more monolithic
- **Integration**: Koji has deeper integration with build tools
### **Relationship with Pungi and Mock**
Koji serves as the foundation for Fedora's build ecosystem:
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│      Pungi      │     │      Koji       │     │      Mock       │
│  Orchestrator   │     │  Build System   │     │Build Environment│
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         │ 1. Request Packages   │                       │
         │──────────────────────▶│                       │
         │                       │                       │
         │ 2. Packages Ready     │                       │
         │◀──────────────────────│                       │
         │                       │                       │
         │                       │ 3. Use Mock Config    │
         │                       │◀──────────────────────│
         │                       │                       │
         │ 4. Build Complete     │                       │
         │◀──────────────────────│                       │
```
**Koji's Role**:
- **Package Foundation**: Provides pre-built RPM packages for Pungi to compose
- **Mock Integration**: Generates Mock configurations that Pungi can use
- **Build Coordination**: Manages build tasks and buildroot lifecycle
- **Repository Management**: Maintains package repositories for Mock environments
## Lessons for deb-compose
### **1. Architecture Strengths to Emulate**
- **Task-based design**: Clear separation of build operations
- **Plugin system**: Extensibility without core changes
- **Build isolation**: Complete isolation between builds
- **Scalability**: Distributed architecture for growth
### **2. Complexity to Avoid Initially**
- **Multi-tier architecture**: Start with a simpler client-server model
- **Complex scheduling**: Begin with basic task distribution
- **Advanced plugins**: Focus on core functionality first
- **Enterprise features**: Implement basic features before advanced ones
### **3. Implementation Priorities**
- **Core build system**: Focus on basic build orchestration
- **Simple task management**: Basic task creation and execution
- **Build isolation**: Ensure builds don't interfere with each other
- **Plugin framework**: Simple plugin system for extensibility
## Technical Implementation Details
### **Entry Point Architecture**
Koji's main entry point demonstrates its distributed approach:
```python
def main():
    # Parse command line arguments
    parser = create_argument_parser()
    args = parser.parse_args()
    # Load configuration
    config = load_config(args.config)
    # Create session
    session = create_session(config, args)
    # Execute command
    if args.command == 'build':
        result = execute_build_command(session, args)
    elif args.command == 'list-tasks':
        result = execute_list_tasks_command(session, args)
    # ... more commands
```
### **Task Management System**
Koji's task system is highly sophisticated:
```python
class TaskManager:
    def __init__(self):
        self.tasks = {}
        self.hosts = {}

    def create_task(self, method, args, **opts):
        """Create a new task"""
        task_id = self._generate_task_id()
        task = {
            'id': task_id,
            'method': method,
            'args': args,
            'state': 'FREE',
            'priority': opts.get('priority', 0),
            'arch': opts.get('arch'),
            'channel': opts.get('channel', 'default')
        }
        self.tasks[task_id] = task
        self._schedule_task(task_id)
        return task_id
```
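A task created in the `FREE` state then moves through a small set of states. The sketch below models that lifecycle as an explicit transition table. The state names follow Koji's task states, but the table itself and the `advance` helper are assumptions for illustration and are not taken from the Koji source.

```python
# Assumption: which transitions are legal is an illustrative choice.
ALLOWED_TRANSITIONS = {
    "FREE": {"ASSIGNED", "CANCELED"},
    "ASSIGNED": {"OPEN", "FREE", "CANCELED"},
    "OPEN": {"CLOSED", "FAILED", "CANCELED", "FREE"},
    "CLOSED": set(),    # terminal
    "FAILED": set(),    # terminal
    "CANCELED": set(),  # terminal
}


class InvalidTransition(Exception):
    """Raised when a task is moved to a state it cannot reach."""


def advance(task, new_state):
    """Move `task` to `new_state` if the transition table allows it."""
    current = task["state"]
    if new_state not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition("%s -> %s" % (current, new_state))
    task["state"] = new_state
    return task
```

Keeping the table in one place makes it hard for a code path to accidentally reopen a finished task.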
### **Buildroot Management**
Koji's buildroot system ensures build isolation:
```python
def create_buildroot(build_id, arch, target_info):
    """Create an isolated build environment"""
    # Create buildroot directory
    buildroot_path = os.path.join(BUILDROOT_DIR, str(build_id))
    os.makedirs(buildroot_path)
    # Mount necessary filesystems
    mount_buildroot_filesystems(buildroot_path)
    # Initialize package manager
    package_manager = init_package_manager(buildroot_path, arch)
    # Install base system
    install_base_system(package_manager, target_info)
    # Install build dependencies
    install_build_dependencies(package_manager, target_info)
    return buildroot_path
```
### **Mock Buildroot Integration**
The builder discovers the Mock-based buildroots it manages by scanning the generated config files:
```python
def _scanLocalBuildroots(self):
    """Scan for Mock-based buildroots managed by Koji"""
    configdir = '/etc/mock/koji'
    buildroots = {}
    for f in os.listdir(configdir):
        if not f.endswith('.cfg'):
            continue
        # Reset per file so values never leak in from a previous config
        buildroot_id = None
        buildroot_name = None
        # Parse Mock config files to find Koji buildroot IDs
        with open(os.path.join(configdir, f)) as fo:
            for line in fo:
                if line.startswith('# Koji buildroot id:'):
                    buildroot_id = int(line.split(':')[1])
                elif line.startswith('# Koji buildroot name:'):
                    buildroot_name = line.split(':')[1].strip()
        if buildroot_id is not None and buildroot_name:
            buildroots[buildroot_id] = {
                'name': buildroot_name,
                'cfg': os.path.join(configdir, f),
                'dir': os.path.join(self.options.mockdir, buildroot_name)
            }
    return buildroots
```
**Mock Buildroot Features**:
- **Configuration Tracking**: Koji tracks Mock buildroots through config file parsing
- **Lifecycle Management**: Koji manages Mock environment creation, monitoring, and cleanup
- **Repository Integration**: Mock configs automatically use Koji's package repositories
- **Buildroot State**: Koji maintains buildroot state information for Mock environments
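
Tracking through config files works because each config carries its buildroot id and name in header comments. The round trip can be sketched as follows, assuming a temporary directory in place of `/etc/mock/koji`; the helper names and the sample buildroot names are hypothetical.

```python
import os
import tempfile


def write_config(configdir, buildroot_id, name):
    """Write a tiny Mock-style config with the two tracking comments."""
    path = os.path.join(configdir, name + ".cfg")
    with open(path, "w") as fo:
        fo.write("# Koji buildroot id: %d\n" % buildroot_id)
        fo.write("# Koji buildroot name: %s\n" % name)
        fo.write("config_opts['root'] = %r\n" % name)
    return path


def scan_configs(configdir):
    """Return {buildroot_id: name} for every tracked .cfg file."""
    found = {}
    for fname in sorted(os.listdir(configdir)):
        if not fname.endswith(".cfg"):
            continue
        buildroot_id = None
        name = None
        with open(os.path.join(configdir, fname)) as fo:
            for line in fo:
                if line.startswith("# Koji buildroot id:"):
                    buildroot_id = int(line.split(":", 1)[1])
                elif line.startswith("# Koji buildroot name:"):
                    name = line.split(":", 1)[1].strip()
        if buildroot_id is not None and name:
            found[buildroot_id] = name
    return found


def demo():
    """Write two configs plus an unrelated file, then scan them back."""
    with tempfile.TemporaryDirectory() as configdir:
        write_config(configdir, 101, "f40-build-101-1")
        write_config(configdir, 102, "f40-build-102-1")
        with open(os.path.join(configdir, "notes.txt"), "w") as fo:
            fo.write("ignored\n")
        return scan_configs(configdir)
```

Since the state lives in the files themselves, a restarted builder can rebuild its view of local buildroots without asking the hub.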
## Production Readiness Features
### **1. Authentication & Security**
- **Multiple Auth Methods**: SSL, Kerberos, OIDC support
- **Session Management**: Secure session handling with timeouts
- **Permission System**: Granular permissions for all operations
- **Audit Logging**: Complete audit trail of all operations
### **2. Monitoring & Alerting**
- **Task Monitoring**: Real-time task status monitoring
- **Host Monitoring**: Builder host health monitoring
- **Performance Metrics**: Build performance tracking
- **Failure Alerting**: Immediate alerts on build failures
### **3. Recovery & Resilience**
- **Task Retry**: Automatic retry for failed tasks
- **Host Failover**: Automatic failover to healthy hosts
- **State Persistence**: Maintains state across restarts
- **Cleanup Procedures**: Automatic cleanup of failed builds
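
Bounded retry is the simplest of these mechanisms to prototype. The following is a minimal sketch, assuming a task is a callable that receives the attempt number and signals a retryable failure with `RuntimeError`; both conventions, and the `run_with_retry` name, are invented for the example and say nothing about how Koji retries tasks.

```python
def run_with_retry(task, max_attempts=3):
    """Call task(attempt) until it succeeds or attempts run out.

    Returns (result, errors) where errors lists the failures seen
    before the successful attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return task(attempt), errors
        except RuntimeError as exc:
            errors.append(str(exc))
    raise RuntimeError(
        "task failed after %d attempts: %s" % (max_attempts, errors[-1])
    )
```

Returning the earlier errors alongside the result keeps transient failures visible in logs even when the task eventually succeeds.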
## Conclusion
Koji represents a **mature, enterprise-grade build system** that has evolved over decades to handle Fedora's massive scale. Its key insight is that **build orchestration is more valuable than build execution** - by coordinating build processes across multiple hosts rather than building everything locally, it achieves scalability and reliability.
For deb-compose, the lesson is clear: **focus on being an excellent build orchestrator** rather than trying to implement everything. Koji's success comes from its ability to coordinate complex build workflows while delegating actual build execution to specialized builder hosts. This architecture allows it to handle massive scale while remaining maintainable and extensible.
The roadmap's approach of building incrementally with clear phases aligns well with Koji's proven architecture. By starting with core build orchestration and gradually adding complexity, deb-compose can achieve similar reliability without the initial complexity that Koji has accumulated over years of production use.
### **Key Takeaways for deb-compose Development**
1. **Start Simple**: Begin with basic build orchestration rather than complex features
2. **Delegate Wisely**: Focus on coordination, not implementation
3. **Isolate Builds**: Ensure complete isolation between build environments
4. **Grow Incrementally**: Add complexity only when needed
5. **Learn from Koji**: Study Koji's patterns but avoid its complexity initially
This analysis provides a solid foundation for understanding how to build a successful build orchestration system while avoiding the pitfalls of over-engineering early in development.