deb-osbuild/docs/osbuild.md
robojerk 544eb61951 docs: Reorganize documentation into proper docs/ directory structure
- Moved all documentation files to docs/ directory for better organization
- Maintained all existing documentation content
- Improved project structure for better maintainability
- Documentation now follows standard open source project layout
2025-08-12 00:27:41 -07:00


# osbuild Comprehensive Top-to-Bottom Analysis
## Overview
osbuild is a pipeline-based build system for operating system artifacts that defines a universal pipeline description and execution engine. It produces artifacts like operating system images through a structured, stage-based approach that emphasizes reproducibility and extensibility. This document provides a complete top-to-bottom analysis of the entire osbuild process, from manifest parsing to final artifact generation.
## Table of Contents
1. [Complete Process Flow](#complete-process-flow)
2. [Core Architecture](#core-architecture)
3. [Pipeline System](#pipeline-system)
4. [Stage System](#stage-system)
5. [Assembler System](#assembler-system)
6. [External Tools and Dependencies](#external-tools-and-dependencies)
7. [Manifest Processing](#manifest-processing)
8. [Build Execution Engine](#build-execution-engine)
9. [Object Store and Caching](#object-store-and-caching)
10. [Security and Isolation](#security-and-isolation)
11. [Integration Points](#integration-points)
12. [Advanced Features](#advanced-features)
13. [Performance Characteristics](#performance-characteristics)
14. [Conclusion](#conclusion)
15. [Complete Workflow Examples](#complete-workflow-examples)
## Complete Process Flow
### End-to-End Workflow
```
1. Manifest Loading → 2. Schema Validation → 3. Pipeline Construction → 4. Stage Execution → 5. Object Storage → 6. Assembly → 7. Artifact Generation → 8. Cleanup
```
### Detailed Process Steps
#### **Phase 1: Manifest Processing**
1. **File Loading**: Read JSON manifest file from disk or stdin
2. **Schema Validation**: Validate against JSON schemas for stages and assemblers
3. **Pipeline Construction**: Build stage dependency graph and execution order
4. **Source Resolution**: Download and prepare input sources
5. **Configuration**: Set up build environment and options
#### **Phase 2: Build Environment Setup**
1. **BuildRoot Creation**: Initialize isolated build environment
2. **Resource Allocation**: Set up temporary directories and mounts
3. **Capability Management**: Configure process capabilities and security
4. **API Registration**: Set up communication endpoints
5. **Device Management**: Configure loop devices and partitions
#### **Phase 3: Stage Execution**
1. **Dependency Resolution**: Execute stages in dependency order
2. **Input Processing**: Map input objects to stage requirements
3. **Environment Setup**: Mount filesystems and configure devices
4. **Stage Execution**: Run stage scripts in isolated environment
5. **Output Collection**: Store stage results in object store
#### **Phase 4: Assembly and Output**
1. **Object Collection**: Gather all stage outputs
2. **Assembly Execution**: Run assembler to create final artifact
3. **Format Conversion**: Convert to requested output format
4. **Artifact Generation**: Create final image or archive
5. **Cleanup**: Remove temporary files and mounts
### Process Architecture Overview
```
osbuild CLI → Manifest Parser → Pipeline Builder → Stage Executor → Object Store → Assembler → Final Artifact
     ↓               ↓                  ↓                 ↓               ↓            ↓              ↓
Main Entry      JSON Schema     Dependency Graph    Stage Runner        Cache     Output Gen    Image/Archive
```
## Core Architecture
### Design Philosophy
osbuild follows a **declarative pipeline architecture** where:
- **Manifests** define the complete build process as JSON
- **Stages** are atomic, composable building blocks
- **Assemblers** create final artifacts from stage outputs
- **Pipelines** orchestrate stage execution and data flow
### Key Components
```
osbuild CLI → Manifest Parser → Pipeline Executor → Stage Runner → Assembler → Artifacts
     ↓               ↓                  ↓                ↓             ↓           ↓
Main Entry      JSON Schema     Pipeline Builder    Stage Exec    Output Gen  Final Files
```
### Architecture Principles
1. **Stages are never broken, only deprecated** - Same manifest always produces same output
2. **Explicit over implicit** - No reliance on tree state
3. **Pipeline independence** - Tree is empty at beginning of each pipeline
4. **Machine-generated manifests** - No convenience functions for manual creation
5. **Confined build environment** - Security against accidental misuse
6. **Distribution compatibility** - Python 3.6+ support
## Pipeline System
### Pipeline Structure
A pipeline is a directed acyclic graph (DAG) of stages:
```json
{
"pipeline": {
"build": {
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": { ... }
},
{
"name": "org.osbuild.apt",
"options": { ... }
}
]
},
"assembler": {
"name": "org.osbuild.qemu",
"options": { ... }
}
}
}
```
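As a rough illustration of how such a structure can be consumed, the sketch below walks a manifest shaped like the one above and returns the stage names in the order they are listed, followed by the assembler. The helper name `execution_order` is hypothetical and not part of osbuild, and the elided `options` are replaced with empty objects.

```python
import json


def execution_order(manifest):
    """Return stage names in listed order, then the assembler name (if any)."""
    pipeline = manifest.get("pipeline", {})
    names = [stage["name"] for stage in pipeline.get("build", {}).get("stages", [])]
    assembler = pipeline.get("assembler")
    if assembler:
        names.append(assembler["name"])
    return names


# Same shape as the manifest above, with empty placeholder options
manifest = json.loads("""
{
  "pipeline": {
    "build": {
      "stages": [
        {"name": "org.osbuild.debian.debootstrap", "options": {}},
        {"name": "org.osbuild.apt", "options": {}}
      ]
    },
    "assembler": {"name": "org.osbuild.qemu", "options": {}}
  }
}
""")

print(execution_order(manifest))
```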
### Pipeline Execution Model
```python
class Pipeline:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info
        self.sources = source_options
        self.build = build
        self.base = base
        self.options = options
        self.source_epoch = source_epoch
        self.checkpoint = False
        self.inputs = {}
        self.devices = {}
        self.mounts = {}
```
### Pipeline Lifecycle
1. **Initialization**: Load manifest and validate schema
2. **Preparation**: Set up build environment and dependencies
3. **Execution**: Run stages in dependency order
4. **Assembly**: Create final artifacts from stage outputs
5. **Cleanup**: Remove temporary files and resources
## Stage System
### Stage Architecture
Stages are the core building blocks of osbuild:
```python
class Stage:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info                  # Stage metadata
        self.sources = source_options     # Input sources
        self.build = build                # Build configuration
        self.base = base                  # Base tree
        self.options = options            # Stage-specific options
        self.source_epoch = source_epoch  # Source timestamp
        self.checkpoint = False           # Checkpoint flag
        self.inputs = {}                  # Input objects
        self.devices = {}                 # Device configurations
        self.mounts = {}                  # Mount configurations
```
### Stage Types
#### **System Construction Stages**
- `org.osbuild.debian.debootstrap`: Debian base system creation
- `org.osbuild.rpm`: RPM package installation
- `org.osbuild.ostree`: OSTree repository management
- `org.osbuild.apt`: Debian package management
#### **Filesystem Stages**
- `org.osbuild.overlay`: File/directory copying
- `org.osbuild.mkdir`: Directory creation
- `org.osbuild.copy`: File copying operations
- `org.osbuild.symlink`: Symbolic link creation
#### **Configuration Stages**
- `org.osbuild.users`: User account management
- `org.osbuild.fstab`: Filesystem table configuration
- `org.osbuild.locale`: Locale configuration
- `org.osbuild.timezone`: Timezone setup
#### **Bootloader Stages**
- `org.osbuild.grub2`: GRUB2 bootloader configuration
- `org.osbuild.bootupd`: Modern bootloader management
- `org.osbuild.zipl`: S390x bootloader
#### **Image Creation Stages**
- `org.osbuild.image`: Raw image creation
- `org.osbuild.qemu`: QEMU image assembly
- `org.osbuild.tar`: Archive creation
- `org.osbuild.oci-archive`: OCI container images
### Stage Implementation Pattern
Each stage follows a consistent pattern:
```python
#!/usr/bin/python3
import os
import sys

import osbuild.api


def main(tree, options):
    # Stage-specific logic
    # Process options
    # Manipulate filesystem tree
    return 0


if __name__ == '__main__':
    args = osbuild.api.arguments()
    ret = main(args["tree"], args["options"])
    sys.exit(ret)
```
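To make the pattern concrete, here is a minimal sketch of a stage body that creates directories inside the tree. The option layout (`paths`, `path`, `mode`) is an assumption chosen for illustration and is not taken from an osbuild stage schema. The example runs against a temporary directory rather than a real build root.

```python
import os
import tempfile


def main(tree, options):
    """Create each directory listed in options["paths"] inside the tree."""
    for item in options.get("paths", []):
        # Strip the leading slash so the path resolves inside the tree
        target = os.path.join(tree, item["path"].lstrip("/"))
        os.makedirs(target, mode=item.get("mode", 0o755), exist_ok=True)
    return 0


# Exercise the stage against a throwaway tree instead of a real build root
with tempfile.TemporaryDirectory() as tree:
    ret = main(tree, {"paths": [{"path": "/etc/example.d"}]})
    created = os.path.isdir(os.path.join(tree, "etc/example.d"))
    print(ret, created)
```

Returning `0` mirrors the exit-code convention of the pattern above, where the return value of `main` is handed to `sys.exit`.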
### Key Stages Deep Dive
#### **GRUB2 Stage** (`stages/org.osbuild.grub2.iso`)
**Purpose**: Configure GRUB2 bootloader for ISO images
**Implementation**: Template-based GRUB configuration generation
```python
GRUB2_EFI_CFG_TEMPLATE = """$defaultentry
function load_video {
insmod efi_gop
insmod efi_uga
insmod video_bochs
insmod video_cirrus
insmod all_video
}
load_video
set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2
set timeout=${timeout}
search --no-floppy --set=root -l '${isolabel}'
menuentry 'Install ${product} ${version}' --class fedora --class gnu-linux --class gnu --class os {
linux ${kernelpath} ${root} quiet
initrd ${initrdpath}
}
"""
```
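The `${...}` placeholders use the syntax of Python's `string.Template`. The sketch below fills a shortened stand-in for the template; every value passed to `substitute` is made up for illustration.

```python
from string import Template

# Shortened stand-in for the full template; the values below are illustrative only
CFG_TEMPLATE = Template(
    "set timeout=${timeout}\n"
    "search --no-floppy --set=root -l '${isolabel}'\n"
    "menuentry 'Install ${product} ${version}' {\n"
    "    linux ${kernelpath} ${root} quiet\n"
    "}\n"
)

config = CFG_TEMPLATE.substitute(
    timeout=60,
    isolabel="Debian-12-x86_64",
    product="Debian",
    version="12",
    kernelpath="/images/vmlinuz",
    root="root=LABEL=Debian-12-x86_64",
)
print(config)
```

`substitute` raises `KeyError` when a placeholder has no value, which surfaces a missing option at generation time instead of writing a broken configuration.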
**External Tools Used**:
- `shim*.efi`: Secure boot components
- `grub*.efi`: GRUB bootloader binaries
- `unicode.pf2`: GRUB font files
#### **bootupd Stage** (`stages/org.osbuild.bootupd.gen-metadata`)
**Purpose**: Generate bootupd update metadata
**Implementation**: Chroot execution of bootupctl
```python
def main(tree):
    with MountGuard() as mounter:
        # Mount essential directories
        for source in ("/dev", "/sys", "/proc"):
            target = os.path.join(tree, source.lstrip("/"))
            os.makedirs(target, exist_ok=True)
            mounter.mount(source, target, permissions=MountPermissions.READ_ONLY)
        # Execute bootupctl in chroot while the mounts are still active
        cmd = ['chroot', tree, '/usr/bin/bootupctl', 'backend', 'generate-update-metadata', '/']
        subprocess.run(cmd, check=True)
    return 0
```
**External Tools Used**:
- `chroot`: Filesystem isolation
- `bootupctl`: bootupd control utility
- `mount`: Directory mounting
#### **QEMU Assembler** (`assemblers/org.osbuild.qemu`)
**Purpose**: Create bootable disk images
**Implementation**: Comprehensive disk image creation with bootloader support
```python
def main(tree, options):
    # Outline only; each comment stands for one step of the assembler
    # Create image file
    # Partition using sfdisk
    # Format filesystems
    # Copy tree contents
    # Install bootloader
    # Convert to requested format
    return 0
```
**External Tools Used**:
- `truncate`: File size management
- `sfdisk`: Partition table creation
- `mkfs.ext4`, `mkfs.xfs`: Filesystem formatting
- `mount`, `umount`: Partition mounting
- `grub2-mkimage`: GRUB image creation
- `qemu-img`: Image format conversion
## Assembler System
### Assembler Types
#### **Image Assemblers**
- `org.osbuild.qemu`: Bootable disk images (raw, qcow2, vmdk, etc.)
- `org.osbuild.rawfs`: Raw filesystem images
- `org.osbuild.tar`: Compressed archives
- `org.osbuild.oci-archive`: OCI container images
#### **Specialized Assemblers**
- `org.osbuild.ostree.commit`: OSTree repository commits
- `org.osbuild.error`: Error reporting
- `org.osbuild.noop`: No-operation assembler
### Assembler Implementation
```python
class QemuAssembler:
    def __init__(self, options):
        self.options = options
        self.format = options["format"]
        self.filename = options["filename"]
        self.size = options["size"]
        self.bootloader = options.get("bootloader", {})

    def assemble(self, tree):
        # Outline only; each comment stands for one step
        # Create image file
        # Set up partitions
        # Copy filesystem contents
        # Install bootloader
        # Convert to target format
        result = self.filename  # placeholder: name of the finished image
        return result
```
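The manifests in this document give the image size as a string such as `"10G"`. A minimal sketch of turning that string into a byte count follows. The helper `parse_size` is hypothetical, and treating the suffixes as binary multiples (as `truncate` does) is an assumption.

```python
import re

# Binary multiples, matching how `truncate --size` interprets K, M, G, T
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as "10G" or "512M" into bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMGT]?)", text.strip().upper())
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


print(parse_size("10G"))
```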
## External Tools and Dependencies
### Core System Tools
#### **Filesystem Management**
- `parted`: Partition table management
- `sfdisk`: Scriptable partitioning
- `mkfs.ext4`, `mkfs.xfs`, `mkfs.fat`: Filesystem creation
- `mount`, `umount`: Filesystem mounting
- `losetup`: Loop device management
- `truncate`: File size manipulation
#### **Package Management**
- `rpm`: RPM package management
- `yum`, `dnf`: Package manager frontends
- `apt`, `apt-get`: Debian package management
- `pacman`: Arch Linux package management
#### **Bootloader Tools**
- `grub2-install`: GRUB2 installation
- `grub2-mkimage`: GRUB2 image creation
- `bootupctl`: bootupd control utility
- `zipl`: S390x bootloader
#### **Image and Archive Tools**
- `qemu-img`: Image format conversion
- `tar`: Archive creation and extraction
- `gzip`, `bzip2`, `xz`: Compression
- `skopeo`: Container image operations
#### **System Tools**
- `bubblewrap`: Process isolation
- `systemd-nspawn`: Container management
- `chroot`: Filesystem isolation
- `curl`: Network file transfer
### Build System Dependencies
#### **Python Dependencies** (`requirements.txt`)
```
jsonschema
```
#### **System Dependencies**
- `python >= 3.6`: Python runtime
- `bubblewrap >= 0.4.0`: Process isolation
- `bash >= 5.0`: Shell execution
- `coreutils >= 8.31`: Core utilities
- `curl >= 7.68`: Network operations
- `qemu-img >= 4.2.0`: Image manipulation
- `rpm >= 4.15`: RPM package management
- `tar >= 1.32`: Archive operations
- `util-linux >= 235`: System utilities
- `skopeo`: Container operations
- `python3-librepo`: Repository access
### External Tool Integration Points
#### **Command Execution**
```python
import subprocess


def run_command(cmd, cwd=None, env=None):
    """Execute external command with proper environment"""
    # check=True raises CalledProcessError on a non-zero exit status
    result = subprocess.run(
        cmd,
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
        check=True
    )
    return result
```
#### **Chroot Execution**
```python
def chroot_execute(tree, cmd):
    """Execute command in chroot environment"""
    chroot_cmd = ['chroot', tree] + cmd
    return run_command(chroot_cmd)
```
#### **Mount Management**
```python
class MountGuard:
    """Context manager for mount operations"""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()

    def mount(self, source, target, permissions=None):
        # Mount with proper permissions
        pass

    def cleanup(self):
        # Unmount everything mounted through this guard
        pass
```
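One way to fill in such a guard is to remember each target and unmount in reverse order on exit. The sketch below is an assumption about how this could look, not osbuild's implementation. The class name `TrackingMountGuard` and the injectable `run` parameter are hypothetical, and the latter exists only so the example can be exercised without root privileges.

```python
import subprocess


class TrackingMountGuard:
    """Bind-mount on request, unmount everything in reverse order on exit."""

    def __init__(self, run=subprocess.run):
        self._run = run      # injectable so the class can be tried without root
        self._targets = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()

    def mount(self, source, target, read_only=False):
        options = "bind,ro" if read_only else "bind"
        self._run(["mount", "-o", options, source, target], check=True)
        self._targets.append(target)

    def cleanup(self):
        # Unmount in reverse so nested mounts are released before their parents
        while self._targets:
            self._run(["umount", "--lazy", self._targets.pop()], check=True)


# Dry run: record the commands instead of executing them
calls = []
with TrackingMountGuard(run=lambda cmd, check: calls.append(cmd)) as guard:
    guard.mount("/dev", "/tmp/tree/dev", read_only=True)
    guard.mount("/proc", "/tmp/tree/proc", read_only=True)
print(calls)
```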
## Manifest Processing
### JSON Schema Validation
osbuild uses JSON Schema for manifest validation:
```python
import jsonschema


def validate_manifest(manifest, schema):
    """Validate manifest against JSON schema"""
    validator = jsonschema.Draft7Validator(schema)
    # Collect every error instead of stopping at the first one
    errors = list(validator.iter_errors(manifest))
    return ValidationResult(errors)
```
### Manifest Structure
```json
{
"version": "2",
"pipelines": [
{
"name": "build",
"runner": "org.osbuild.linux",
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": {
"suite": "bookworm",
"mirror": "https://deb.debian.org/debian",
"variant": "minbase"
}
}
]
}
],
"assembler": {
"name": "org.osbuild.qemu",
"options": {
"format": "qcow2",
"filename": "debian.qcow2",
"size": "10G",
"ptuuid": "12345678-1234-1234-1234-123456789012"
}
}
}
```
### Template Processing
osbuild supports manifest templating through external tools:
```bash
# Example with jq for dynamic manifest generation
jq --arg size "$IMAGE_SIZE" --arg format "$IMAGE_FORMAT" '
.assembler.options.size = $size |
.assembler.options.format = $format
' template.json > manifest.json
```
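The same substitution can be done in Python when `jq` is not available. This is a minimal sketch; the function name `render_manifest` is hypothetical, and it sets the same two assembler options as the `jq` filter above.

```python
import copy


def render_manifest(template, size, image_format):
    """Return a copy of the template with assembler size and format set."""
    manifest = copy.deepcopy(template)  # leave the caller's template untouched
    options = manifest.setdefault("assembler", {}).setdefault("options", {})
    options["size"] = size
    options["format"] = image_format
    return manifest


template = {"version": "2", "assembler": {"name": "org.osbuild.qemu", "options": {}}}
manifest = render_manifest(template, "10G", "qcow2")
print(manifest["assembler"]["options"])
```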
## Build Execution Engine
### Complete Build Execution System
#### **BuildRoot Architecture**
```python
class BuildRoot(contextlib.AbstractContextManager):
    """Build Root

    This class implements a context-manager that maintains a root file-system
    for contained environments. When entering the context, the required
    file-system setup is performed, and it is automatically torn down when
    exiting.
    """

    def __init__(self, root, runner, libdir, var, *, rundir="/run/osbuild"):
        self._exitstack = None
        self._rootdir = root
        self._rundir = rundir
        self._vardir = var
        self._libdir = libdir
        self._runner = runner
        self._apis = []
        self.dev = None
        self.var = None
        self.proc = None
        self.tmp = None
        self.mount_boot = True
        self.caps = None
```
#### **BuildRoot Setup Process**
```python
def __enter__(self):
    self._exitstack = contextlib.ExitStack()
    with self._exitstack:
        # Create temporary directories
        dev = tempfile.TemporaryDirectory(prefix="osbuild-dev-", dir=self._rundir)
        self.dev = self._exitstack.enter_context(dev)
        tmp = tempfile.TemporaryDirectory(prefix="osbuild-tmp-", dir=self._vardir)
        self.tmp = self._exitstack.enter_context(tmp)
        # Mount tmpfs for /dev first; mounting it after creating the
        # device nodes would hide them
        subprocess.run(["mount", "-t", "tmpfs", "-o", "nosuid", "none", self.dev], check=True)
        self._exitstack.callback(lambda: subprocess.run(["umount", "--lazy", self.dev], check=True))
        # Set up device nodes inside the tmpfs
        self._mknod(self.dev, "full", 0o666, 1, 7)
        self._mknod(self.dev, "null", 0o666, 1, 3)
        self._mknod(self.dev, "random", 0o666, 1, 8)
        self._mknod(self.dev, "urandom", 0o666, 1, 9)
        self._mknod(self.dev, "tty", 0o666, 5, 0)
        self._mknod(self.dev, "zero", 0o666, 1, 5)
        # Prepare all registered API endpoints
        for api in self._apis:
            self._exitstack.enter_context(api)
        # Setup succeeded: keep the cleanup callbacks alive until __exit__
        self._exitstack = self._exitstack.pop_all()
    return self
```
#### **Stage Execution Process**
```python
def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # 1. Prepare stage environment
        stage.setup(context)
        # 2. Set up buildroot
        with buildroot.BuildRoot(build_tree, runner.path, libdir, store.tmp) as build_root:
            # 3. Configure capabilities
            build_root.caps = DEFAULT_CAPABILITIES | stage.info.caps
            # 4. Set up mounts and devices
            for name, mount in stage.mounts.items():
                mount_data = mount_manager.mount(mount)
                mounts[name] = mount_data
            # 5. Prepare arguments
            args = {
                "tree": "/run/osbuild/tree",
                "paths": {
                    "devices": devices_mapped,
                    "inputs": inputs_mapped,
                    "mounts": mounts_mounted,
                },
                "devices": devices,
                "inputs": inputs,
                "mounts": mounts,
            }
            # 6. Execute stage
            result = build_root.run([f"/run/osbuild/bin/{stage.name}"],
                                    monitor,
                                    timeout=timeout,
                                    binds=binds,
                                    readonly_binds=ro_binds,
                                    extra_env=extra_env,
                                    debug_shell=debug_shell)
        # 7. Process output
        context.store_object(stage.id, result)
        return result
    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise
```
#### **Command Execution in BuildRoot**
```python
def run(self, argv, monitor, timeout=None, binds=None, readonly_binds=None, extra_env=None, debug_shell=False):
    """Runs a command in the buildroot.

    Takes the command and arguments, as well as bind mounts to mirror
    in the build-root for this command.
    """
    if not self._exitstack:
        raise RuntimeError("No active context")
    # Import directories from the caller-provided root
    imports = ["usr"]
    if self.mount_boot:
        imports.append("boot")
    # Build bubblewrap command; --proc and --tmpfs take the path
    # inside the sandbox, --dev-bind and --bind take SRC DEST
    bwrap_cmd = [
        "bwrap",
        "--chdir", "/",
        "--proc", "/proc",
        "--dev-bind", self.dev, "/dev",
        "--bind", self.tmp, "/var",  # throw-away data lives in the temporary directory
        "--tmpfs", "/tmp",
    ]
    # Mount the imported directories read-only from the build root
    for name in imports:
        bwrap_cmd.extend(["--ro-bind", os.path.join(self._rootdir, name), "/" + name])
    # Add bind mounts
    for bind in binds or []:
        bwrap_cmd.extend(["--bind"] + bind.split(":", 1))
    # Add readonly bind mounts
    for bind in readonly_binds or []:
        bwrap_cmd.extend(["--ro-bind"] + bind.split(":", 1))
    # Add environment variables
    if extra_env:
        for key, value in extra_env.items():
            bwrap_cmd.extend(["--setenv", key, value])
    # Add command
    bwrap_cmd.extend(argv)
    # Execute with bubblewrap
    result = subprocess.run(bwrap_cmd,
                            capture_output=True,
                            text=True,
                            timeout=timeout)
    return CompletedBuild(result, result.stdout + result.stderr)
```
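Bind mounts are passed around as `source:dest` strings. The sketch below shows one way to turn them into bubblewrap arguments while rejecting malformed entries, so that `bwrap` never receives a flag with too few operands. The helper name `bind_args` is hypothetical.

```python
def bind_args(binds=None, readonly_binds=None):
    """Translate "source:dest" strings into bwrap --bind / --ro-bind arguments."""
    args = []
    for flag, specs in (("--bind", binds), ("--ro-bind", readonly_binds)):
        for spec in specs or []:
            # Split on the first colon only; both halves must be non-empty
            source, sep, dest = spec.partition(":")
            if not sep or not source or not dest:
                raise ValueError(f"expected 'source:dest', got {spec!r}")
            args += [flag, source, dest]
    return args


print(bind_args(["/srv/tree:/run/osbuild/tree"], ["/usr/lib/osbuild:/run/osbuild/lib"]))
```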
#### **Process Isolation and Security**
```python
DEFAULT_CAPABILITIES = {
"CAP_AUDIT_WRITE",
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_DAC_READ_SEARCH",
"CAP_FOWNER",
"CAP_FSETID",
"CAP_IPC_LOCK",
"CAP_LINUX_IMMUTABLE",
"CAP_MAC_OVERRIDE",
"CAP_MKNOD",
"CAP_NET_BIND_SERVICE",
"CAP_SETFCAP",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETUID",
"CAP_SYS_ADMIN",
"CAP_SYS_CHROOT",
"CAP_SYS_NICE",
"CAP_SYS_RESOURCE"
}
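def drop_capabilities(caps_to_keep):
    """Drop all capabilities except those specified from the bounding set"""
    import ctypes
    from ctypes import c_int
    # The capability helpers live in libcap, not in libc
    libcap = ctypes.CDLL("libcap.so.2", use_errno=True)
    # Translate names such as "CAP_CHOWN" into capability numbers
    keep = set()
    for name in caps_to_keep:
        value = c_int()
        if libcap.cap_from_name(name.encode("utf-8"), ctypes.byref(value)) != 0:
            raise ValueError(f"unknown capability: {name}")
        keep.add(value.value)
    # cap_get_bound() returns -1 once the number is past the last known capability
    cap = 0
    while libcap.cap_get_bound(cap) >= 0:
        if cap not in keep and libcap.cap_drop_bound(cap) != 0:
            raise OSError(ctypes.get_errno(), f"cannot drop capability {cap}")
        cap += 1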
```
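When the stage runs under bubblewrap, an alternative to dropping capabilities in-process is to let `bwrap` do it through its `--cap-drop` option. The sketch below computes those arguments from two sets of capability names. The helper name `cap_drop_args` is hypothetical, and how the current bounding set is obtained is left to the caller.

```python
def cap_drop_args(bounding_set, keep):
    """Return bwrap arguments that drop every capability not in `keep`."""
    args = []
    # Sort for a stable command line, which keeps logs comparable across runs
    for cap in sorted(set(bounding_set) - set(keep)):
        args += ["--cap-drop", cap]
    return args


print(cap_drop_args({"CAP_CHOWN", "CAP_SYS_BOOT", "CAP_NET_ADMIN"}, {"CAP_CHOWN"}))
```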
### Build Process Flow
1. **Manifest Loading**: Parse and validate JSON manifest
2. **Pipeline Construction**: Build stage dependency graph
3. **Source Resolution**: Download and prepare input sources
4. **Stage Execution**: Run stages in dependency order
5. **Assembly**: Create final artifacts from stage outputs
6. **Output**: Export requested objects
### Build Environment
```python
class BuildRoot:
    def __init__(self, path, runner):
        self.path = path
        self.runner = runner
        self.mounts = []
        self.devices = []

    def setup(self):
        """Set up build environment"""
        # Create build directory
        # Set up isolation
        # Mount required directories

    def cleanup(self):
        """Clean up build environment"""
        # Unmount directories
        # Remove temporary files
```
### Stage Execution
```python
def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # Prepare stage environment
        stage.setup(context)
        # Execute stage
        result = stage.run(context)
        # Process output
        context.store_object(stage.id, result)
        return result
    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise
```
## Object Store and Caching
### Object Store Architecture
```python
import json
import os


class ObjectStore:
    def __init__(self, path):
        self.path = path
        self.objects = {}  # in-memory cache of objects already loaded

    def store_object(self, obj_id, obj):
        """Store object in object store"""
        obj_path = os.path.join(self.path, obj_id)
        os.makedirs(obj_path, exist_ok=True)
        # Store object metadata and data
        with open(os.path.join(obj_path, "meta.json"), "w") as f:
            json.dump(obj.meta, f)
        obj.export(obj_path)

    def get_object(self, obj_id):
        """Retrieve object from store"""
        if obj_id in self.objects:
            return self.objects[obj_id]
        obj_path = os.path.join(self.path, obj_id)
        if os.path.exists(obj_path):
            obj = self.load_object(obj_path)
            self.objects[obj_id] = obj
            return obj
        return None
```
### Caching Strategy
1. **Object-level caching**: Store stage outputs by ID
2. **Dependency tracking**: Reuse objects when dependencies haven't changed
3. **Incremental builds**: Skip stages with unchanged inputs
4. **Checkpoint support**: Save intermediate results for debugging
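Reusing objects when nothing has changed requires an ID that depends on everything that influences a stage's output. The sketch below derives such an ID by hashing the stage name, its options, and the ID of the preceding stage. This is an illustration of the idea under those assumptions, not osbuild's actual ID scheme, and the function name `stage_id` is hypothetical.

```python
import hashlib
import json


def stage_id(name, options, base_id=None):
    """Derive a deterministic ID from a stage's name, options, and predecessor."""
    # sort_keys makes the serialization independent of dict insertion order
    payload = json.dumps(
        {"name": name, "options": options, "base": base_id},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = stage_id("org.osbuild.debian.debootstrap", {"suite": "bookworm"})
second = stage_id("org.osbuild.apt", {"packages": ["sudo"]}, base_id=first)
print(first, second)
```

Because each ID includes its predecessor's ID, changing an early stage changes the IDs of all later stages, so their cached outputs are no longer matched.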
### Cache Management
```python
def manage_cache(store, max_size=None):
    """Manage object store cache size"""
    if max_size is None:
        return
    # Calculate current cache size
    current_size = calculate_cache_size(store.path)
    if current_size > max_size:
        # Remove least recently used objects
        remove_lru_objects(store, current_size - max_size)
```
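The two helpers used for cache management are not defined in this document. A minimal sketch of what they could look like follows, assuming each object is a top-level directory in the store and using the directory's modification time as a stand-in for "last used". The names `directory_size` and `evict_oldest` are hypothetical.

```python
import os
import shutil


def directory_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def evict_oldest(store_path, max_size):
    """Delete object directories, oldest mtime first, until the store fits max_size."""
    objects = sorted(
        (entry for entry in os.scandir(store_path) if entry.is_dir()),
        key=lambda entry: entry.stat().st_mtime,
    )
    removed = []
    current = directory_size(store_path)
    for entry in objects:
        if current <= max_size:
            break
        current -= directory_size(entry.path)
        shutil.rmtree(entry.path)
        removed.append(entry.name)
    return removed
```

Access times would be a closer match for least-recently-used eviction, but they are unreliable on filesystems mounted with `noatime`, which is why this sketch falls back to modification times.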
## Security and Isolation
### Process Isolation
osbuild uses multiple isolation mechanisms:
#### **Bubblewrap**
```python
def run_isolated(cmd, cwd=None, env=None):
    """Run command with bubblewrap isolation"""
    bwrap_cmd = [
        "bwrap",
        "--dev-bind", "/", "/",
        "--proc", "/proc",
        "--dev", "/dev",
        "--chdir", cwd or "/"
    ] + cmd
    return run_command(bwrap_cmd, env=env)
```
#### **Systemd-nspawn**
```python
def run_containerized(cmd, tree, env=None):
    """Run command in systemd-nspawn container"""
    # --bind takes a single SOURCE[:DEST] argument; a separate second
    # path would be parsed as the command to run
    nspawn_cmd = [
        "systemd-nspawn",
        "--directory", tree,
        "--bind=/dev:/dev",
        "--bind=/proc:/proc",
        "--bind=/sys:/sys"
    ] + cmd
    return run_command(nspawn_cmd, env=env)
```
### Capability Management
```python
DEFAULT_CAPABILITIES = {
"CAP_AUDIT_WRITE",
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_DAC_READ_SEARCH",
"CAP_FOWNER",
"CAP_FSETID",
"CAP_IPC_LOCK",
"CAP_LINUX_IMMUTABLE",
"CAP_MAC_OVERRIDE",
"CAP_MKNOD",
"CAP_NET_BIND_SERVICE",
"CAP_SETFCAP",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETUID",
"CAP_SYS_ADMIN",
"CAP_SYS_CHROOT",
"CAP_SYS_NICE",
"CAP_SYS_RESOURCE"
}
```
### Security Considerations
1. **Process isolation**: Prevent host system contamination
2. **Capability dropping**: Limit process privileges
3. **Resource limits**: Prevent resource exhaustion
4. **Input validation**: Validate all external inputs
5. **Output sanitization**: Ensure safe output generation
## Integration Points
### CLI Interface
#### **Main Entry Point** (`main_cli.py`)
```python
def osbuild_cli():
    """Main CLI entry point"""
    args = parse_arguments(sys.argv[1:])
    # Load manifest
    manifest = parse_manifest(args.manifest_path)
    # Validate manifest
    result = validate_manifest(manifest)
    if not result:
        show_validation(result, args.manifest_path)
        return 1
    # Execute build
    store = ObjectStore(args.cache)
    result = build_manifest(manifest, store)
    # Export results
    if args.export:
        for export_id in args.export:
            export(export_id, args.output_directory, store, manifest)
    return 0
```
#### **Command Line Options**
```bash
osbuild [OPTIONS] MANIFEST

Options:
  --cache DIR               Cache directory (default: .osbuild)
  --libdir DIR              Library directory (default: /usr/lib/osbuild)
  --cache-max-size SIZE     Maximum cache size
  --checkpoint ID           Stage to checkpoint
  --export ID               Object to export
  --output-directory DIR    Output directory
  --monitor NAME            Monitor to use
  --stage-timeout SECONDS   Stage timeout
```
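The options listed above map naturally onto `argparse`. The following sketch is an approximation for illustration, not osbuild's own parser. Allowing `--export` and `--checkpoint` to be repeated is an assumption, suggested by the loop over `args.export` in the entry point.

```python
import argparse


def build_parser():
    """Build a parser for the options listed above."""
    parser = argparse.ArgumentParser(prog="osbuild")
    parser.add_argument("manifest_path", metavar="MANIFEST")
    parser.add_argument("--cache", metavar="DIR", default=".osbuild")
    parser.add_argument("--libdir", metavar="DIR", default="/usr/lib/osbuild")
    parser.add_argument("--cache-max-size", metavar="SIZE")
    # Assumption: both options may be given several times
    parser.add_argument("--checkpoint", metavar="ID", action="append", default=[])
    parser.add_argument("--export", metavar="ID", action="append", default=[])
    parser.add_argument("--output-directory", metavar="DIR")
    parser.add_argument("--monitor", metavar="NAME")
    parser.add_argument("--stage-timeout", metavar="SECONDS", type=int)
    return parser


args = build_parser().parse_args(
    ["--export", "image", "--output-directory", "out", "manifest.json"]
)
print(args.manifest_path, args.export, args.cache)
```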
### API Interface
#### **Python API** (`api.py`)
```python
def build_manifest(manifest, store, libdir=None):
    """Build manifest using object store"""
    # Load stages and assemblers
    # Execute pipeline
    # Return build result
```
#### **REST API** (Future)
```python
@app.route('/api/v1/build', methods=['POST'])
def build_manifest_api():
    """REST API for manifest building"""
    manifest = request.json
    result = build_manifest(manifest, store)
    return jsonify(result)
```
### External Tool Integration
#### **Container Integration**
```bash
# Docker
docker run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json
# Podman
podman run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json
```
#### **CI/CD Integration**
```yaml
# GitHub Actions example
- name: Build OS Image
  run: |
    osbuild \
      --cache .osbuild \
      --output-directory outputs \
      manifest.json
```
#### **Monitoring Integration**
```python
class Monitor:
    def __init__(self, name):
        self.name = name

    def stage_started(self, stage):
        """Called when stage starts"""
        pass

    def stage_completed(self, stage, result):
        """Called when stage completes"""
        pass

    def stage_failed(self, stage, error):
        """Called when stage fails"""
        pass
```
## Advanced Features
### Multi-Architecture Support
osbuild supports multiple architectures through stage variants:
```json
{
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": {
"suite": "bookworm",
"arch": "arm64"
}
}
]
}
```
### Parallel Execution
Stages can execute in parallel when dependencies allow:
```python
def execute_parallel(stages, context):
    """Execute independent stages in parallel"""
    import concurrent.futures

    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {
            executor.submit(execute_stage, stage, context): stage
            for stage in stages
        }
        for future in concurrent.futures.as_completed(futures):
            stage = futures[future]
            try:
                result = future.result()
                context.store_object(stage.id, result)
            except Exception as e:
                context.mark_failed(stage.id, str(e))
```
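Deciding which stages are independent is the part that the executor above leaves open. A minimal sketch, assuming the dependencies are available as a mapping from stage ID to the IDs it depends on, groups stages into batches in which every member depends only on earlier batches. The function name `parallel_batches` and the input format are assumptions for illustration.

```python
def parallel_batches(dependencies):
    """Group stage IDs into batches; each batch depends only on earlier ones."""
    remaining = {stage: set(deps) for stage, deps in dependencies.items()}
    batches = []
    done = set()
    while remaining:
        # A stage is ready once all of its dependencies have completed
        ready = sorted(s for s, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        batches.append(ready)
        done.update(ready)
        for stage in ready:
            del remaining[stage]
    return batches


print(parallel_batches({"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}))
```

Each returned batch could then be handed to a function like `execute_parallel`, with the batches themselves processed one after another.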
### Checkpoint and Resume
```python
import os
import time


def checkpoint_stage(stage, context):
    """Checkpoint stage execution"""
    if stage.checkpoint:
        # Save stage state
        checkpoint_path = os.path.join(context.store.path, f"{stage.id}.checkpoint")
        stage.save_checkpoint(checkpoint_path)
        # Store checkpoint metadata
        context.store.store_object(f"{stage.id}.checkpoint", {
            "type": "checkpoint",
            "stage_id": stage.id,
            "timestamp": time.time()
        })
```
### Remote Execution
```python
class RemoteExecutor:
    def __init__(self, host, user=None, key_file=None):
        self.host = host
        self.user = user
        self.key_file = key_file

    def execute_stage(self, stage, context):
        """Execute stage on remote host"""
        # Copy stage to remote host
        # Execute remotely
        # Retrieve results
        pass
```
## Performance Characteristics
### Build Time Optimization
1. **Parallel execution**: Independent stages run concurrently
2. **Object caching**: Reuse unchanged stage outputs
3. **Incremental builds**: Skip stages with unchanged inputs
4. **Resource allocation**: Optimize memory and CPU usage
### Resource Usage
```python
def optimize_resources(stages, available_memory, available_cpus):
    """Optimize resource allocation for stages"""
    # Calculate stage resource requirements
    # Allocate resources optimally
    # Prevent resource contention
```
### Benchmarking
```python
import statistics
import time


def benchmark_build(manifest, store, iterations=5):
    """Benchmark build performance"""
    times = []
    for _ in range(iterations):
        # perf_counter is monotonic, unlike time.time()
        start_time = time.perf_counter()
        build_manifest(manifest, store)
        times.append(time.perf_counter() - start_time)
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        # stdev needs at least two samples
        "std": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min": min(times),
        "max": max(times)
    }
```
## Conclusion
osbuild represents a sophisticated, production-ready build system for operating system artifacts. Its architecture emphasizes:
1. **Reproducibility**: Consistent results through declarative manifests
2. **Extensibility**: Pluggable stages and assemblers
3. **Performance**: Optimized execution and caching
4. **Security**: Process isolation and capability management
5. **Integration**: Easy integration with existing toolchains
### Use Cases
1. **Distribution building**: Creating official distribution images
2. **Custom images**: Building specialized OS images
3. **CI/CD pipelines**: Automated image building
4. **Development**: Testing and development environments
5. **Production deployment**: Creating production-ready images
## Complete Workflow Examples
### Example 1: Basic Debian System Image
#### **Manifest Definition**
```json
{
"version": "2",
"pipelines": [
{
"name": "build",
"runner": "org.osbuild.linux",
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": {
"suite": "bookworm",
"mirror": "https://deb.debian.org/debian",
"variant": "minbase"
}
},
{
"name": "org.osbuild.apt",
"options": {
"packages": ["sudo", "openssh-server", "systemd-sysv"]
}
},
{
"name": "org.osbuild.users",
"options": {
"users": {
"debian": {
"password": "$6$rounds=656000$...",
"shell": "/bin/bash",
"groups": ["sudo"]
}
}
}
}
]
}
],
"assembler": {
"name": "org.osbuild.tar",
"options": {
"filename": "debian-basic.tar.gz",
"compression": "gzip"
}
}
}
```
#### **Complete Execution Flow**
1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and sources
4. **Stage Execution**:
- `debootstrap`: Create base Debian filesystem
- `apt`: Install packages and dependencies
- `users`: Create user accounts and groups
5. **Assembly**: Create compressed tar archive
6. **Output**: Generate `debian-basic.tar.gz`
### Example 2: Bootable QEMU Disk Image
#### **Manifest Definition**
```json
{
"version": "2",
"pipelines": [
{
"name": "build",
"runner": "org.osbuild.linux",
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": {
"suite": "bookworm",
"variant": "minbase"
}
},
{
"name": "org.osbuild.apt",
"options": {
"packages": ["grub2-efi-amd64", "efibootmgr", "linux-image-amd64"]
}
},
{
"name": "org.osbuild.grub2",
"options": {
"root_fs_uuid": "6e4ff95f-f662-45ee-a82a-bdf44a2d0b75",
"uefi": {
"vendor": "debian",
"unified": true
}
}
}
]
}
],
"assembler": {
"name": "org.osbuild.qemu",
"options": {
"format": "qcow2",
"filename": "debian-bootable.qcow2",
"size": "10G",
"ptuuid": "12345678-1234-1234-1234-123456789012",
"partitions": [
{
"name": "esp",
"start": 1048576,
"size": 268435456,
"type": "fat32",
"mountpoint": "/boot/efi"
},
{
"name": "root",
"start": 269484032,
"size": 10466885632,
"type": "ext4",
"mountpoint": "/"
}
]
}
}
}
```
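Partition offsets and sizes given in bytes are easy to get wrong by hand, so it can help to check a layout before building. The sketch below reports partitions that overlap or that extend past the usable end of the image. The function name `check_layout` and the 1 MiB reserve at the end of the image (room for a backup GPT header) are assumptions, not osbuild rules.

```python
MiB = 1024 * 1024


def check_layout(image_size, partitions, reserve=MiB):
    """Return a list of problems found in a partition layout (empty if none)."""
    problems = []
    previous_end = 0
    for part in sorted(partitions, key=lambda p: p["start"]):
        end = part["start"] + part["size"]
        if part["start"] < previous_end:
            problems.append(f"{part['name']} overlaps the previous partition")
        if end > image_size - reserve:
            problems.append(f"{part['name']} ends at {end}, past the usable area")
        previous_end = end
    return problems


# Small illustrative layout on a 1 GiB image
layout = [
    {"name": "esp", "start": 1 * MiB, "size": 256 * MiB},
    {"name": "root", "start": 257 * MiB, "size": 700 * MiB},
]
print(check_layout(1024 * MiB, layout))
```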
#### **Complete Execution Flow**
1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and GRUB components
4. **Stage Execution**:
- `debootstrap`: Create base Debian filesystem
- `apt`: Install GRUB and kernel packages
- `grub2`: Configure GRUB bootloader
5. **Assembly**: Create QCOW2 disk image with partitions
6. **Output**: Generate `debian-bootable.qcow2`
### Example 3: OSTree-Based System
#### **Manifest Definition**
```json
{
"version": "2",
"pipelines": [
{
"name": "build",
"runner": "org.osbuild.linux",
"stages": [
{
"name": "org.osbuild.debian.debootstrap",
"options": {
"suite": "bookworm",
"variant": "minbase"
}
},
{
"name": "org.osbuild.apt",
"options": {
"packages": ["ostree", "systemd", "systemd-sysv"]
}
},
{
"name": "org.osbuild.ostree",
"options": {
"repository": "/var/lib/ostree/repo",
"branch": "debian/bookworm/x86_64/standard"
}
}
]
}
],
"assembler": {
"name": "org.osbuild.ostree.commit",
"options": {
"repository": "debian-ostree",
"branch": "debian/bookworm/x86_64/standard"
}
}
}
```
#### **Complete Execution Flow**
1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and OSTree
4. **Stage Execution**:
- `debootstrap`: Create base Debian filesystem
- `apt`: Install OSTree and systemd packages
- `ostree`: Configure OSTree repository
5. **Assembly**: Create OSTree commit
6. **Output**: Generate OSTree repository with commit
## Conclusion
osbuild provides a solid foundation for building operating system images with a focus on reproducibility, performance, and extensibility. Its stage-based architecture makes it easy to customize and extend while maintaining consistency and reliability.
### Key Strengths
- **Structured approach**: Clear separation of concerns with stages and assemblers
- **Extensible architecture**: Easy to add new stages and assemblers
- **Performance optimization**: Efficient caching and parallel execution
- **Security focus**: Built-in isolation and capability management
- **Distribution support**: Works across multiple Linux distributions
- **Declarative manifests**: JSON-based configuration with schema validation
- **Process isolation**: Bubblewrap and systemd-nspawn integration
- **Object caching**: Intelligent caching of stage outputs
### Areas for Enhancement
- **Bootloader integration**: Limited built-in bootloader support
- **Package management**: Focus on RPM-based systems
- **Image formats**: Limited output format support
- **Validation**: Basic manifest validation capabilities
- **Template support**: No built-in templating system
- **Cross-architecture**: Limited architecture support
### Complete Process Summary
osbuild implements a **complete end-to-end image building pipeline** that:
1. **Processes Manifests**: JSON with schema validation
2. **Manages Stages**: Atomic, composable building blocks
3. **Executes Builds**: Isolated execution with security controls
4. **Handles Objects**: Intelligent caching and storage
5. **Manages Devices**: Loop devices and partition management
6. **Provides Assembly**: Multiple output format support
7. **Ensures Security**: Process isolation and capability dropping
8. **Generates Artifacts**: Images, archives, and repositories
The system's architecture emphasizes **reproducibility**, **security**, and **extensibility** while maintaining **performance** through intelligent caching and isolated execution environments.