deb-osbuild/docs/osbuild.md
robojerk 544eb61951 docs: Reorganize documentation into proper docs/ directory structure
- Moved all documentation files to docs/ directory for better organization
- Maintained all existing documentation content
- Improved project structure for better maintainability
- Documentation now follows standard open source project layout
2025-08-12 00:27:41 -07:00

osbuild Comprehensive Top-to-Bottom Analysis

Overview

osbuild is a pipeline-based build system for operating system artifacts. It defines a universal pipeline description and an execution engine, and it produces artifacts such as operating system images through a structured, stage-based approach that emphasizes reproducibility and extensibility. This document analyzes the osbuild process from top to bottom, from manifest parsing to final artifact generation.

Table of Contents

  1. Complete Process Flow
  2. Core Architecture
  3. Pipeline System
  4. Stage System
  5. Assembler System
  6. External Tools and Dependencies
  7. Manifest Processing
  8. Build Execution Engine
  9. Object Store and Caching
  10. Security and Isolation
  11. Integration Points
  12. Advanced Features
  13. Performance Characteristics
  14. Conclusion
  15. Complete Workflow Examples

Complete Process Flow

End-to-End Workflow

1. Manifest Loading → 2. Schema Validation → 3. Pipeline Construction → 4. Stage Execution → 5. Object Storage → 6. Assembly → 7. Artifact Generation → 8. Cleanup

Detailed Process Steps

Phase 1: Manifest Processing

  1. File Loading: Read JSON manifest file from disk or stdin
  2. Schema Validation: Validate against JSON schemas for stages and assemblers
  3. Pipeline Construction: Build stage dependency graph and execution order
  4. Source Resolution: Download and prepare input sources
  5. Configuration: Set up build environment and options
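
Step 3 above amounts to a topological sort of the stages. The following is a minimal sketch of that idea, assuming a hypothetical mapping from each stage name to the stage names it depends on; the function name and the input shape are illustrative and not taken from osbuild itself.

```python
from collections import deque


def execution_order(dependencies):
    """Return stage names so that every stage follows its dependencies.

    `dependencies` is a hypothetical mapping {stage: [stages it needs]};
    every dependency must also appear as a key.
    """
    # Count unmet dependencies and record which stages wait on which.
    pending = {name: len(set(deps)) for name, deps in dependencies.items()}
    dependents = {name: [] for name in dependencies}
    for name, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(name)

    # Start with the stages that depend on nothing.
    ready = deque(sorted(n for n, count in pending.items() if count == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for waiter in sorted(dependents[current]):
            pending[waiter] -= 1
            if pending[waiter] == 0:
                ready.append(waiter)

    # Any stage left unscheduled is part of a cycle.
    if len(order) != len(dependencies):
        raise ValueError("dependency cycle detected")
    return order


example = {
    "debootstrap": [],
    "apt": ["debootstrap"],
    "users": ["apt"],
}
print(execution_order(example))
```

Sorting the ready stages keeps the order deterministic, which matters when the same manifest is expected to produce the same build every time.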

Phase 2: Build Environment Setup

  1. BuildRoot Creation: Initialize isolated build environment
  2. Resource Allocation: Set up temporary directories and mounts
  3. Capability Management: Configure process capabilities and security
  4. API Registration: Set up communication endpoints
  5. Device Management: Configure loop devices and partitions

Phase 3: Stage Execution

  1. Dependency Resolution: Execute stages in dependency order
  2. Input Processing: Map input objects to stage requirements
  3. Environment Setup: Mount filesystems and configure devices
  4. Stage Execution: Run stage scripts in isolated environment
  5. Output Collection: Store stage results in object store

Phase 4: Assembly and Output

  1. Object Collection: Gather all stage outputs
  2. Assembly Execution: Run assembler to create final artifact
  3. Format Conversion: Convert to requested output format
  4. Artifact Generation: Create final image or archive
  5. Cleanup: Remove temporary files and mounts

Process Architecture Overview

osbuild CLI → Manifest Parser → Pipeline Builder → Stage Executor → Object Store → Assembler → Final Artifact
     ↓              ↓                ↓               ↓            ↓          ↓         ↓
  Main Entry   JSON Schema    Dependency Graph   Stage Runner   Cache      Output Gen  Image/Archive

Core Architecture

Design Philosophy

osbuild follows a declarative pipeline architecture where:

  • Manifests define the complete build process as JSON
  • Stages are atomic, composable building blocks
  • Assemblers create final artifacts from stage outputs
  • Pipelines orchestrate stage execution and data flow

Key Components

osbuild CLI → Manifest Parser → Pipeline Executor → Stage Runner → Assembler → Artifacts
     ↓              ↓                ↓               ↓            ↓          ↓
  Main Entry   JSON Schema    Pipeline Builder   Stage Exec   Output Gen   Final Files

Architecture Principles

  1. Stages are never broken, only deprecated - Same manifest always produces same output
  2. Explicit over implicit - No reliance on tree state
  3. Pipeline independence - Tree is empty at beginning of each pipeline
  4. Machine-generated manifests - No convenience functions for manual creation
  5. Confined build environment - Security against accidental misuse
  6. Distribution compatibility - Python 3.6+ support

Pipeline System

Pipeline Structure

A pipeline is a directed acyclic graph (DAG) of stages:

{
  "pipeline": {
    "build": {
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": { ... }
        },
        {
          "name": "org.osbuild.apt",
          "options": { ... }
        }
      ]
    },
    "assembler": {
      "name": "org.osbuild.qemu",
      "options": { ... }
    }
  }
}

Pipeline Execution Model

class Pipeline:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info
        self.sources = source_options
        self.build = build
        self.base = base
        self.options = options
        self.source_epoch = source_epoch
        self.checkpoint = False
        self.inputs = {}
        self.devices = {}
        self.mounts = {}

Pipeline Lifecycle

  1. Initialization: Load manifest and validate schema
  2. Preparation: Set up build environment and dependencies
  3. Execution: Run stages in dependency order
  4. Assembly: Create final artifacts from stage outputs
  5. Cleanup: Remove temporary files and resources

Stage System

Stage Architecture

Stages are the core building blocks of osbuild:

class Stage:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info          # Stage metadata
        self.sources = source_options  # Input sources
        self.build = build        # Build configuration
        self.base = base          # Base tree
        self.options = options    # Stage-specific options
        self.source_epoch = source_epoch  # Source timestamp
        self.checkpoint = False   # Checkpoint flag
        self.inputs = {}          # Input objects
        self.devices = {}         # Device configurations
        self.mounts = {}          # Mount configurations

Stage Types

System Construction Stages

  • org.osbuild.debian.debootstrap: Debian base system creation
  • org.osbuild.rpm: RPM package installation
  • org.osbuild.ostree: OSTree repository management
  • org.osbuild.apt: Debian package management

Filesystem Stages

  • org.osbuild.overlay: File/directory copying
  • org.osbuild.mkdir: Directory creation
  • org.osbuild.copy: File copying operations
  • org.osbuild.symlink: Symbolic link creation

Configuration Stages

  • org.osbuild.users: User account management
  • org.osbuild.fstab: Filesystem table configuration
  • org.osbuild.locale: Locale configuration
  • org.osbuild.timezone: Timezone setup

Bootloader Stages

  • org.osbuild.grub2: GRUB2 bootloader configuration
  • org.osbuild.bootupd: Modern bootloader management
  • org.osbuild.zipl: S390x bootloader

Image Creation Stages

  • org.osbuild.image: Raw image creation
  • org.osbuild.qemu: QEMU image assembly
  • org.osbuild.tar: Archive creation
  • org.osbuild.oci-archive: OCI container images

Stage Implementation Pattern

Each stage follows a consistent pattern:

#!/usr/bin/python3
import os
import sys
import osbuild.api

def main(tree, options):
    # Stage-specific logic
    # Process options
    # Manipulate filesystem tree
    return 0

if __name__ == '__main__':
    args = osbuild.api.arguments()
    ret = main(args["tree"], args["options"])
    sys.exit(ret)
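
To make the pattern concrete, here is a minimal sketch of a stage body that creates directories inside the tree. The `paths` option is a hypothetical name chosen for this illustration, and the snippet calls `main()` directly on a temporary directory instead of going through `osbuild.api.arguments()`.

```python
import os
import tempfile


def main(tree, options):
    """Create each directory listed in options["paths"] inside the tree."""
    for path in options.get("paths", []):
        # Strip the leading slash so the path is joined below the tree.
        target = os.path.join(tree, path.lstrip("/"))
        os.makedirs(target, exist_ok=True)
    return 0


# Exercise the stage body against a throwaway tree instead of a real build.
with tempfile.TemporaryDirectory() as tree:
    status = main(tree, {"paths": ["/etc/example.d", "/var/lib/example"]})
    created = os.path.isdir(os.path.join(tree, "etc/example.d"))
    print(status, created)
```

A real stage would also need to reject paths containing `..`; that check is left out here to keep the sketch short.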

Key Stages Deep Dive

GRUB2 Stage (stages/org.osbuild.grub2.iso)

Purpose: Configure GRUB2 bootloader for ISO images
Implementation: Template-based GRUB configuration generation

GRUB2_EFI_CFG_TEMPLATE = """$defaultentry
function load_video {
  insmod efi_gop
  insmod efi_uga
  insmod video_bochs
  insmod video_cirrus
  insmod all_video
}

load_video
set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2

set timeout=${timeout}
search --no-floppy --set=root -l '${isolabel}'

menuentry 'Install ${product} ${version}' --class fedora --class gnu-linux --class gnu --class os {
    linux ${kernelpath} ${root} quiet
    initrd ${initrdpath}
}
"""
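
The placeholders in the template above follow the syntax of Python's `string.Template`. The sketch below assumes that is how the template is rendered and uses a shortened stand-in template; all substituted values are made up for illustration.

```python
from string import Template

# Shortened stand-in for GRUB2_EFI_CFG_TEMPLATE; only the placeholders matter.
CFG_TEMPLATE = Template("""$defaultentry
set timeout=${timeout}
search --no-floppy --set=root -l '${isolabel}'

menuentry 'Install ${product} ${version}' {
    linux ${kernelpath} ${root} quiet
    initrd ${initrdpath}
}
""")

# substitute() raises KeyError if any placeholder is left without a value.
config = CFG_TEMPLATE.substitute(
    defaultentry="set default=0",
    timeout=60,
    isolabel="EXAMPLE-ISO",
    product="Example OS",
    version="1.0",
    kernelpath="/images/pxeboot/vmlinuz",
    root="inst.stage2=hd:LABEL=EXAMPLE-ISO",
    initrdpath="/images/pxeboot/initrd.img",
)
print(config)
```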

External Tools Used:

  • shim*.efi: Secure boot components
  • grub*.efi: GRUB bootloader binaries
  • unicode.pf2: GRUB font files

bootupd Stage (stages/org.osbuild.bootupd.gen-metadata)

Purpose: Generate bootupd update metadata
Implementation: Chroot execution of bootupctl

import os
import subprocess

# Import path as in upstream osbuild; treat it as an assumption here
from osbuild.util.mnt import MountGuard, MountPermissions

def main(tree):
    with MountGuard() as mounter:
        # Mount essential directories
        for source in ("/dev", "/sys", "/proc"):
            target = os.path.join(tree, source.lstrip("/"))
            os.makedirs(target, exist_ok=True)
            mounter.mount(source, target, permissions=MountPermissions.READ_ONLY)
        
        # Execute bootupctl in chroot
        cmd = ['chroot', tree, '/usr/bin/bootupctl', 'backend', 'generate-update-metadata', '/']
        subprocess.run(cmd, check=True)
    
    return 0

External Tools Used:

  • chroot: Filesystem isolation
  • bootupctl: bootupd control utility
  • mount: Directory mounting

QEMU Assembler (assemblers/org.osbuild.qemu)

Purpose: Create bootable disk images
Implementation: Comprehensive disk image creation with bootloader support

def main(tree, options):
    # Create image file
    # Partition using sfdisk
    # Format filesystems
    # Copy tree contents
    # Install bootloader
    # Convert to requested format
    return 0  # success, matching the stage pattern above

External Tools Used:

  • truncate: File size management
  • sfdisk: Partition table creation
  • mkfs.ext4, mkfs.xfs: Filesystem formatting
  • mount, umount: Partition mounting
  • grub2-mkimage: GRUB image creation
  • qemu-img: Image format conversion
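
As a rough illustration of the first and last steps of that outline, the sketch below builds the argument lists for `truncate` and `qemu-img convert` without running them. The function name and file names are hypothetical, and partitioning, formatting and bootloader installation are left out.

```python
def plan_image_commands(raw_path, final_path, size, image_format):
    """Return the argv lists for allocating and converting a disk image."""
    return [
        # Allocate a sparse file of the requested size.
        ["truncate", "--size", str(size), raw_path],
        # Convert the finished raw image to the requested format.
        ["qemu-img", "convert", "-O", image_format, raw_path, final_path],
    ]


for argv in plan_image_commands("disk.raw", "debian.qcow2", "10G", "qcow2"):
    print(" ".join(argv))
```

Keeping the commands as lists rather than shell strings lets them be passed to `subprocess.run` without any quoting concerns.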

Assembler System

Assembler Types

Image Assemblers

  • org.osbuild.qemu: Bootable disk images (raw, qcow2, vmdk, etc.)
  • org.osbuild.rawfs: Raw filesystem images
  • org.osbuild.tar: Compressed archives
  • org.osbuild.oci-archive: OCI container images

Specialized Assemblers

  • org.osbuild.ostree.commit: OSTree repository commits
  • org.osbuild.error: Error reporting
  • org.osbuild.noop: No-operation assembler

Assembler Implementation

class QemuAssembler:
    def __init__(self, options):
        self.options = options
        self.format = options["format"]
        self.filename = options["filename"]
        self.size = options["size"]
        self.bootloader = options.get("bootloader", {})
    
    def assemble(self, tree):
        # Create image file
        # Set up partitions
        # Copy filesystem contents
        # Install bootloader
        # Convert to target format
        raise NotImplementedError("outline of the assembly steps only")

External Tools and Dependencies

Core System Tools

Filesystem Management

  • parted: Partition table management
  • sfdisk: Scriptable partitioning
  • mkfs.ext4, mkfs.xfs, mkfs.fat: Filesystem creation
  • mount, umount: Filesystem mounting
  • losetup: Loop device management
  • truncate: File size manipulation

Package Management

  • rpm: RPM package management
  • yum, dnf: Package manager frontends
  • apt, apt-get: Debian package management
  • pacman: Arch Linux package management

Bootloader Tools

  • grub2-install: GRUB2 installation
  • grub2-mkimage: GRUB2 image creation
  • bootupctl: bootupd control utility
  • zipl: S390x bootloader

Image and Archive Tools

  • qemu-img: Image format conversion
  • tar: Archive creation and extraction
  • gzip, bzip2, xz: Compression
  • skopeo: Container image operations

System Tools

  • bubblewrap: Process isolation
  • systemd-nspawn: Container management
  • chroot: Filesystem isolation
  • curl: Network file transfer

Build System Dependencies

Python Dependencies (requirements.txt)

jsonschema

System Dependencies

  • python >= 3.6: Python runtime
  • bubblewrap >= 0.4.0: Process isolation
  • bash >= 5.0: Shell execution
  • coreutils >= 8.31: Core utilities
  • curl >= 7.68: Network operations
  • qemu-img >= 4.2.0: Image manipulation
  • rpm >= 4.15: RPM package management
  • tar >= 1.32: Archive operations
  • util-linux >= 235: System utilities
  • skopeo: Container operations
  • python3-librepo: Repository access

External Tool Integration Points

Command Execution

import subprocess

def run_command(cmd, cwd=None, env=None):
    """Execute external command with proper environment"""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
        check=True
    )
    return result

Chroot Execution

def chroot_execute(tree, cmd):
    """Execute command in chroot environment"""
    chroot_cmd = ['chroot', tree] + cmd
    return run_command(chroot_cmd)

Mount Management

class MountGuard:
    """Context manager for mount operations"""
    def __enter__(self):
        return self
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()
    
    def mount(self, source, target, permissions=None):
        # Mount with proper permissions
        pass

Manifest Processing

JSON Schema Validation

osbuild uses JSON Schema for manifest validation:

def validate_manifest(manifest, schema):
    """Validate manifest against JSON schema"""
    validator = jsonschema.Draft7Validator(schema)
    errors = list(validator.iter_errors(manifest))
    return ValidationResult(errors)

Manifest Structure

{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "mirror": "https://deb.debian.org/debian",
            "variant": "minbase"
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.qemu",
    "options": {
      "format": "qcow2",
      "filename": "debian.qcow2",
      "size": "10G",
      "ptuuid": "12345678-1234-1234-1234-123456789012"
    }
  }
}
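
A quick way to inspect a manifest of this shape is to walk its pipelines and stages. The helper below is a small sketch using only the `json` module; `list_stages` is a hypothetical name, and the embedded manifest is a trimmed copy of the structure above.

```python
import json

MANIFEST_TEXT = """
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {"name": "org.osbuild.debian.debootstrap",
         "options": {"suite": "bookworm"}}
      ]
    }
  ]
}
"""


def list_stages(manifest):
    """Return (pipeline name, stage name) pairs in manifest order."""
    pairs = []
    for pipeline in manifest.get("pipelines", []):
        for stage in pipeline.get("stages", []):
            pairs.append((pipeline["name"], stage["name"]))
    return pairs


stages = list_stages(json.loads(MANIFEST_TEXT))
print(stages)
```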

Template Processing

osbuild supports manifest templating through external tools:

# Example with jq for dynamic manifest generation
jq --arg size "$IMAGE_SIZE" --arg format "$IMAGE_FORMAT" '
  .assembler.options.size = $size |
  .assembler.options.format = $format
' template.json > manifest.json
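
The same substitution can be done in Python when jq is not available. This is a sketch under the assumption that the template keeps its assembler options under `assembler.options`, as in the jq filter above; `render_manifest` is a hypothetical helper name.

```python
import copy
import json


def render_manifest(template, size, image_format):
    """Return a copy of the template with the assembler size and format set."""
    # Deep-copy so the caller's template can be reused for other variants.
    manifest = copy.deepcopy(template)
    options = manifest.setdefault("assembler", {}).setdefault("options", {})
    options["size"] = size
    options["format"] = image_format
    return manifest


template = {"assembler": {"name": "org.osbuild.qemu", "options": {}}}
manifest = render_manifest(template, "10G", "qcow2")
print(json.dumps(manifest, indent=2))
```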

Build Execution Engine

Complete Build Execution System

BuildRoot Architecture

class BuildRoot(contextlib.AbstractContextManager):
    """Build Root
    
    This class implements a context-manager that maintains a root file-system
    for contained environments. When entering the context, the required
    file-system setup is performed, and it is automatically torn down when
    exiting.
    """
    
    def __init__(self, root, runner, libdir, var, *, rundir="/run/osbuild"):
        self._exitstack = None
        self._rootdir = root
        self._rundir = rundir
        self._vardir = var
        self._libdir = libdir
        self._runner = runner
        self._apis = []
        self.dev = None
        self.var = None
        self.proc = None
        self.tmp = None
        self.mount_boot = True
        self.caps = None

BuildRoot Setup Process

def __enter__(self):
    self._exitstack = contextlib.ExitStack()
    with self._exitstack:
        # Create temporary directories
        dev = tempfile.TemporaryDirectory(prefix="osbuild-dev-", dir=self._rundir)
        self.dev = self._exitstack.enter_context(dev)
        
        tmp = tempfile.TemporaryDirectory(prefix="osbuild-tmp-", dir=self._vardir)
        self.tmp = self._exitstack.enter_context(tmp)
        
        # Mount tmpfs for /dev first, so the nodes created below land inside it
        subprocess.run(["mount", "-t", "tmpfs", "-o", "nosuid", "none", self.dev], check=True)
        self._exitstack.callback(lambda: subprocess.run(["umount", "--lazy", self.dev], check=True))
        
        # Set up device nodes
        self._mknod(self.dev, "full", 0o666, 1, 7)
        self._mknod(self.dev, "null", 0o666, 1, 3)
        self._mknod(self.dev, "random", 0o666, 1, 8)
        self._mknod(self.dev, "urandom", 0o666, 1, 9)
        self._mknod(self.dev, "tty", 0o666, 5, 0)
        self._mknod(self.dev, "zero", 0o666, 1, 5)
        
        # Prepare all registered API endpoints
        for api in self._apis:
            self._exitstack.enter_context(api)
        
        self._exitstack = self._exitstack.pop_all()
    
    return self

Stage Execution Process

def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # 1. Prepare stage environment
        stage.setup(context)
        
        # 2. Set up buildroot
        with buildroot.BuildRoot(build_tree, runner.path, libdir, store.tmp) as build_root:
            # 3. Configure capabilities
            build_root.caps = DEFAULT_CAPABILITIES | stage.info.caps
            
            # 4. Set up mounts and devices
            for name, mount in stage.mounts.items():
                mount_data = mount_manager.mount(mount)
                mounts[name] = mount_data
            
            # 5. Prepare arguments
            args = {
                "tree": "/run/osbuild/tree",
                "paths": {
                    "devices": devices_mapped,
                    "inputs": inputs_mapped,
                    "mounts": mounts_mounted,
                },
                "devices": devices,
                "inputs": inputs,
                "mounts": mounts,
            }
            
            # 6. Execute stage
            result = build_root.run([f"/run/osbuild/bin/{stage.name}"],
                                   monitor,
                                   timeout=timeout,
                                   binds=binds,
                                   readonly_binds=ro_binds,
                                   extra_env=extra_env,
                                   debug_shell=debug_shell)
        
        # 7. Process output
        context.store_object(stage.id, result)
        
        return result
        
    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise

Command Execution in BuildRoot

def run(self, argv, monitor, timeout=None, binds=None, readonly_binds=None, extra_env=None, debug_shell=False):
    """Runs a command in the buildroot.
    
    Takes the command and arguments, as well as bind mounts to mirror
    in the build-root for this command.
    """
    
    if not self._exitstack:
        raise RuntimeError("No active context")
    
    stage_name = os.path.basename(argv[0])
    mounts = []
    
    # Import directories from the caller-provided root
    imports = ["usr"]
    if self.mount_boot:
        imports.append("boot")
    
    # Build bubblewrap command
    bwrap_cmd = [
        "bwrap",
        "--dev-bind", "/", "/",
        # Mount a fresh procfs and expose the prepared directories
        "--proc", "/proc",
        "--dev-bind", self.dev, "/dev",
        "--bind", self.var, "/var",
        "--bind", self.tmp, "/tmp",
        "--chdir", "/",
    ]
    
    # Add bind mounts
    for bind in binds or []:
        bwrap_cmd.extend(["--bind"] + bind.split(":", 1))
    
    # Add readonly bind mounts
    for bind in readonly_binds or []:
        bwrap_cmd.extend(["--ro-bind"] + bind.split(":", 1))
    
    # Add environment variables
    if extra_env:
        for key, value in extra_env.items():
            bwrap_cmd.extend(["--setenv", key, value])
    
    # Add command
    bwrap_cmd.extend(argv)
    
    # Execute with bubblewrap
    result = subprocess.run(bwrap_cmd,
                           capture_output=True,
                           text=True,
                           timeout=timeout)
    
    return CompletedBuild(result, result.stdout + result.stderr)
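
The bind handling in `run()` can be isolated into a small helper. The sketch below turns `source:dest` strings into bubblewrap's `--bind SRC DEST` and `--ro-bind SRC DEST` options; mirroring the source path when no destination is given is an assumption of this sketch, and `bind_arguments` is a hypothetical name.

```python
def bind_arguments(binds=None, readonly_binds=None):
    """Translate "source:dest" strings into bubblewrap bind options."""
    args = []
    for flag, specs in (("--bind", binds), ("--ro-bind", readonly_binds)):
        for spec in specs or []:
            source, sep, dest = spec.partition(":")
            if not sep:
                # No explicit destination: mirror the source path.
                dest = source
            args += [flag, source, dest]
    return args


args = bind_arguments(
    binds=["/var/tmp/tree:/run/osbuild/tree"],
    readonly_binds=["/usr/lib/osbuild"],
)
print(args)
```

Both bubblewrap options expect exactly two paths, so a spec without a colon has to be expanded before it is appended to the command line.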

Process Isolation and Security

DEFAULT_CAPABILITIES = {
    "CAP_AUDIT_WRITE",
    "CAP_CHOWN",
    "CAP_DAC_OVERRIDE",
    "CAP_DAC_READ_SEARCH",
    "CAP_FOWNER",
    "CAP_FSETID",
    "CAP_IPC_LOCK",
    "CAP_LINUX_IMMUTABLE",
    "CAP_MAC_OVERRIDE",
    "CAP_MKNOD",
    "CAP_NET_BIND_SERVICE",
    "CAP_SETFCAP",
    "CAP_SETGID",
    "CAP_SETPCAP",
    "CAP_SETUID",
    "CAP_SYS_ADMIN",
    "CAP_SYS_CHROOT",
    "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE"
}

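import ctypes

PR_CAPBSET_DROP = 24  # from <linux/prctl.h>

def drop_capabilities(caps_to_keep):
    """Drop all capabilities except those specified from the bounding set.

    ALL_CAPABILITIES is assumed to map capability names to their numbers
    from <linux/capability.h>, e.g. {"CAP_CHOWN": 0, ...}.
    """
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    
    # Drop unwanted capabilities one by one
    for name, number in ALL_CAPABILITIES.items():
        if name in caps_to_keep:
            continue
        # prctl(PR_CAPBSET_DROP, ...) requires CAP_SETPCAP
        if libc.prctl(PR_CAPBSET_DROP, number, 0, 0, 0) != 0:
            errno = ctypes.get_errno()
            raise OSError(errno, f"cannot drop {name}")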
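
When a stage runs under bubblewrap, the same restriction can also be expressed on the command line through its `--cap-drop` option. The following sketch computes those arguments from two sets of capability names; the function name and the sample sets are illustrative assumptions.

```python
def capability_arguments(available, keep):
    """Return bubblewrap options that drop every capability not in `keep`."""
    args = []
    # Sort so the generated command line is stable between runs.
    for cap in sorted(set(available) - set(keep)):
        args += ["--cap-drop", cap]
    return args


available = {"CAP_CHOWN", "CAP_NET_ADMIN", "CAP_SYS_ADMIN", "CAP_SYS_BOOT"}
keep = {"CAP_CHOWN", "CAP_SYS_ADMIN"}
cap_args = capability_arguments(available, keep)
print(cap_args)
```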

Build Process Flow

  1. Manifest Loading: Parse and validate JSON manifest
  2. Pipeline Construction: Build stage dependency graph
  3. Source Resolution: Download and prepare input sources
  4. Stage Execution: Run stages in dependency order
  5. Assembly: Create final artifacts from stage outputs
  6. Output: Export requested objects

Build Environment

class BuildRoot:
    def __init__(self, path, runner):
        self.path = path
        self.runner = runner
        self.mounts = []
        self.devices = []
    
    def setup(self):
        """Set up build environment"""
        # Create build directory
        # Set up isolation
        # Mount required directories
    
    def cleanup(self):
        """Clean up build environment"""
        # Unmount directories
        # Remove temporary files

Stage Execution

def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # Prepare stage environment
        stage.setup(context)
        
        # Execute stage
        result = stage.run(context)
        
        # Process output
        context.store_object(stage.id, result)
        
        return result
    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise

Object Store and Caching

Object Store Architecture

import json
import os

class ObjectStore:
    def __init__(self, path):
        self.path = path
        self.objects = {}
    
    def store_object(self, obj_id, obj):
        """Store object in object store"""
        obj_path = os.path.join(self.path, obj_id)
        os.makedirs(obj_path, exist_ok=True)
        
        # Store object metadata and data
        with open(os.path.join(obj_path, "meta.json"), "w") as f:
            json.dump(obj.meta, f)
        
        obj.export(obj_path)
    
    def get_object(self, obj_id):
        """Retrieve object from store"""
        if obj_id in self.objects:
            return self.objects[obj_id]
        
        obj_path = os.path.join(self.path, obj_id)
        if os.path.exists(obj_path):
            obj = self.load_object(obj_path)
            self.objects[obj_id] = obj
            return obj
        
        return None

Caching Strategy

  1. Object-level caching: Store stage outputs by ID
  2. Dependency tracking: Reuse objects when dependencies haven't changed
  3. Incremental builds: Skip stages with unchanged inputs
  4. Checkpoint support: Save intermediate results for debugging
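
Caching by ID works when the ID is derived from everything that influences a stage's output. The sketch below shows one way to do that, hashing the stage name, its options and the ID of the tree it builds on; the exact fields and encoding are assumptions of this illustration, not a description of osbuild's own scheme.

```python
import hashlib
import json


def stage_id(name, options, base_id=None):
    """Derive a stable identifier from a stage's name, options and base."""
    payload = {"name": name, "options": options, "base": base_id}
    # sort_keys makes the encoding independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


first = stage_id("org.osbuild.apt", {"packages": ["sudo"]})
same = stage_id("org.osbuild.apt", {"packages": ["sudo"]})
changed = stage_id("org.osbuild.apt", {"packages": ["sudo", "vim"]})
# Chaining the base ID means a change early in the pipeline changes every later ID.
chained = stage_id("org.osbuild.users", {}, base_id=first)
print(first == same, first == changed)
```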

Cache Management

def manage_cache(store, max_size=None):
    """Manage object store cache size"""
    if max_size is None:
        return
    
    # Calculate current cache size
    current_size = calculate_cache_size(store.path)
    
    if current_size > max_size:
        # Remove least recently used objects
        remove_lru_objects(store, current_size - max_size)
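
The two helpers used above are not shown in this document. A minimal sketch of what they could look like follows; it assumes each object is a top-level directory in the store, takes the store path rather than the store object, and approximates "least recently used" by the directory's modification time.

```python
import os
import shutil
import tempfile


def calculate_cache_size(path):
    """Sum the sizes of all files below `path`, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def remove_lru_objects(store_path, bytes_to_free):
    """Delete top-level object directories, oldest mtime first."""
    objects = [entry for entry in os.scandir(store_path) if entry.is_dir()]
    objects.sort(key=lambda entry: entry.stat().st_mtime)
    freed = 0
    for entry in objects:
        if freed >= bytes_to_free:
            break
        freed += calculate_cache_size(entry.path)
        shutil.rmtree(entry.path)
    return freed


# Build a tiny throwaway store with two objects of 100 bytes each.
with tempfile.TemporaryDirectory() as store_path:
    for index, name in enumerate(["old", "new"]):
        obj_dir = os.path.join(store_path, name)
        os.makedirs(obj_dir)
        with open(os.path.join(obj_dir, "data"), "wb") as f:
            f.write(b"x" * 100)
        # Give the objects distinct, deterministic modification times.
        os.utime(obj_dir, (1000 + index, 1000 + index))
    size_before = calculate_cache_size(store_path)
    freed = remove_lru_objects(store_path, 50)
    remaining = sorted(os.listdir(store_path))
    print(size_before, freed, remaining)
```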

Security and Isolation

Process Isolation

osbuild uses multiple isolation mechanisms:

Bubblewrap

def run_isolated(cmd, cwd=None, env=None):
    """Run command with bubblewrap isolation"""
    bwrap_cmd = [
        "bwrap",
        "--dev-bind", "/", "/",
        "--proc", "/proc",
        "--dev", "/dev",
        "--chdir", cwd or "/"
    ] + cmd
    
    return run_command(bwrap_cmd, env=env)

Systemd-nspawn

def run_containerized(cmd, tree, env=None):
    """Run command in systemd-nspawn container"""
    nspawn_cmd = [
        "systemd-nspawn",
        "--directory", tree,
        # --bind takes a single PATH[:PATH] argument
        "--bind=/dev:/dev",
        "--bind=/proc:/proc",
        "--bind=/sys:/sys"
    ] + cmd
    
    return run_command(nspawn_cmd, env=env)

Capability Management

Stages run with the DEFAULT_CAPABILITIES set listed under Process Isolation and Security, plus any capabilities a stage requests.

Security Considerations

  1. Process isolation: Prevent host system contamination
  2. Capability dropping: Limit process privileges
  3. Resource limits: Prevent resource exhaustion
  4. Input validation: Validate all external inputs
  5. Output sanitization: Ensure safe output generation

Integration Points

CLI Interface

Main Entry Point (main_cli.py)

def osbuild_cli():
    """Main CLI entry point"""
    args = parse_arguments(sys.argv[1:])
    
    # Load manifest
    manifest = parse_manifest(args.manifest_path)
    
    # Validate manifest
    result = validate_manifest(manifest)
    if not result:
        show_validation(result, args.manifest_path)
        return 1
    
    # Execute build
    store = ObjectStore(args.cache)
    result = build_manifest(manifest, store)
    
    # Export results
    if args.export:
        for export_id in args.export:
            export(export_id, args.output_directory, store, manifest)
    
    return 0

Command Line Options

osbuild [OPTIONS] MANIFEST

Options:
  --cache DIR              Cache directory (default: .osbuild)
  --libdir DIR             Library directory (default: /usr/lib/osbuild)
  --cache-max-size SIZE    Maximum cache size
  --checkpoint ID          Stage to checkpoint
  --export ID              Object to export
  --output-directory DIR   Output directory
  --monitor NAME           Monitor to use
  --stage-timeout SECONDS  Stage timeout
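
The options listed above map naturally onto `argparse`. The parser below is a sketch rather than osbuild's actual CLI code; treating `--checkpoint` and `--export` as repeatable and parsing the timeout as an integer are assumptions of this illustration.

```python
import argparse


def build_parser():
    """Build a parser covering the options listed above."""
    parser = argparse.ArgumentParser(prog="osbuild")
    parser.add_argument("manifest", metavar="MANIFEST")
    parser.add_argument("--cache", metavar="DIR", default=".osbuild")
    parser.add_argument("--libdir", metavar="DIR", default="/usr/lib/osbuild")
    parser.add_argument("--cache-max-size", metavar="SIZE")
    # Repeatable options collect their values into lists.
    parser.add_argument("--checkpoint", metavar="ID", action="append", default=[])
    parser.add_argument("--export", metavar="ID", action="append", default=[])
    parser.add_argument("--output-directory", metavar="DIR")
    parser.add_argument("--monitor", metavar="NAME")
    parser.add_argument("--stage-timeout", metavar="SECONDS", type=int)
    return parser


args = build_parser().parse_args(
    ["--export", "image", "--output-directory", "outputs", "manifest.json"]
)
print(args.export, args.output_directory, args.cache)
```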

API Interface

Python API (api.py)

def build_manifest(manifest, store, libdir=None):
    """Build manifest using object store"""
    # Load stages and assemblers
    # Execute pipeline
    # Return build result

REST API (Future)

@app.route('/api/v1/build', methods=['POST'])
def build_manifest_api():
    """REST API for manifest building"""
    manifest = request.json
    result = build_manifest(manifest, store)
    return jsonify(result)

External Tool Integration

Container Integration

# Docker
docker run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json

# Podman
podman run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json

CI/CD Integration

# GitHub Actions example
- name: Build OS Image
  run: |
    osbuild \
      --cache .osbuild \
      --output-directory outputs \
      manifest.json

Monitoring Integration

class Monitor:
    def __init__(self, name):
        self.name = name
    
    def stage_started(self, stage):
        """Called when stage starts"""
        pass
    
    def stage_completed(self, stage, result):
        """Called when stage completes"""
        pass
    
    def stage_failed(self, stage, error):
        """Called when stage fails"""
        pass
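
A concrete monitor only needs to override the hooks it cares about. The sketch below repeats the base class so that it runs on its own and adds a hypothetical `RecordingMonitor` that collects events in memory; the stage is passed as a plain string here purely for illustration.

```python
class Monitor:
    """Stand-in for the hook interface shown above."""

    def __init__(self, name):
        self.name = name

    def stage_started(self, stage):
        pass

    def stage_completed(self, stage, result):
        pass

    def stage_failed(self, stage, error):
        pass


class RecordingMonitor(Monitor):
    """Collect (event, stage) tuples so a caller can inspect progress."""

    def __init__(self, name="recording"):
        super().__init__(name)
        self.events = []

    def stage_started(self, stage):
        self.events.append(("started", stage))

    def stage_completed(self, stage, result):
        self.events.append(("completed", stage))

    def stage_failed(self, stage, error):
        self.events.append(("failed", stage))


monitor = RecordingMonitor()
monitor.stage_started("org.osbuild.apt")
monitor.stage_completed("org.osbuild.apt", result=0)
print(monitor.events)
```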

Advanced Features

Multi-Architecture Support

osbuild supports multiple architectures through stage variants:

{
  "stages": [
    {
      "name": "org.osbuild.debian.debootstrap",
      "options": {
        "suite": "bookworm",
        "arch": "arm64"
      }
    }
  ]
}

Parallel Execution

Stages can execute in parallel when dependencies allow:

def execute_parallel(stages, context):
    """Execute independent stages in parallel"""
    import concurrent.futures
    
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {
            executor.submit(execute_stage, stage, context): stage
            for stage in stages
        }
        
        for future in concurrent.futures.as_completed(futures):
            stage = futures[future]
            try:
                result = future.result()
                context.store_object(stage.id, result)
            except Exception as e:
                context.mark_failed(stage.id, str(e))
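
Before stages can be handed to an executor, they have to be grouped so that no batch contains a stage and one of its dependencies. The sketch below does this with the standard library's `graphlib` (Python 3.9 or newer); the dependency mapping and stage names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def parallel_batches(dependencies):
    """Group stages into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on cyclic input
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Mark the whole batch finished so its dependents become ready.
        sorter.done(*ready)
    return batches


# Each key maps to the set of stages that must finish before it.
graph = {
    "locale": {"debootstrap"},
    "timezone": {"debootstrap"},
    "users": {"locale", "timezone"},
}
batches = parallel_batches(graph)
print(batches)
```

Each inner list could then be passed to `execute_parallel()` in turn, waiting for one batch to finish before starting the next.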

Checkpoint and Resume

import os
import time

def checkpoint_stage(stage, context):
    """Checkpoint stage execution"""
    if stage.checkpoint:
        # Save stage state
        checkpoint_path = os.path.join(context.store.path, f"{stage.id}.checkpoint")
        stage.save_checkpoint(checkpoint_path)
        
        # Store checkpoint metadata
        context.store.store_object(f"{stage.id}.checkpoint", {
            "type": "checkpoint",
            "stage_id": stage.id,
            "timestamp": time.time()
        })

Remote Execution

class RemoteExecutor:
    def __init__(self, host, user=None, key_file=None):
        self.host = host
        self.user = user
        self.key_file = key_file
    
    def execute_stage(self, stage, context):
        """Execute stage on remote host"""
        # Copy stage to remote host
        # Execute remotely
        # Retrieve results
        pass

Performance Characteristics

Build Time Optimization

  1. Parallel execution: Independent stages run concurrently
  2. Object caching: Reuse unchanged stage outputs
  3. Incremental builds: Skip stages with unchanged inputs
  4. Resource allocation: Optimize memory and CPU usage

Resource Usage

def optimize_resources(stages, available_memory, available_cpus):
    """Optimize resource allocation for stages"""
    # Calculate stage resource requirements
    # Allocate resources optimally
    # Prevent resource contention

Benchmarking

import statistics
import time

def benchmark_build(manifest, store, iterations=5):
    """Benchmark build performance"""
    times = []
    
    for _ in range(iterations):
        # perf_counter() is monotonic, unlike time.time()
        start_time = time.perf_counter()
        build_manifest(manifest, store)
        times.append(time.perf_counter() - start_time)
    
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        # stdev() needs at least two samples
        "std": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min": min(times),
        "max": max(times)
    }

Conclusion

osbuild represents a sophisticated, production-ready build system for operating system artifacts. Its architecture emphasizes:

  1. Reproducibility: Consistent results through declarative manifests
  2. Extensibility: Pluggable stages and assemblers
  3. Performance: Optimized execution and caching
  4. Security: Process isolation and capability management
  5. Integration: Easy integration with existing toolchains

Key Strengths

  • Structured approach: Clear separation of concerns
  • Extensible architecture: Easy to add new stages and assemblers
  • Performance optimization: Efficient caching and parallel execution
  • Security focus: Built-in isolation and capability management
  • Distribution support: Works across multiple Linux distributions

Areas for Enhancement

  • Bootloader integration: Limited built-in bootloader support
  • Package management: Focus on RPM-based systems
  • Image formats: Limited output format support
  • Validation: Basic manifest validation capabilities

Use Cases

  1. Distribution building: Creating official distribution images
  2. Custom images: Building specialized OS images
  3. CI/CD pipelines: Automated image building
  4. Development: Testing and development environments
  5. Production deployment: Creating production-ready images

Complete Workflow Examples

Example 1: Basic Debian System Image

Manifest Definition

{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "mirror": "https://deb.debian.org/debian",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["sudo", "openssh-server", "systemd-sysv"]
          }
        },
        {
          "name": "org.osbuild.users",
          "options": {
            "users": {
              "debian": {
                "password": "$6$rounds=656000$...",
                "shell": "/bin/bash",
                "groups": ["sudo"]
              }
            }
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.tar",
    "options": {
      "filename": "debian-basic.tar.gz",
      "compression": "gzip"
    }
  }
}

Complete Execution Flow

  1. Manifest Loading: Parse JSON manifest and validate schema
  2. Pipeline Construction: Build dependency graph for 3 stages
  3. Source Resolution: Download Debian packages and sources
  4. Stage Execution:
    • debootstrap: Create base Debian filesystem
    • apt: Install packages and dependencies
    • users: Create user accounts and groups
  5. Assembly: Create compressed tar archive
  6. Output: Generate debian-basic.tar.gz
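The flow above can be sketched in a few lines of Python. This is a minimal illustration that follows the manifest shape used in this example (a `pipelines` list plus a top-level `assembler`), not osbuild's own loader; the function name `execution_plan` is hypothetical, and stage options are left out for brevity.

```python
import json

# Abbreviated copy of the Example 1 manifest; options are omitted for brevity.
MANIFEST = """
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {"name": "org.osbuild.debian.debootstrap"},
        {"name": "org.osbuild.apt"},
        {"name": "org.osbuild.users"}
      ]
    }
  ],
  "assembler": {"name": "org.osbuild.tar"}
}
"""


def execution_plan(manifest: dict) -> list:
    """Return the stage names in execution order, followed by the assembler."""
    plan = []
    for pipeline in manifest.get("pipelines", []):
        for stage in pipeline.get("stages", []):
            plan.append(stage["name"])
    assembler = manifest.get("assembler")
    if assembler is not None:
        plan.append(assembler["name"])
    return plan


plan = execution_plan(json.loads(MANIFEST))
for step, name in enumerate(plan, start=1):
    print(f"{step}. {name}")
```

Stages run in the order they are listed, so the plan is simply the stage list with the assembler appended as the final step.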

Example 2: Bootable QEMU Disk Image

Manifest Definition

{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["grub-efi-amd64", "efibootmgr", "linux-image-amd64"]
          }
        },
        {
          "name": "org.osbuild.grub2",
          "options": {
            "root_fs_uuid": "6e4ff95f-f662-45ee-a82a-bdf44a2d0b75",
            "uefi": {
              "vendor": "debian",
              "unified": true
            }
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.qemu",
    "options": {
      "format": "qcow2",
      "filename": "debian-bootable.qcow2",
      "size": "10G",
      "ptuuid": "12345678-1234-1234-1234-123456789012",
      "partitions": [
        {
          "name": "esp",
          "start": 1048576,
          "size": 268435456,
          "type": "fat32",
          "mountpoint": "/boot/efi"
        },
        {
          "name": "root",
          "start": 269484032,
          "size": 10466885632,
          "type": "ext4",
          "mountpoint": "/"
        }
      ]
    }
  }
}

Complete Execution Flow

  1. Manifest Loading: Parse JSON manifest and validate schema
  2. Pipeline Construction: Build dependency graph for 3 stages
  3. Source Resolution: Download Debian packages and GRUB components
  4. Stage Execution:
    • debootstrap: Create base Debian filesystem
    • apt: Install GRUB and kernel packages
    • grub2: Configure GRUB bootloader
  5. Assembly: Create QCOW2 disk image with partitions
  6. Output: Generate debian-bootable.qcow2
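A partition table like the one in this example is easy to get subtly wrong, so it can help to check the arithmetic before building. The sketch below assumes that `start` and `size` are byte offsets and that "10G" means 10 GiB; both are assumptions about how the assembler reads these options. The helper `check_layout` is hypothetical and not part of osbuild.

```python
def check_layout(image_size: int, partitions: list) -> list:
    """Return a list of problems found in a byte-addressed partition layout."""
    problems = []
    previous_end = 0
    for part in sorted(partitions, key=lambda p: p["start"]):
        end = part["start"] + part["size"]
        if part["start"] < previous_end:
            problems.append(f"{part['name']} overlaps the previous partition")
        if end > image_size:
            problems.append(
                f"{part['name']} ends {end - image_size} bytes past the image"
            )
        previous_end = end
    return problems


GIB = 1024 ** 3
MIB = 1024 ** 2

esp = {"name": "esp", "start": 1 * MIB, "size": 256 * MIB}
root_start = esp["start"] + esp["size"]  # first byte after the ESP

# Leave 1 MiB unused at the end of the disk (an illustrative safety margin).
root = {
    "name": "root",
    "start": root_start,
    "size": 10 * GIB - root_start - 1 * MIB,
}

print(root_start, root["size"])
print(check_layout(10 * GIB, [esp, root]))
```

Any root size larger than the image size minus `root_start` would be reported as running past the end of the image.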

Example 3: OSTree-Based System

Manifest Definition

{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["ostree", "systemd", "systemd-sysv"]
          }
        },
        {
          "name": "org.osbuild.ostree",
          "options": {
            "repository": "/var/lib/ostree/repo",
            "branch": "debian/bookworm/x86_64/standard"
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.ostree.commit",
    "options": {
      "repository": "debian-ostree",
      "branch": "debian/bookworm/x86_64/standard"
    }
  }
}

Complete Execution Flow

  1. Manifest Loading: Parse JSON manifest and validate schema
  2. Pipeline Construction: Build dependency graph for 3 stages
  3. Source Resolution: Download Debian packages, including ostree
  4. Stage Execution:
    • debootstrap: Create base Debian filesystem
    • apt: Install OSTree and systemd packages
    • ostree: Configure OSTree repository
  5. Assembly: Create OSTree commit
  6. Output: Generate OSTree repository with commit
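In this manifest the branch name appears twice, once in the ostree stage and once in the assembler, and the two values have to agree for the commit to land on the configured branch. A small pre-flight check along these lines can catch a mismatch; it is a sketch against the manifest shape shown here, and `ostree_branches` is a hypothetical helper rather than an osbuild API.

```python
def ostree_branches(manifest: dict) -> set:
    """Collect every "branch" option set on stages and on the assembler."""
    branches = set()
    for pipeline in manifest.get("pipelines", []):
        for stage in pipeline.get("stages", []):
            branch = stage.get("options", {}).get("branch")
            if branch:
                branches.add(branch)
    branch = manifest.get("assembler", {}).get("options", {}).get("branch")
    if branch:
        branches.add(branch)
    return branches


# Reduced version of the Example 3 manifest: only the branch options are kept.
manifest = {
    "pipelines": [
        {
            "name": "build",
            "stages": [
                {
                    "name": "org.osbuild.ostree",
                    "options": {"branch": "debian/bookworm/x86_64/standard"},
                }
            ],
        }
    ],
    "assembler": {
        "name": "org.osbuild.ostree.commit",
        "options": {"branch": "debian/bookworm/x86_64/standard"},
    },
}

branches = ostree_branches(manifest)
if len(branches) > 1:
    print("branch mismatch:", sorted(branches))
else:
    print("branch is consistent:", sorted(branches))
```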

Conclusion

osbuild provides a solid foundation for building operating system images with a focus on reproducibility, performance, and extensibility. Its stage-based architecture makes it easy to customize and extend while maintaining consistency and reliability.

Key Strengths

  • Structured approach: Clear separation of concerns with stages and assemblers
  • Extensible architecture: Easy to add new stages and assemblers
  • Performance optimization: Efficient caching and parallel execution
  • Security focus: Built-in isolation and capability management
  • Distribution support: Works across multiple Linux distributions
  • Declarative manifests: JSON-based configuration with schema validation
  • Process isolation: Bubblewrap and systemd-nspawn integration
  • Object caching: Intelligent caching of stage outputs
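To illustrate how caching of stage outputs can work, here is a simplified sketch in which each stage gets an identifier derived from its name, its options, and the identifier of the stage before it. This is an illustration of the general idea only, not osbuild's actual ID computation; `stage_id` and `pipeline_ids` are hypothetical names.

```python
import hashlib
import json
from typing import Optional


def stage_id(name: str, options: dict, parent_id: Optional[str]) -> str:
    """Derive a cache key from the stage, its options, and the stage before it."""
    payload = json.dumps(
        {"name": name, "options": options, "parent": parent_id},
        sort_keys=True,  # stable key order so equal inputs hash identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def pipeline_ids(stages: list) -> list:
    """Compute one chained identifier per stage, in execution order."""
    ids = []
    parent = None
    for stage in stages:
        parent = stage_id(stage["name"], stage.get("options", {}), parent)
        ids.append(parent)
    return ids


stages = [
    {"name": "org.osbuild.debian.debootstrap", "options": {"suite": "bookworm"}},
    {"name": "org.osbuild.apt", "options": {"packages": ["sudo"]}},
    {"name": "org.osbuild.users", "options": {}},
]
before = pipeline_ids(stages)

# Changing the second stage's options invalidates it and everything after it,
# while the first stage keeps its identifier and can be served from the cache.
stages[1]["options"]["packages"].append("openssh-server")
after = pipeline_ids(stages)

for old, new in zip(before, after):
    print("reuse" if old == new else "rebuild")
```

Because each identifier includes its parent, a change early in the pipeline forces every later stage to rebuild, while unchanged leading stages can be reused.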

Areas for Enhancement

  • Bootloader integration: Limited built-in bootloader support
  • Package management: Focus on RPM-based systems
  • Image formats: Limited output format support
  • Validation: Basic manifest validation capabilities
  • Template support: No built-in templating system
  • Cross-architecture: Limited architecture support

Complete Process Summary

osbuild implements a complete end-to-end image building pipeline that:

  1. Processes Manifests: JSON with schema validation
  2. Manages Stages: Atomic, composable building blocks
  3. Executes Builds: Isolated execution with security controls
  4. Handles Objects: Intelligent caching and storage
  5. Manages Devices: Loop devices and partition management
  6. Provides Assembly: Assemblers for tar archives, QEMU disk images, and OSTree commits
  7. Ensures Security: Process isolation and capability dropping
  8. Generates Artifacts: Images, archives, and repositories

The system's architecture emphasizes reproducibility, security, and extensibility while maintaining performance through intelligent caching and isolated execution environments.