# osbuild Comprehensive Top-to-Bottom Analysis

## Overview

osbuild is a pipeline-based build system for operating system artifacts that defines a universal pipeline description and execution engine. It produces artifacts such as operating system images through a structured, stage-based approach that emphasizes reproducibility and extensibility.

This document provides a complete top-to-bottom analysis of the entire osbuild process, from manifest parsing to final artifact generation.

## Table of Contents

1. [Complete Process Flow](#complete-process-flow)
2. [Core Architecture](#core-architecture)
3. [Pipeline System](#pipeline-system)
4. [Stage System](#stage-system)
5. [Assembler System](#assembler-system)
6. [Build Execution Engine](#build-execution-engine)
7. [Object Store and Caching](#object-store-and-caching)
8. [Security and Isolation](#security-and-isolation)
9. [External Tools and Dependencies](#external-tools-and-dependencies)
10. [Manifest Processing](#manifest-processing)
11. [Integration Points](#integration-points)
12. [Complete Workflow Examples](#complete-workflow-examples)

## Complete Process Flow

### End-to-End Workflow

```
1. Manifest Loading → 2. Schema Validation → 3. Pipeline Construction →
4. Stage Execution  → 5. Object Storage    → 6. Assembly →
7. Artifact Generation → 8. Cleanup
```

### Detailed Process Steps

#### **Phase 1: Manifest Processing**

1. **File Loading**: Read JSON manifest file from disk or stdin
2. **Schema Validation**: Validate against JSON schemas for stages and assemblers
3. **Pipeline Construction**: Build stage dependency graph and execution order
4. **Source Resolution**: Download and prepare input sources
5. **Configuration**: Set up build environment and options

#### **Phase 2: Build Environment Setup**

1. **BuildRoot Creation**: Initialize isolated build environment
2. **Resource Allocation**: Set up temporary directories and mounts
3. **Capability Management**: Configure process capabilities and security
4. **API Registration**: Set up communication endpoints
5. **Device Management**: Configure loop devices and partitions

#### **Phase 3: Stage Execution**

1. **Dependency Resolution**: Execute stages in dependency order
2. **Input Processing**: Map input objects to stage requirements
3. **Environment Setup**: Mount filesystems and configure devices
4. **Stage Execution**: Run stage scripts in isolated environment
5. **Output Collection**: Store stage results in object store

#### **Phase 4: Assembly and Output**

1. **Object Collection**: Gather all stage outputs
2. **Assembly Execution**: Run assembler to create final artifact
3. **Format Conversion**: Convert to requested output format
4. **Artifact Generation**: Create final image or archive
5. **Cleanup**: Remove temporary files and mounts
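Phase 1 is straightforward to illustrate in isolation. The following is a minimal, self-contained sketch of manifest loading and schema validation using the `jsonschema` library that osbuild depends on; the schema shown here is a toy stand-in, not osbuild's real stage or assembler schemas.

```python
# Minimal sketch of Phase 1 (manifest loading and schema validation).
# TOY_SCHEMA is an illustrative placeholder, not osbuild's real schema.
import json

import jsonschema

TOY_SCHEMA = {
    "type": "object",
    "required": ["version", "pipelines"],
    "properties": {
        "version": {"type": "string"},
        "pipelines": {"type": "array"},
    },
}


def load_and_validate(path):
    """Load a manifest file and return it together with any schema errors."""
    with open(path, encoding="utf-8") as f:
        manifest = json.load(f)
    validator = jsonschema.Draft7Validator(TOY_SCHEMA)
    errors = [e.message for e in validator.iter_errors(manifest)]
    return manifest, errors


# Usage: manifest, errors = load_and_validate("manifest.json")
```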
### Process Architecture Overview

```
osbuild CLI → Manifest Parser → Pipeline Builder → Stage Executor → Object Store → Assembler → Final Artifact
     ↓              ↓                  ↓                  ↓              ↓             ↓              ↓
 Main Entry    JSON Schema     Dependency Graph     Stage Runner       Cache      Output Gen    Image/Archive
```

## Core Architecture

### Design Philosophy

osbuild follows a **declarative pipeline architecture** where:

- **Manifests** define the complete build process as JSON
- **Stages** are atomic, composable building blocks
- **Assemblers** create final artifacts from stage outputs
- **Pipelines** orchestrate stage execution and data flow

### Key Components

```
osbuild CLI → Manifest Parser → Pipeline Executor → Stage Runner → Assembler → Artifacts
     ↓              ↓                  ↓                  ↓            ↓            ↓
 Main Entry    JSON Schema      Pipeline Builder     Stage Exec   Output Gen   Final Files
```

### Architecture Principles

1. **Stages are never broken, only deprecated** - Same manifest always produces same output
2. **Explicit over implicit** - No reliance on tree state
3. **Pipeline independence** - Tree is empty at beginning of each pipeline
4. **Machine-generated manifests** - No convenience functions for manual creation
5. **Confined build environment** - Security against accidental misuse
6. **Distribution compatibility** - Python 3.6+ support

## Pipeline System

### Pipeline Structure

A pipeline is a directed acyclic graph (DAG) of stages:

```json
{
  "pipeline": {
    "build": {
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": { ... }
        },
        {
          "name": "org.osbuild.apt",
          "options": { ... }
        }
      ]
    },
    "assembler": {
      "name": "org.osbuild.qemu",
      "options": { ... }
    }
  }
}
```

### Pipeline Execution Model

```python
class Pipeline:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info
        self.sources = source_options
        self.build = build
        self.base = base
        self.options = options
        self.source_epoch = source_epoch
        self.checkpoint = False
        self.inputs = {}
        self.devices = {}
        self.mounts = {}
```

### Pipeline Lifecycle

1. **Initialization**: Load manifest and validate schema
2. **Preparation**: Set up build environment and dependencies
3. **Execution**: Run stages in dependency order
4. **Assembly**: Create final artifacts from stage outputs
5. **Cleanup**: Remove temporary files and resources
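The dependency-ordered execution described above can be sketched with a topological sort. This is illustrative only: the stage dictionary with a `requires` list is a simplified stand-in for osbuild's internal pipeline representation.

```python
# Minimal sketch of dependency-ordered execution over a stage DAG.
# The "requires" data model is a simplification for illustration.
from graphlib import TopologicalSorter  # Python 3.9+


def execution_order(stages):
    """Return stage names in an order that satisfies all dependencies."""
    graph = {name: set(spec.get("requires", [])) for name, spec in stages.items()}
    return list(TopologicalSorter(graph).static_order())


stages = {
    "debootstrap": {"requires": []},
    "apt":         {"requires": ["debootstrap"]},
    "users":       {"requires": ["apt"]},
}

for name in execution_order(stages):
    print("would run stage:", name)
```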
## Stage System

### Stage Architecture

Stages are the core building blocks of osbuild:

```python
class Stage:
    def __init__(self, info, source_options, build, base, options, source_epoch):
        self.info = info                  # Stage metadata
        self.sources = source_options     # Input sources
        self.build = build                # Build configuration
        self.base = base                  # Base tree
        self.options = options            # Stage-specific options
        self.source_epoch = source_epoch  # Source timestamp
        self.checkpoint = False           # Checkpoint flag
        self.inputs = {}                  # Input objects
        self.devices = {}                 # Device configurations
        self.mounts = {}                  # Mount configurations
```

### Stage Types

#### **System Construction Stages**

- `org.osbuild.debian.debootstrap`: Debian base system creation
- `org.osbuild.rpm`: RPM package installation
- `org.osbuild.ostree`: OSTree repository management
- `org.osbuild.apt`: Debian package management

#### **Filesystem Stages**

- `org.osbuild.overlay`: File/directory copying
- `org.osbuild.mkdir`: Directory creation
- `org.osbuild.copy`: File copying operations
- `org.osbuild.symlink`: Symbolic link creation

#### **Configuration Stages**

- `org.osbuild.users`: User account management
- `org.osbuild.fstab`: Filesystem table configuration
- `org.osbuild.locale`: Locale configuration
- `org.osbuild.timezone`: Timezone setup

#### **Bootloader Stages**

- `org.osbuild.grub2`: GRUB2 bootloader configuration
- `org.osbuild.bootupd`: Modern bootloader management
- `org.osbuild.zipl`: S390x bootloader

#### **Image Creation Stages**

- `org.osbuild.image`: Raw image creation
- `org.osbuild.qemu`: QEMU image assembly
- `org.osbuild.tar`: Archive creation
- `org.osbuild.oci-archive`: OCI container images

### Stage Implementation Pattern

Each stage follows a consistent pattern:

```python
#!/usr/bin/python3
import os
import sys

import osbuild.api


def main(tree, options):
    # Stage-specific logic:
    # process options and manipulate the filesystem tree
    return 0


if __name__ == '__main__':
    args = osbuild.api.arguments()
    ret = main(args["tree"], args["options"])
    sys.exit(ret)
```

### Key Stages Deep Dive

#### **GRUB2 Stage** (`stages/org.osbuild.grub2.iso`)

**Purpose**: Configure GRUB2 bootloader for ISO images

**Implementation**: Template-based GRUB configuration generation

```python
GRUB2_EFI_CFG_TEMPLATE = """$defaultentry

function load_video {
        insmod efi_gop
        insmod efi_uga
        insmod video_bochs
        insmod video_cirrus
        insmod all_video
}

load_video
set gfxpayload=keep
insmod gzio
insmod part_gpt
insmod ext2

set timeout=${timeout}

search --no-floppy --set=root -l '${isolabel}'

menuentry 'Install ${product} ${version}' --class fedora --class gnu-linux --class gnu --class os {
        linux ${kernelpath} ${root} quiet
        initrd ${initrdpath}
}
"""
```

**External Tools Used**:

- `shim*.efi`: Secure boot components
- `grub*.efi`: GRUB bootloader binaries
- `unicode.pf2`: GRUB font files

#### **bootupd Stage** (`stages/org.osbuild.bootupd.gen-metadata`)

**Purpose**: Generate bootupd update metadata

**Implementation**: Chroot execution of bootupctl

```python
import os
import subprocess

# MountGuard and the permissions flag used below are provided by
# osbuild's mount utilities.


def main(tree):
    with MountGuard() as mounter:
        # Mount essential directories read-only inside the target tree
        for source in ("/dev", "/sys", "/proc"):
            target = os.path.join(tree, source.lstrip("/"))
            os.makedirs(target, exist_ok=True)
            mounter.mount(source, target, permissions=MountPermissions.READ_ONLY)

        # Execute bootupctl in a chroot of the target tree
        cmd = ['chroot', tree,
               '/usr/bin/bootupctl', 'backend', 'generate-update-metadata', '/']
        subprocess.run(cmd, check=True)

    return 0
```

**External Tools Used**:

- `chroot`: Filesystem isolation
- `bootupctl`: bootupd control utility
- `mount`: Directory mounting
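To make the implementation pattern above concrete, here is a hedged sketch of a hypothetical stage, `org.osbuild.example.motd` (not a real osbuild stage), that writes a message-of-the-day file into the tree.

```python
#!/usr/bin/python3
# Hedged sketch of a hypothetical stage ("org.osbuild.example.motd" is not a
# real osbuild stage) following the pattern shown above: read options,
# modify the tree, return an exit code.
import os
import sys

import osbuild.api


def main(tree, options):
    message = options.get("message", "built with osbuild")
    path = os.path.join(tree, "etc/motd")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(message + "\n")
    return 0


if __name__ == '__main__':
    args = osbuild.api.arguments()
    sys.exit(main(args["tree"], args["options"]))
```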
#### **QEMU Assembler** (`assemblers/org.osbuild.qemu`)

**Purpose**: Create bootable disk images

**Implementation**: Comprehensive disk image creation with bootloader support

```python
def main(tree, options):
    # Create image file
    # Partition using sfdisk
    # Format filesystems
    # Copy tree contents
    # Install bootloader
    # Convert to requested format
    pass
```

**External Tools Used**:

- `truncate`: File size management
- `sfdisk`: Partition table creation
- `mkfs.ext4`, `mkfs.xfs`: Filesystem formatting
- `mount`, `umount`: Partition mounting
- `grub2-mkimage`: GRUB image creation
- `qemu-img`: Image format conversion

## Assembler System

### Assembler Types

#### **Image Assemblers**

- `org.osbuild.qemu`: Bootable disk images (raw, qcow2, vmdk, etc.)
- `org.osbuild.rawfs`: Raw filesystem images
- `org.osbuild.tar`: Compressed archives
- `org.osbuild.oci-archive`: OCI container images

#### **Specialized Assemblers**

- `org.osbuild.ostree.commit`: OSTree repository commits
- `org.osbuild.error`: Error reporting
- `org.osbuild.noop`: No-operation assembler

### Assembler Implementation

```python
class QemuAssembler:
    def __init__(self, options):
        self.options = options
        self.format = options["format"]
        self.filename = options["filename"]
        self.size = options["size"]
        self.bootloader = options.get("bootloader", {})

    def assemble(self, tree):
        # Create image file
        # Set up partitions
        # Copy filesystem contents
        # Install bootloader
        # Convert to target format
        return result
```

## External Tools and Dependencies

### Core System Tools

#### **Filesystem Management**

- `parted`: Partition table management
- `sfdisk`: Scriptable partitioning
- `mkfs.ext4`, `mkfs.xfs`, `mkfs.fat`: Filesystem creation
- `mount`, `umount`: Filesystem mounting
- `losetup`: Loop device management
- `truncate`: File size manipulation

#### **Package Management**

- `rpm`: RPM package management
- `yum`, `dnf`: Package manager frontends
- `apt`, `apt-get`: Debian package management
- `pacman`: Arch Linux package management

#### **Bootloader Tools**

- `grub2-install`: GRUB2 installation
- `grub2-mkimage`: GRUB2 image creation
- `bootupctl`: bootupd control utility
- `zipl`: S390x bootloader

#### **Image and Archive Tools**

- `qemu-img`: Image format conversion
- `tar`: Archive creation and extraction
- `gzip`, `bzip2`, `xz`: Compression
- `skopeo`: Container image operations

#### **System Tools**

- `bubblewrap`: Process isolation
- `systemd-nspawn`: Container management
- `chroot`: Filesystem isolation
- `curl`: Network file transfer

### Build System Dependencies

#### **Python Dependencies** (`requirements.txt`)

```
jsonschema
```

#### **System Dependencies**

- `python >= 3.6`: Python runtime
- `bubblewrap >= 0.4.0`: Process isolation
- `bash >= 5.0`: Shell execution
- `coreutils >= 8.31`: Core utilities
- `curl >= 7.68`: Network operations
- `qemu-img >= 4.2.0`: Image manipulation
- `rpm >= 4.15`: RPM package management
- `tar >= 1.32`: Archive operations
- `util-linux >= 2.35`: System utilities
- `skopeo`: Container operations
- `python3-librepo`: Repository access

### External Tool Integration Points

#### **Command Execution**

```python
import subprocess


def run_command(cmd, cwd=None, env=None):
    """Execute external command with proper environment"""
    result = subprocess.run(
        cmd,
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
        check=True
    )
    return result
```

#### **Chroot Execution**

```python
def chroot_execute(tree, cmd):
    """Execute command in chroot environment"""
    chroot_cmd = ['chroot', tree] + cmd
    return run_command(chroot_cmd)
```
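As a usage sketch for the helpers above, the following converts a raw image to qcow2 with `qemu-img`, one of the external tools listed earlier; the file names are illustrative.

```python
# Usage sketch: convert a raw disk image to qcow2 with qemu-img.
# File names are examples only.
import subprocess


def convert_image(raw_path, qcow2_path):
    # qemu-img convert -f <input format> -O <output format> SRC DST
    cmd = ["qemu-img", "convert", "-f", "raw", "-O", "qcow2", raw_path, qcow2_path]
    subprocess.run(cmd, capture_output=True, text=True, check=True)


# convert_image("disk.img", "disk.qcow2")
```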
#### **Mount Management**

```python
class MountGuard:
    """Context manager for mount operations"""

    def __init__(self):
        self._mounts = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()

    def mount(self, source, target, permissions=None):
        # Mount with proper permissions and record the mount for cleanup
        pass

    def cleanup(self):
        # Unmount recorded targets in reverse order
        pass
```

## Manifest Processing

### JSON Schema Validation

osbuild uses JSON Schema for manifest validation:

```python
import jsonschema


def validate_manifest(manifest, schema):
    """Validate manifest against JSON schema"""
    validator = jsonschema.Draft7Validator(schema)
    errors = list(validator.iter_errors(manifest))
    return ValidationResult(errors)
```

### Manifest Structure

```json
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "mirror": "https://deb.debian.org/debian",
            "variant": "minbase"
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.qemu",
    "options": {
      "format": "qcow2",
      "filename": "debian.qcow2",
      "size": "10G",
      "ptuuid": "12345678-1234-1234-1234-123456789012"
    }
  }
}
```

### Template Processing

osbuild supports manifest templating through external tools:

```bash
# Example with jq for dynamic manifest generation
jq --arg size "$IMAGE_SIZE" --arg format "$IMAGE_FORMAT" '
  .assembler.options.size = $size |
  .assembler.options.format = $format
' template.json > manifest.json
```

## Build Execution Engine

### Complete Build Execution System

#### **BuildRoot Architecture**

```python
import contextlib


class BuildRoot(contextlib.AbstractContextManager):
    """Build Root

    This class implements a context-manager that maintains a root
    file-system for contained environments. When entering the context,
    the required file-system setup is performed, and it is automatically
    torn down when exiting.
    """

    def __init__(self, root, runner, libdir, var, *, rundir="/run/osbuild"):
        self._exitstack = None
        self._rootdir = root
        self._rundir = rundir
        self._vardir = var
        self._libdir = libdir
        self._runner = runner
        self._apis = []
        self.dev = None
        self.var = None
        self.proc = None
        self.tmp = None
        self.mount_boot = True
        self.caps = None
```

#### **BuildRoot Setup Process**

```python
def __enter__(self):
    self._exitstack = contextlib.ExitStack()
    with self._exitstack:
        # Create temporary directories
        dev = tempfile.TemporaryDirectory(prefix="osbuild-dev-", dir=self._rundir)
        self.dev = self._exitstack.enter_context(dev)

        tmp = tempfile.TemporaryDirectory(prefix="osbuild-tmp-", dir=self._vardir)
        self.tmp = self._exitstack.enter_context(tmp)

        # Mount a tmpfs for /dev, then populate it with device nodes
        subprocess.run(["mount", "-t", "tmpfs", "-o", "nosuid", "none", self.dev],
                       check=True)
        self._exitstack.callback(
            lambda: subprocess.run(["umount", "--lazy", self.dev], check=True))

        # Set up device nodes
        self._mknod(self.dev, "full", 0o666, 1, 7)
        self._mknod(self.dev, "null", 0o666, 1, 3)
        self._mknod(self.dev, "random", 0o666, 1, 8)
        self._mknod(self.dev, "urandom", 0o666, 1, 9)
        self._mknod(self.dev, "tty", 0o666, 5, 0)
        self._mknod(self.dev, "zero", 0o666, 1, 5)

        # Prepare all registered API endpoints
        for api in self._apis:
            self._exitstack.enter_context(api)

        self._exitstack = self._exitstack.pop_all()

    return self
```
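Used as a context manager, the class might be driven roughly as follows; the paths and runner name are illustrative, and the body is elided.

```python
# Hedged usage sketch of the BuildRoot context manager described above,
# following the constructor signature shown (root, runner, libdir, var).
# All paths and the runner name are illustrative.
with BuildRoot("/path/to/build-tree",
               "/usr/lib/osbuild/runners/org.osbuild.linux",
               "/usr/lib/osbuild",
               "/var/cache/osbuild") as build_root:
    build_root.caps = DEFAULT_CAPABILITIES
    # Commands passed to build_root.run(...) now execute inside bubblewrap,
    # against the isolated /dev, /proc and /tmp set up in __enter__()
    ...
```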
#### **Stage Execution Process**

```python
def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # 1. Prepare stage environment
        stage.setup(context)

        # 2. Set up buildroot
        with buildroot.BuildRoot(build_tree, runner.path, libdir, store.tmp) as build_root:

            # 3. Configure capabilities
            build_root.caps = DEFAULT_CAPABILITIES | stage.info.caps

            # 4. Set up mounts and devices
            for name, mount in stage.mounts.items():
                mount_data = mount_manager.mount(mount)
                mounts[name] = mount_data

            # 5. Prepare arguments
            args = {
                "tree": "/run/osbuild/tree",
                "paths": {
                    "devices": devices_mapped,
                    "inputs": inputs_mapped,
                    "mounts": mounts_mounted,
                },
                "devices": devices,
                "inputs": inputs,
                "mounts": mounts,
            }

            # 6. Execute stage
            result = build_root.run([f"/run/osbuild/bin/{stage.name}"],
                                    monitor,
                                    timeout=timeout,
                                    binds=binds,
                                    readonly_binds=ro_binds,
                                    extra_env=extra_env,
                                    debug_shell=debug_shell)

            # 7. Process output
            context.store_object(stage.id, result)

            return result

    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise
```

#### **Command Execution in BuildRoot**

```python
def run(self, argv, monitor, timeout=None, binds=None, readonly_binds=None,
        extra_env=None, debug_shell=False):
    """Runs a command in the buildroot.

    Takes the command and arguments, as well as bind mounts to mirror
    in the build-root for this command.
    """
    if not self._exitstack:
        raise RuntimeError("No active context")

    stage_name = os.path.basename(argv[0])
    mounts = []

    # Import directories from the caller-provided root
    imports = ["usr"]
    if self.mount_boot:
        imports.append("boot")

    # Build the bubblewrap command line
    bwrap_cmd = [
        "bwrap",
        "--dev-bind", "/", "/",
        "--proc", self.proc,
        "--dev", self.dev,
        "--bind", self.var, "/var",
        "--bind", self.tmp, "/tmp",
        "--chdir", "/",
    ]

    # Add bind mounts
    for bind in binds or []:
        bwrap_cmd.extend(["--bind"] + bind.split(":", 1))

    # Add readonly bind mounts
    for bind in readonly_binds or []:
        bwrap_cmd.extend(["--ro-bind"] + bind.split(":", 1))

    # Add environment variables
    if extra_env:
        for key, value in extra_env.items():
            bwrap_cmd.extend(["--setenv", key, value])

    # Add the command itself
    bwrap_cmd.extend(argv)

    # Execute with bubblewrap
    result = subprocess.run(bwrap_cmd, capture_output=True, text=True, timeout=timeout)
    return CompletedBuild(result, result.stdout + result.stderr)
```

#### **Process Isolation and Security**

```python
DEFAULT_CAPABILITIES = {
    "CAP_AUDIT_WRITE", "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH",
    "CAP_FOWNER", "CAP_FSETID", "CAP_IPC_LOCK", "CAP_LINUX_IMMUTABLE",
    "CAP_MAC_OVERRIDE", "CAP_MKNOD", "CAP_NET_BIND_SERVICE", "CAP_SETFCAP",
    "CAP_SETGID", "CAP_SETPCAP", "CAP_SETUID", "CAP_SYS_ADMIN",
    "CAP_SYS_CHROOT", "CAP_SYS_NICE", "CAP_SYS_RESOURCE"
}


def drop_capabilities(caps_to_keep):
    """Drop all bounding-set capabilities except those specified.

    Sketch using prctl(PR_CAPBSET_DROP); ALL_CAPABILITIES is assumed to
    map capability names to the numbers from <linux/capability.h>.
    """
    import ctypes

    PR_CAPBSET_DROP = 24  # from <linux/prctl.h>
    libc = ctypes.CDLL("libc.so.6", use_errno=True)

    for name, number in ALL_CAPABILITIES.items():
        if name in caps_to_keep:
            continue
        # Remove the capability from the calling thread's bounding set
        if libc.prctl(PR_CAPBSET_DROP, number, 0, 0, 0) != 0:
            raise OSError(ctypes.get_errno(), f"failed to drop {name}")
```

### Build Process Flow

1. **Manifest Loading**: Parse and validate JSON manifest
2. **Pipeline Construction**: Build stage dependency graph
3. **Source Resolution**: Download and prepare input sources
4. **Stage Execution**: Run stages in dependency order
5. **Assembly**: Create final artifacts from stage outputs
6. **Output**: Export requested objects
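The same kind of confinement can be reproduced from a shell with plain bubblewrap, which is useful when debugging stage behaviour. The flags below are standard `bwrap` options; the paths and the wrapped command are examples only.

```bash
# Stand-alone bubblewrap invocation illustrating the confinement described
# above: read-only host /usr, fresh /proc and /dev, private /tmp, no network.
bwrap \
  --ro-bind /usr /usr \
  --symlink usr/bin /bin \
  --symlink usr/lib /lib \
  --proc /proc \
  --dev /dev \
  --tmpfs /tmp \
  --unshare-net \
  --die-with-parent \
  /usr/bin/id
```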
### Build Environment

A simplified view of the build root:

```python
class BuildRoot:
    def __init__(self, path, runner):
        self.path = path
        self.runner = runner
        self.mounts = []
        self.devices = []

    def setup(self):
        """Set up build environment"""
        # Create build directory
        # Set up isolation
        # Mount required directories

    def cleanup(self):
        """Clean up build environment"""
        # Unmount directories
        # Remove temporary files
```

### Stage Execution

And a simplified view of stage execution:

```python
def execute_stage(stage, context):
    """Execute a single stage"""
    try:
        # Prepare stage environment
        stage.setup(context)

        # Execute stage
        result = stage.run(context)

        # Process output
        context.store_object(stage.id, result)

        return result

    except Exception as e:
        # Handle errors
        context.mark_failed(stage.id, str(e))
        raise
```

## Object Store and Caching

### Object Store Architecture

```python
import json
import os


class ObjectStore:
    def __init__(self, path):
        self.path = path
        self.objects = {}

    def store_object(self, obj_id, obj):
        """Store object in object store"""
        obj_path = os.path.join(self.path, obj_id)
        os.makedirs(obj_path, exist_ok=True)

        # Store object metadata and data
        with open(os.path.join(obj_path, "meta.json"), "w") as f:
            json.dump(obj.meta, f)
        obj.export(obj_path)

    def get_object(self, obj_id):
        """Retrieve object from store"""
        if obj_id in self.objects:
            return self.objects[obj_id]

        obj_path = os.path.join(self.path, obj_id)
        if os.path.exists(obj_path):
            obj = self.load_object(obj_path)
            self.objects[obj_id] = obj
            return obj

        return None
```

### Caching Strategy

1. **Object-level caching**: Store stage outputs by ID
2. **Dependency tracking**: Reuse objects when dependencies haven't changed
3. **Incremental builds**: Skip stages with unchanged inputs
4. **Checkpoint support**: Save intermediate results for debugging

### Cache Management

```python
def manage_cache(store, max_size=None):
    """Manage object store cache size"""
    if max_size is None:
        return

    # Calculate current cache size
    current_size = calculate_cache_size(store.path)

    if current_size > max_size:
        # Remove least recently used objects
        remove_lru_objects(store, current_size - max_size)
```

## Security and Isolation

### Process Isolation

osbuild uses multiple isolation mechanisms:

#### **Bubblewrap**

```python
def run_isolated(cmd, cwd=None, env=None):
    """Run command with bubblewrap isolation"""
    bwrap_cmd = [
        "bwrap",
        "--dev-bind", "/", "/",
        "--proc", "/proc",
        "--dev", "/dev",
        "--chdir", cwd or "/"
    ] + cmd

    return run_command(bwrap_cmd, env=env)
```

#### **Systemd-nspawn**

```python
def run_containerized(cmd, tree, env=None):
    """Run command in systemd-nspawn container"""
    nspawn_cmd = [
        "systemd-nspawn",
        "--directory", tree,
        "--bind=/dev",
        "--bind=/proc",
        "--bind=/sys"
    ] + cmd

    return run_command(nspawn_cmd, env=env)
```

### Capability Management

```python
DEFAULT_CAPABILITIES = {
    "CAP_AUDIT_WRITE", "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH",
    "CAP_FOWNER", "CAP_FSETID", "CAP_IPC_LOCK", "CAP_LINUX_IMMUTABLE",
    "CAP_MAC_OVERRIDE", "CAP_MKNOD", "CAP_NET_BIND_SERVICE", "CAP_SETFCAP",
    "CAP_SETGID", "CAP_SETPCAP", "CAP_SETUID", "CAP_SYS_ADMIN",
    "CAP_SYS_CHROOT", "CAP_SYS_NICE", "CAP_SYS_RESOURCE"
}
```

### Security Considerations

1. **Process isolation**: Prevent host system contamination
2. **Capability dropping**: Limit process privileges
3. **Resource limits**: Prevent resource exhaustion
4. **Input validation**: Validate all external inputs
5. **Output sanitization**: Ensure safe output generation
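When bubblewrap runs privileged, the capability whitelist above can also be enforced on the bwrap command line with `--cap-drop`/`--cap-add`. A minimal sketch, assuming the `DEFAULT_CAPABILITIES` set defined above; the wrapped command is illustrative.

```python
# Minimal sketch: translate the capability whitelist above into bubblewrap
# arguments. "--cap-drop ALL" and "--cap-add CAP_*" are standard bwrap
# options (effective only when bwrap runs with privileges).
def bwrap_capability_args(allowed_caps):
    args = ["--cap-drop", "ALL"]
    for cap in sorted(allowed_caps):
        args.extend(["--cap-add", cap])
    return args


cmd = (["bwrap"]
       + bwrap_capability_args(DEFAULT_CAPABILITIES)
       + ["--dev-bind", "/", "/", "--chdir", "/", "/usr/bin/id"])
```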
## Integration Points

### CLI Interface

#### **Main Entry Point** (`main_cli.py`)

```python
def osbuild_cli():
    """Main CLI entry point"""
    args = parse_arguments(sys.argv[1:])

    # Load manifest
    manifest = parse_manifest(args.manifest_path)

    # Validate manifest
    result = validate_manifest(manifest)
    if not result:
        show_validation(result, args.manifest_path)
        return 1

    # Execute build
    store = ObjectStore(args.cache)
    result = build_manifest(manifest, store)

    # Export results
    if args.export:
        for export_id in args.export:
            export(export_id, args.output_directory, store, manifest)

    return 0
```

#### **Command Line Options**

```bash
osbuild [OPTIONS] MANIFEST

Options:
  --cache DIR              Cache directory (default: .osbuild)
  --libdir DIR             Library directory (default: /usr/lib/osbuild)
  --cache-max-size SIZE    Maximum cache size
  --checkpoint ID          Stage to checkpoint
  --export ID              Object to export
  --output-directory DIR   Output directory
  --monitor NAME           Monitor to use
  --stage-timeout SECONDS  Stage timeout
```

### API Interface

#### **Python API** (`api.py`)

```python
def build_manifest(manifest, store, libdir=None):
    """Build manifest using object store"""
    # Load stages and assemblers
    # Execute pipeline
    # Return build result
```

#### **REST API** (Future)

```python
@app.route('/api/v1/build', methods=['POST'])
def build_manifest_api():
    """REST API for manifest building"""
    manifest = request.json
    result = build_manifest(manifest, store)
    return jsonify(result)
```

### External Tool Integration

#### **Container Integration**

```bash
# Docker
docker run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json

# Podman
podman run --rm -v $(pwd):/workspace osbuild/osbuild manifest.json
```

#### **CI/CD Integration**

```yaml
# GitHub Actions example
- name: Build OS Image
  run: |
    osbuild \
      --cache .osbuild \
      --output-directory outputs \
      manifest.json
```

#### **Monitoring Integration**

```python
class Monitor:
    def __init__(self, name):
        self.name = name

    def stage_started(self, stage):
        """Called when stage starts"""
        pass

    def stage_completed(self, stage, result):
        """Called when stage completes"""
        pass

    def stage_failed(self, stage, error):
        """Called when stage fails"""
        pass
```

## Advanced Features

### Multi-Architecture Support

osbuild supports multiple architectures through stage variants:

```json
{
  "stages": [
    {
      "name": "org.osbuild.debian.debootstrap",
      "options": {
        "suite": "bookworm",
        "arch": "arm64"
      }
    }
  ]
}
```

### Parallel Execution

Stages can execute in parallel when dependencies allow:

```python
def execute_parallel(stages, context):
    """Execute independent stages in parallel"""
    import concurrent.futures

    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {
            executor.submit(execute_stage, stage, context): stage
            for stage in stages
        }

        for future in concurrent.futures.as_completed(futures):
            stage = futures[future]
            try:
                result = future.result()
                context.store_object(stage.id, result)
            except Exception as e:
                context.mark_failed(stage.id, str(e))
```

### Checkpoint and Resume

```python
def checkpoint_stage(stage, context):
    """Checkpoint stage execution"""
    if stage.checkpoint:
        # Save stage state
        checkpoint_path = os.path.join(context.store.path, f"{stage.id}.checkpoint")
        stage.save_checkpoint(checkpoint_path)

        # Store checkpoint metadata
        context.store.store_object(f"{stage.id}.checkpoint", {
            "type": "checkpoint",
            "stage_id": stage.id,
            "timestamp": time.time()
        })
```
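The resume side of checkpointing is essentially "skip a stage whose output is already in the store". A hedged sketch, reusing the simplified `execute_stage` and object-store interfaces from the earlier sections; the wiring is illustrative, not osbuild's actual implementation.

```python
# Hedged sketch of resuming from cached results: reuse a stored object if
# present, otherwise run the stage and store its result.
def execute_or_reuse(stage, context):
    cached = context.store.get_object(stage.id)
    if cached is not None:
        # A previous run already produced this object; reuse it
        return cached
    result = execute_stage(stage, context)
    context.store.store_object(stage.id, result)
    return result
```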
### Remote Execution

```python
class RemoteExecutor:
    def __init__(self, host, user=None, key_file=None):
        self.host = host
        self.user = user
        self.key_file = key_file

    def execute_stage(self, stage, context):
        """Execute stage on remote host"""
        # Copy stage to remote host
        # Execute remotely
        # Retrieve results
        pass
```

## Performance Characteristics

### Build Time Optimization

1. **Parallel execution**: Independent stages run concurrently
2. **Object caching**: Reuse unchanged stage outputs
3. **Incremental builds**: Skip stages with unchanged inputs
4. **Resource allocation**: Optimize memory and CPU usage

### Resource Usage

```python
def optimize_resources(stages, available_memory, available_cpus):
    """Optimize resource allocation for stages"""
    # Calculate stage resource requirements
    # Allocate resources optimally
    # Prevent resource contention
```

### Benchmarking

```python
import statistics
import time


def benchmark_build(manifest, store, iterations=5):
    """Benchmark build performance"""
    times = []

    for _ in range(iterations):
        start_time = time.time()
        result = build_manifest(manifest, store)
        end_time = time.time()
        times.append(end_time - start_time)

    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "std": statistics.stdev(times),
        "min": min(times),
        "max": max(times)
    }
```

## Conclusion

osbuild represents a sophisticated, production-ready build system for operating system artifacts. Its architecture emphasizes:

1. **Reproducibility**: Consistent results through declarative manifests
2. **Extensibility**: Pluggable stages and assemblers
3. **Performance**: Optimized execution and caching
4. **Security**: Process isolation and capability management
5. **Integration**: Easy integration with existing toolchains

### Key Strengths

- **Structured approach**: Clear separation of concerns
- **Extensible architecture**: Easy to add new stages and assemblers
- **Performance optimization**: Efficient caching and parallel execution
- **Security focus**: Built-in isolation and capability management
- **Distribution support**: Works across multiple Linux distributions

### Areas for Enhancement

- **Bootloader integration**: Limited built-in bootloader support
- **Package management**: Focus on RPM-based systems
- **Image formats**: Limited output format support
- **Validation**: Basic manifest validation capabilities

### Use Cases

1. **Distribution building**: Creating official distribution images
2. **Custom images**: Building specialized OS images
3. **CI/CD pipelines**: Automated image building
4. **Development**: Testing and development environments
5. **Production deployment**: Creating production-ready images

## Complete Workflow Examples

### Example 1: Basic Debian System Image

#### **Manifest Definition**

```json
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "mirror": "https://deb.debian.org/debian",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["sudo", "openssh-server", "systemd-sysv"]
          }
        },
        {
          "name": "org.osbuild.users",
          "options": {
            "users": {
              "debian": {
                "password": "$6$rounds=656000$...",
                "shell": "/bin/bash",
                "groups": ["sudo"]
              }
            }
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.tar",
    "options": {
      "filename": "debian-basic.tar.gz",
      "compression": "gzip"
    }
  }
}
```
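Assuming the manifest above is saved as `debian-basic.json`, a build could be driven with the CLI options listed under Integration Points; the exported object name is an assumption for this example.

```bash
# Illustrative invocation using the CLI options documented earlier;
# the manifest filename and exported object name are assumptions.
osbuild \
  --cache .osbuild \
  --export assembler \
  --output-directory output \
  debian-basic.json
```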
#### **Complete Execution Flow**

1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and sources
4. **Stage Execution**:
   - `debootstrap`: Create base Debian filesystem
   - `apt`: Install packages and dependencies
   - `users`: Create user accounts and groups
5. **Assembly**: Create compressed tar archive
6. **Output**: Generate `debian-basic.tar.gz`

### Example 2: Bootable QEMU Disk Image

#### **Manifest Definition**

```json
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["grub2-efi-amd64", "efibootmgr", "linux-image-amd64"]
          }
        },
        {
          "name": "org.osbuild.grub2",
          "options": {
            "root_fs_uuid": "6e4ff95f-f662-45ee-a82a-bdf44a2d0b75",
            "uefi": {
              "vendor": "debian",
              "unified": true
            }
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.qemu",
    "options": {
      "format": "qcow2",
      "filename": "debian-bootable.qcow2",
      "size": "10G",
      "ptuuid": "12345678-1234-1234-1234-123456789012",
      "partitions": [
        {
          "name": "esp",
          "start": 1048576,
          "size": 268435456,
          "type": "fat32",
          "mountpoint": "/boot/efi"
        },
        {
          "name": "root",
          "start": 269484032,
          "size": 10485760000,
          "type": "ext4",
          "mountpoint": "/"
        }
      ]
    }
  }
}
```

#### **Complete Execution Flow**

1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and GRUB components
4. **Stage Execution**:
   - `debootstrap`: Create base Debian filesystem
   - `apt`: Install GRUB and kernel packages
   - `grub2`: Configure GRUB bootloader
5. **Assembly**: Create QCOW2 disk image with partitions
6. **Output**: Generate `debian-bootable.qcow2`

### Example 3: OSTree-Based System

#### **Manifest Definition**

```json
{
  "version": "2",
  "pipelines": [
    {
      "name": "build",
      "runner": "org.osbuild.linux",
      "stages": [
        {
          "name": "org.osbuild.debian.debootstrap",
          "options": {
            "suite": "bookworm",
            "variant": "minbase"
          }
        },
        {
          "name": "org.osbuild.apt",
          "options": {
            "packages": ["ostree", "systemd", "systemd-sysv"]
          }
        },
        {
          "name": "org.osbuild.ostree",
          "options": {
            "repository": "/var/lib/ostree/repo",
            "branch": "debian/bookworm/x86_64/standard"
          }
        }
      ]
    }
  ],
  "assembler": {
    "name": "org.osbuild.ostree.commit",
    "options": {
      "repository": "debian-ostree",
      "branch": "debian/bookworm/x86_64/standard"
    }
  }
}
```

#### **Complete Execution Flow**

1. **Manifest Loading**: Parse JSON manifest and validate schema
2. **Pipeline Construction**: Build dependency graph for 3 stages
3. **Source Resolution**: Download Debian packages and OSTree
4. **Stage Execution**:
   - `debootstrap`: Create base Debian filesystem
   - `apt`: Install OSTree and systemd packages
   - `ostree`: Configure OSTree repository
5. **Assembly**: Create OSTree commit
6. **Output**: Generate OSTree repository with commit

## Conclusion

osbuild provides a solid foundation for building operating system images with a focus on reproducibility, performance, and extensibility. Its stage-based architecture makes it easy to customize and extend while maintaining consistency and reliability.
### Key Strengths

- **Structured approach**: Clear separation of concerns with stages and assemblers
- **Extensible architecture**: Easy to add new stages and assemblers
- **Performance optimization**: Efficient caching and parallel execution
- **Security focus**: Built-in isolation and capability management
- **Distribution support**: Works across multiple Linux distributions
- **Declarative manifests**: JSON-based configuration with schema validation
- **Process isolation**: Bubblewrap and systemd-nspawn integration
- **Object caching**: Intelligent caching of stage outputs

### Areas for Enhancement

- **Bootloader integration**: Limited built-in bootloader support
- **Package management**: Focus on RPM-based systems
- **Image formats**: Limited output format support
- **Validation**: Basic manifest validation capabilities
- **Template support**: No built-in templating system
- **Cross-architecture**: Limited architecture support

### Complete Process Summary

osbuild implements a **complete end-to-end image building pipeline** that:

1. **Processes Manifests**: JSON with schema validation
2. **Manages Stages**: Atomic, composable building blocks
3. **Executes Builds**: Isolated execution with security controls
4. **Handles Objects**: Intelligent caching and storage
5. **Manages Devices**: Loop devices and partition management
6. **Provides Assembly**: Multiple output format support
7. **Ensures Security**: Process isolation and capability dropping
8. **Generates Artifacts**: Images, archives, and repositories

The system's architecture emphasizes **reproducibility**, **security**, and **extensibility** while maintaining **performance** through intelligent caching and isolated execution environments.