🔍 rpm-ostree Error Handling Analysis
📋 Overview
This document analyzes the error handling patterns in rpm-ostree, examining how errors are managed across the CLI client (rpm-ostree) and the system daemon (rpm-ostreed). Understanding these patterns is crucial for implementing robust error handling in apt-ostree.
🏗️ Error Handling Architecture Overview
Component Error Handling Distribution
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ CLI Client │ │ Error Layer │ │ System Daemon │
│ (rpm-ostree) │◄──►│ (GError/DBus) │◄──►│ (rpm-ostreed) │
│ │ │ │ │ │
│ • User-facing │ │ • Error Types │ │ • System-level │
│ • Command-line │ │ • Error Codes │ │ • Transaction │
│ • Progress │ │ • Error Domain │ │ • OSTree Ops │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Error Handling Principles
- Separation of Concerns: CLI handles user-facing errors, daemon handles system errors
- Error Propagation: Errors flow from daemon to CLI via DBus
- Transaction Safety: Failed operations trigger automatic rollback
- User Experience: Clear error messages with recovery suggestions
- Logging Integration: Comprehensive error logging for debugging
🔍 Detailed Error Handling Analysis
1. Daemon Error Types (rpmostreed-errors.h)
Core Error Definitions
typedef enum
{
RPM_OSTREED_ERROR_FAILED, // Generic operation failure
RPM_OSTREED_ERROR_INVALID_SYSROOT, // Invalid system root path
RPM_OSTREED_ERROR_NOT_AUTHORIZED, // PolicyKit authorization failure
RPM_OSTREED_ERROR_UPDATE_IN_PROGRESS, // Concurrent update prevention
RPM_OSTREED_ERROR_INVALID_REFSPEC, // Invalid OSTree reference
RPM_OSTREED_ERROR_NUM_ENTRIES, // Enum size marker
} RpmOstreedError;
Error Domain Registration
// From rpmostreed-errors.cxx
static const GDBusErrorEntry dbus_error_entries[] = {
{ RPM_OSTREED_ERROR_FAILED, "org.projectatomic.rpmostreed.Error.Failed" },
{ RPM_OSTREED_ERROR_INVALID_SYSROOT, "org.projectatomic.rpmostreed.Error.InvalidSysroot" },
{ RPM_OSTREED_ERROR_NOT_AUTHORIZED, "org.projectatomic.rpmostreed.Error.NotAuthorized" },
{ RPM_OSTREED_ERROR_UPDATE_IN_PROGRESS, "org.projectatomic.rpmostreed.Error.UpdateInProgress" },
{ RPM_OSTREED_ERROR_INVALID_REFSPEC, "org.projectatomic.rpmostreed.Error.InvalidRefspec" },
};
GQuark rpmostreed_error_quark (void) {
static gsize quark = 0;
g_dbus_error_register_error_domain ("rpmostreed-error-quark", &quark,
dbus_error_entries, G_N_ELEMENTS (dbus_error_entries));
return (GQuark)quark;
}
Key Characteristics:
- DBus Integration: Errors are registered as DBus error domains
- Standardized Codes: Predefined error codes for common failure scenarios
- Internationalization: Error messages can be localized
- Error Quarks: Unique identifiers for error domains
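The code-to-name table above can also be read in the other direction: a client that receives a D-Bus error name can recover the typed code from it. The following is a minimal Rust sketch of that round trip. The `DaemonError` enum and its methods are hypothetical names chosen for illustration; only the `org.projectatomic.rpmostreed.Error.*` strings come from the table above.

```rust
/// Hypothetical Rust mirror of `RpmOstreedError` (illustration only,
/// not part of rpm-ostree).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DaemonError {
    Failed,
    InvalidSysroot,
    NotAuthorized,
    UpdateInProgress,
    InvalidRefspec,
}

/// Common prefix of the D-Bus error names registered by the daemon.
const PREFIX: &str = "org.projectatomic.rpmostreed.Error.";

impl DaemonError {
    /// Every variant, used for reverse lookup.
    const ALL: [DaemonError; 5] = [
        DaemonError::Failed,
        DaemonError::InvalidSysroot,
        DaemonError::NotAuthorized,
        DaemonError::UpdateInProgress,
        DaemonError::InvalidRefspec,
    ];

    /// Suffix appended to `PREFIX` to form the D-Bus error name.
    fn suffix(self) -> &'static str {
        match self {
            DaemonError::Failed => "Failed",
            DaemonError::InvalidSysroot => "InvalidSysroot",
            DaemonError::NotAuthorized => "NotAuthorized",
            DaemonError::UpdateInProgress => "UpdateInProgress",
            DaemonError::InvalidRefspec => "InvalidRefspec",
        }
    }

    /// Full D-Bus error name as it would appear on the wire.
    pub fn dbus_name(self) -> String {
        format!("{}{}", PREFIX, self.suffix())
    }

    /// Recover the typed code from a received D-Bus error name.
    /// Returns `None` for names outside this error domain.
    pub fn from_dbus_name(name: &str) -> Option<Self> {
        let suffix = name.strip_prefix(PREFIX)?;
        Self::ALL.iter().copied().find(|e| e.suffix() == suffix)
    }
}
```

Keeping both directions in one place means a new error code only has to be added to the enum, `ALL`, and `suffix`.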
2. Transaction Error Handling
Transaction Lifecycle Error Management
// From rpmostreed-transaction.cxx
struct _RpmostreedTransactionPrivate {
GDBusMethodInvocation *invocation; // DBus method context
gboolean executed; // Transaction completion state
GCancellable *cancellable; // Cancellation support
// System state during transaction
char *sysroot_path; // Sysroot path
OstreeSysroot *sysroot; // OSTree sysroot
gboolean sysroot_locked; // Sysroot lock state
// Client tracking
char *client_description; // Client description
char *agent_id; // Client agent ID
char *sd_unit; // Systemd unit
// Progress tracking
gint64 last_progress_journal; // Progress journal timestamp
gboolean redirect_output; // Output redirection flag
// Peer connections
GDBusServer *server; // DBus server
GHashTable *peer_connections; // Client connections
// Completion state
GVariant *finished_params; // Completion parameters
guint watch_id; // Watch identifier
};
Error Recovery Mechanisms
// Transaction rollback on failure
static void
unlock_sysroot (RpmostreedTransaction *self)
{
RpmostreedTransactionPrivate *priv = rpmostreed_transaction_get_private (self);
if (!(priv->sysroot && priv->sysroot_locked))
return;
ostree_sysroot_unlock (priv->sysroot);
sd_journal_print (LOG_INFO, "Unlocked sysroot");
priv->sysroot_locked = FALSE;
}
// Transaction cleanup
static void
transaction_maybe_emit_closed (RpmostreedTransaction *self)
{
RpmostreedTransactionPrivate *priv = rpmostreed_transaction_get_private (self);
if (rpmostreed_transaction_get_active (self))
return;
if (g_hash_table_size (priv->peer_connections) > 0)
return;
g_signal_emit (self, signals[CLOSED], 0);
rpmostreed_sysroot_finish_txn (rpmostreed_sysroot_get (), self);
}
Key Characteristics:
- Automatic Rollback: Failed transactions automatically unlock sysroot
- Resource Cleanup: Proper cleanup of system resources on failure
- Signal Emission: Error signals sent to all connected clients
- Journal Integration: Errors logged to systemd journal
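In Rust, the "unlock on every exit path" behaviour listed above is often expressed with a drop guard rather than an explicit unlock call. The sketch below shows that pattern under simplified assumptions: `Sysroot`, `SysrootGuard`, and `with_locked_sysroot` are hypothetical stand-ins, and the lock is just a boolean flag rather than a real OSTree sysroot lock.

```rust
use std::cell::Cell;

/// Stand-in for the sysroot lock state (hypothetical; the C++ daemon
/// tracks this in `priv->sysroot_locked`).
pub struct Sysroot {
    locked: Cell<bool>,
}

impl Sysroot {
    pub fn new() -> Self {
        Sysroot { locked: Cell::new(false) }
    }

    pub fn is_locked(&self) -> bool {
        self.locked.get()
    }

    /// Take the lock and return a guard that releases it when dropped.
    pub fn lock(&self) -> SysrootGuard<'_> {
        self.locked.set(true);
        SysrootGuard { sysroot: self }
    }
}

/// Releases the sysroot lock when it goes out of scope.
pub struct SysrootGuard<'a> {
    sysroot: &'a Sysroot,
}

impl Drop for SysrootGuard<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        self.sysroot.locked.set(false);
    }
}

/// Run `work` with the sysroot locked; the lock is released whether
/// `work` succeeds or fails.
pub fn with_locked_sysroot<T, E>(
    sysroot: &Sysroot,
    work: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    let _guard = sysroot.lock();
    work()
}
```

The guard only covers lock release. Undoing partially applied changes is a separate concern that still needs explicit handling.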
3. CLI Client Error Handling
DBus Error Handling Patterns
// From rpmostree-clientlib.cxx
static void
on_owner_changed (GObject *object, GParamSpec *pspec, gpointer user_data)
{
auto tp = static_cast<TransactionProgress *> (user_data);
tp->error = g_dbus_error_new_for_dbus_error (
"org.projectatomic.rpmostreed.Error.Failed",
"Bus owner changed, aborting. This likely means the daemon crashed; "
"check logs with `journalctl -xe`."
);
transaction_progress_end (tp);
}
// Transaction connection error handling
static RPMOSTreeTransaction *
transaction_connect (const char *transaction_address, GCancellable *cancellable, GError **error)
{
GLNX_AUTO_PREFIX_ERROR ("Failed to connect to client transaction", error);
g_autoptr (GDBusConnection) peer_connection = g_dbus_connection_new_for_address_sync (
transaction_address,
G_DBUS_CONNECTION_FLAGS_AUTHENTICATION_CLIENT,
NULL, cancellable, error
);
if (peer_connection == NULL)
return NULL;
return rpmostree_transaction_proxy_new_sync (
peer_connection, G_DBUS_PROXY_FLAGS_NONE, NULL, "/",
cancellable, error
);
}
User-Facing Error Display
// Progress and error display
static void
transaction_progress_signal_handler (GDBusConnection *connection, const char *sender_name,
const char *object_path, const char *interface_name,
const char *signal_name, GVariant *parameters,
gpointer user_data)
{
auto tp = static_cast<TransactionProgress *> (user_data);
if (g_strcmp0 (signal_name, "Message") == 0) {
const char *message;
g_variant_get (parameters, "(&s)", &message);
if (!tp->progress) {
tp->progress = TRUE;
rpmostreecxx::console_progress_begin_task (message);
} else {
rpmostreecxx::console_progress_set_message (message);
}
} else if (g_strcmp0 (signal_name, "PercentProgress") == 0) {
guint percentage;
const char *message;
g_variant_get (parameters, "(u&s)", &percentage, &message);
if (!tp->progress) {
tp->progress = TRUE;
rpmostreecxx::console_progress_begin_percent (message);
}
rpmostreecxx::console_progress_update (percentage);
}
}
Key Characteristics:
- Error Context: Errors include helpful context and recovery suggestions
- Progress Integration: Error messages integrated with progress display
- User Guidance: Clear instructions for troubleshooting (e.g., journalctl -xe)
- Graceful Degradation: Client continues operation when possible
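The two client-side habits shown above, prefixing an error with what was being attempted (as `GLNX_AUTO_PREFIX_ERROR` does) and attaching a troubleshooting hint, can be sketched in Rust with a small error type. `ClientError` and its methods are hypothetical names for illustration, not an rpm-ostree or apt-ostree API.

```rust
use std::fmt;

/// Error carrying a context-prefixed message plus an optional hint
/// for the user (hypothetical type, illustration only).
#[derive(Debug)]
pub struct ClientError {
    message: String,
    hint: Option<String>,
}

impl ClientError {
    pub fn new(message: impl Into<String>) -> Self {
        ClientError { message: message.into(), hint: None }
    }

    /// Prepend a description of the operation that failed.
    pub fn prefix(mut self, context: &str) -> Self {
        self.message = format!("{}: {}", context, self.message);
        self
    }

    /// Attach a recovery suggestion shown on a separate line.
    pub fn with_hint(mut self, hint: impl Into<String>) -> Self {
        self.hint = Some(hint.into());
        self
    }
}

impl fmt::Display for ClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "error: {}", self.message)?;
        if let Some(hint) = &self.hint {
            write!(f, "\nhint: {}", hint)?;
        }
        Ok(())
    }
}
```

Keeping the hint separate from the message lets callers add context prefixes at several layers without duplicating the suggestion.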
4. Rust Error Handling Integration
Error Type Definitions
// From rust/src/lib.rs
/// APIs defined here are automatically bridged between Rust and C++ using https://cxx.rs/
///
/// # Error handling
///
/// For fallible APIs that return a `Result<T>`:
///
/// - Use `Result<T>` inside `lib.rs` below
/// - On the Rust *implementation* side, use `CxxResult<T>` which does error
/// formatting in a more preferred way
/// - On the C++ side, use our custom `CXX_TRY` API which converts the C++ exception
/// into a GError. In the future, we might try a hard switch to C++ exceptions
/// instead, but at the moment having two is problematic, so we prefer `GError`.
System Host Type Validation
// From rust/src/client.rs
/// Return an error if the current system host type does not match expected.
pub(crate) fn require_system_host_type(expected: SystemHostType) -> CxxResult<()> {
let current = get_system_host_type()?;
if current != expected {
let expected = system_host_type_str(&expected);
let current = system_host_type_str(&current);
return Err(format!(
"This command requires an {expected} system; found: {current}"
).into());
}
Ok(())
}
/// Classify the running system.
#[derive(Clone, Debug)]
pub(crate) enum SystemHostType {
OstreeContainer,
OstreeHost,
Unknown,
}
Key Characteristics:
- Hybrid Approach: Rust Result<T> bridged to C++ GError
- Type Safety: Rust enums for error classification
- Context Preservation: Error messages include system context
- Bridging: CxxResult<T> for Rust-C++ boundary
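The doc comment quoted above does not spell out what the preferred error formatting is. One common choice when an error has to cross a language boundary as a single string is to flatten the whole cause chain into one line. The sketch below shows that technique using only the standard library; `Context` and `flatten_chain` are hypothetical names, and this is an assumption about the general approach rather than a description of `CxxResult<T>` itself.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper that records what was being attempted
/// and keeps the underlying cause.
#[derive(Debug)]
pub struct Context {
    message: String,
    cause: Box<dyn Error + 'static>,
}

impl Context {
    pub fn new(message: impl Into<String>, cause: impl Error + 'static) -> Self {
        Context { message: message.into(), cause: Box::new(cause) }
    }
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only the outer message; causes are reached through `source()`.
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.cause)
    }
}

/// Join an error and all of its causes into one line, outermost first.
pub fn flatten_chain(err: &(dyn Error + 'static)) -> String {
    let mut parts = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        parts.push(cause.to_string());
        current = cause.source();
    }
    parts.join(": ")
}
```

A flattened string like this fits naturally into a single GError message on the C++ side, at the cost of losing the structured chain.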
🔄 Error Flow Patterns
1. Error Propagation Flow
System Error → Daemon → DBus Error → CLI Client → User Display
Detailed Flow:
- System Operation Fails (e.g., OSTree operation, file permission)
- Daemon Catches Error and creates appropriate error code
- DBus Error Sent to connected clients with error details
- CLI Client Receives Error and formats for user display
- User Sees Error with context and recovery suggestions
2. Transaction Error Handling Flow
Transaction Start → Operation Execution → Error Detection → Rollback → Error Reporting
Detailed Flow:
- Transaction Begins with sysroot locking
- Operations Execute in sequence
- Error Detected during any operation
- Automatic Rollback of completed operations
- Sysroot Unlocked and resources cleaned up
- Error Reported to all connected clients
- Transaction Terminated with error state
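The flow above can be sketched as a loop that applies steps in order and, on the first failure, undoes the already completed steps in reverse. This is a synchronous, standard-library-only illustration; `Step` and `run_transaction` are hypothetical names, and real operations would be asynchronous and could themselves fail during undo.

```rust
/// One reversible step of a transaction (hypothetical shape).
pub struct Step<'a> {
    pub name: &'static str,
    /// Performs the step; an `Err` aborts the transaction.
    pub apply: Box<dyn FnMut() -> Result<(), String> + 'a>,
    /// Reverts the step; called only if the step completed earlier.
    pub undo: Box<dyn FnMut() + 'a>,
}

/// Apply steps in order. On the first failure, undo the completed
/// steps in reverse order and return the original error.
pub fn run_transaction(steps: &mut [Step<'_>]) -> Result<(), String> {
    for index in 0..steps.len() {
        if let Err(error) = (steps[index].apply)() {
            // Steps before `index` succeeded; the failing step did not
            // complete, so it is not undone.
            for done in steps[..index].iter_mut().rev() {
                (done.undo)();
            }
            return Err(format!("{} failed: {}", steps[index].name, error));
        }
    }
    Ok(())
}
```

Undoing in reverse order matters when later steps depend on earlier ones, for example a staged deployment that depends on a completed download.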
3. Client Error Recovery Flow
Error Received → Context Analysis → Recovery Attempt → Fallback → User Notification
Detailed Flow:
- Error Received from daemon via DBus
- Context Analyzed (error type, system state)
- Recovery Attempted (retry, alternative approach)
- Fallback Executed if recovery fails
- User Notified of error and recovery status
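As a rough illustration of the decision step in this flow, the client can map an error class and the remaining retry budget to the next action. The `ErrorClass` and `Recovery` enums and the `next_step` function below are hypothetical, and the policy they encode (retry transient errors, fall back when the daemon is unavailable, otherwise notify) is one possible choice rather than rpm-ostree's documented behaviour.

```rust
/// Coarse classification of an error received from the daemon
/// (hypothetical categories).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorClass {
    Transient,
    DaemonUnavailable,
    NotAuthorized,
    Fatal,
}

/// What the client does next.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Recovery {
    Retry { attempts_left: u32 },
    Fallback,
    NotifyUser,
}

/// Decide the next recovery step from the error class and the
/// remaining retry budget.
pub fn next_step(class: ErrorClass, attempts_left: u32) -> Recovery {
    match class {
        // Transient failures are retried while budget remains.
        ErrorClass::Transient if attempts_left > 0 => Recovery::Retry {
            attempts_left: attempts_left - 1,
        },
        // Out of retries, or no daemon to talk to: degrade gracefully.
        ErrorClass::Transient | ErrorClass::DaemonUnavailable => Recovery::Fallback,
        // Retrying cannot help here; tell the user.
        ErrorClass::NotAuthorized | ErrorClass::Fatal => Recovery::NotifyUser,
    }
}
```

Making the policy a pure function keeps it easy to test separately from the D-Bus plumbing.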
📊 Error Handling Responsibility Matrix
| Error Type | CLI Client | Daemon | Notes |
|---|---|---|---|
| Command Parsing | ✅ Primary | ❌ None | CLI validates user input |
| DBus Communication | ✅ Client | ✅ Server | Both handle connection errors |
| OSTree Operations | ❌ None | ✅ Primary | Daemon handles all OSTree errors |
| Package Management | ❌ None | ✅ Primary | Daemon handles APT/RPM errors |
| Transaction Errors | ✅ Display | ✅ Management | Daemon manages, CLI displays |
| System Errors | ❌ None | ✅ Primary | Daemon handles system-level errors |
| User Input Errors | ✅ Primary | ❌ None | CLI validates before sending |
| Recovery Actions | ✅ Primary | ✅ Support | CLI guides user, daemon executes |
🚀 apt-ostree Error Handling Implementation Strategy
1. Error Type Definitions
Core Error Types
// daemon/src/errors.rs
use thiserror::Error;
#[derive(Error, Debug)]
pub enum AptOstreeError {
#[error("Operation failed: {message}")]
OperationFailed { message: String },
#[error("Invalid sysroot: {path}")]
InvalidSysroot { path: String },
#[error("Not authorized: {operation}")]
NotAuthorized { operation: String },
#[error("Update in progress")]
UpdateInProgress,
#[error("Invalid package reference: {refspec}")]
InvalidPackageRef { refspec: String },
#[error("Transaction failed: {reason}")]
TransactionFailed { reason: String },
#[error("OSTree error: {source}")]
OstreeError { #[from] source: ostree::Error },
#[error("APT error: {source}")]
AptError { #[from] source: apt_pkg_native::Error },
#[error("System error: {source}")]
SystemError { #[from] source: std::io::Error },
}
impl AptOstreeError {
pub fn dbus_error_code(&self) -> &'static str {
match self {
Self::OperationFailed { .. } => "org.projectatomic.aptostree.Error.Failed",
Self::InvalidSysroot { .. } => "org.projectatomic.aptostree.Error.InvalidSysroot",
Self::NotAuthorized { .. } => "org.projectatomic.aptostree.Error.NotAuthorized",
Self::UpdateInProgress => "org.projectatomic.aptostree.Error.UpdateInProgress",
Self::InvalidPackageRef { .. } => "org.projectatomic.aptostree.Error.InvalidPackageRef",
Self::TransactionFailed { .. } => "org.projectatomic.aptostree.Error.TransactionFailed",
_ => "org.projectatomic.aptostree.Error.Unknown",
}
}
}
DBus Error Integration
// daemon/src/dbus_errors.rs
use zbus::fdo;
pub fn convert_to_dbus_error(error: &AptOstreeError) -> fdo::Error {
match error {
AptOstreeError::NotAuthorized { operation } => {
// zbus exposes org.freedesktop.DBus.Error.AccessDenied as fdo::Error::AccessDenied
fdo::Error::AccessDenied(format!("Not authorized for: {}", operation))
}
AptOstreeError::UpdateInProgress => {
fdo::Error::Failed("Update operation already in progress".into())
}
AptOstreeError::TransactionFailed { reason } => {
fdo::Error::Failed(format!("Transaction failed: {}", reason))
}
_ => {
fdo::Error::Failed(error.to_string())
}
}
}
2. Transaction Error Management
Transaction Error Handling
// daemon/src/transaction.rs
impl Transaction {
pub async fn execute(&mut self, daemon: &AptOstreeDaemon) -> Result<(), AptOstreeError> {
self.state = TransactionState::InProgress;
// Lock sysroot
self.sysroot_locked = true;
// Execute operations with error handling
for operation in &self.operations {
match self.execute_operation(operation, daemon).await {
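Ok(()) => {
// Operation successful: record it so rollback() can undo it later
// (assumes the operation type implements Clone)
self.completed_operations.push(operation.clone());
self.emit_progress(operation, 100, "Completed").await;
}
Err(error) => {
// Operation failed: roll back, but report the original error
// even if the rollback itself fails
if let Err(rollback_error) = self.rollback().await {
tracing::error!("Rollback failed: {}", rollback_error);
}
return Err(error);
}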
}
}
self.state = TransactionState::Committed;
self.sysroot_locked = false;
Ok(())
}
async fn rollback(&mut self) -> Result<(), AptOstreeError> {
// Rollback completed operations
for operation in self.completed_operations.iter().rev() {
self.rollback_operation(operation).await?;
}
// Unlock sysroot
if self.sysroot_locked {
self.unlock_sysroot().await?;
self.sysroot_locked = false;
}
self.state = TransactionState::RolledBack;
Ok(())
}
}
3. Client Error Handling
CLI Error Display
// src/client.rs
impl AptOstreeClient {
pub async fn handle_dbus_error(&self, error: &fdo::Error) -> String {
match error {
fdo::Error::AccessDenied(message) => {
format!("❌ Permission denied: {}. Try running with sudo.", message)
}
fdo::Error::Failed(message) => {
format!("❌ Operation failed: {}. Check daemon logs for details.", message)
}
fdo::Error::InvalidArgs(message) => {
format!("❌ Invalid arguments: {}. Check command syntax.", message)
}
_ => {
format!("❌ Unexpected error: {}. Please report this issue.", error)
}
}
}
pub async fn install_packages(&self, transaction_id: &str, packages: Vec<String>) -> Result<bool, Error> {
match self.daemon.install_packages(transaction_id, packages).await {
Ok(success) => Ok(success),
Err(error) => {
let user_message = self.handle_dbus_error(&error).await;
eprintln!("{}", user_message);
// Provide recovery suggestions
eprintln!("💡 Recovery suggestions:");
eprintln!(" • Check if daemon is running: systemctl status apt-ostreed");
eprintln!(" • Check daemon logs: journalctl -u apt-ostreed -f");
eprintln!(" • Verify package names: apt search <package>");
Err(error)
}
}
}
}
4. Error Recovery Strategies
Automatic Recovery
// daemon/src/recovery.rs
use std::future::Future;
use std::time::Duration;

pub struct ErrorRecovery {
    max_retries: u32,
    retry_delay: Duration,
}

impl ErrorRecovery {
    /// Run `operation` until it succeeds or `max_retries` attempts have failed.
    /// The operation always runs at least once, even if `max_retries` is 0.
    pub async fn retry_operation<F, Fut, T, E>(
        &self,
        operation: F,
        operation_name: &str,
    ) -> Result<T, E>
    where
        // A closure cannot return the `Future` trait itself, so the
        // concrete future type is a separate generic parameter.
        F: Fn() -> Fut,
        Fut: Future<Output = Result<T, E>>,
        E: std::error::Error + Send + Sync + 'static,
    {
        let mut attempts = 0;
        loop {
            match operation().await {
                Ok(result) => return Ok(result),
                Err(error) => {
                    attempts += 1;
                    // Retry budget spent: return the last error instead of panicking
                    if attempts >= self.max_retries {
                        return Err(error);
                    }
                    tracing::warn!(
                        "{} failed (attempt {}/{}), retrying in {:?}...",
                        operation_name, attempts, self.max_retries, self.retry_delay
                    );
                    tokio::time::sleep(self.retry_delay).await;
                }
            }
        }
    }
}
Fallback Operations
// src/fallback.rs
impl AptOstreeClient {
pub async fn install_packages_with_fallback(&self, packages: &[String]) -> Result<(), Error> {
// Try daemon first
if let Ok(client) = AptOstreeClient::new().await {
match client.install_packages(packages).await {
Ok(()) => return Ok(()),
Err(error) => {
tracing::warn!("Daemon installation failed: {}", error);
// Fall through to fallback
}
}
}
// Fallback to direct operations (limited functionality)
tracing::info!("Using fallback installation mode");
self.install_packages_direct(packages).await
}
}
🎯 Key Implementation Principles
1. Error Classification
- User Errors: Invalid input, permission issues
- System Errors: OSTree failures, file system issues
- Network Errors: Package download failures
- Transaction Errors: Rollback failures, state corruption
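A first pass at this classification can be derived from the kind of the underlying error. The sketch below maps `std::io::ErrorKind` values onto the categories listed above; the `ErrorCategory` enum and the specific mapping are assumptions made for illustration, and transaction errors would need to be tagged by the transaction layer because they cannot be inferred from an I/O error alone.

```rust
use std::io::{self, ErrorKind};

/// The four categories from the list above (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ErrorCategory {
    User,
    System,
    Network,
    /// Assigned by the transaction layer, never by `classify_io_error`.
    Transaction,
}

/// Best-effort classification of an I/O error.
pub fn classify_io_error(err: &io::Error) -> ErrorCategory {
    match err.kind() {
        // Problems the user can usually fix themselves.
        ErrorKind::PermissionDenied | ErrorKind::InvalidInput => ErrorCategory::User,
        // Connectivity problems, typically during package downloads.
        ErrorKind::ConnectionRefused
        | ErrorKind::ConnectionReset
        | ErrorKind::NotConnected
        | ErrorKind::TimedOut => ErrorCategory::Network,
        // Everything else is treated as a system-level failure.
        _ => ErrorCategory::System,
    }
}
```

The catch-all arm keeps the function total as new `ErrorKind` values are added to the standard library.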
2. Error Recovery Priority
- Automatic Recovery: Retry operations, rollback transactions
- Graceful Degradation: Fallback to limited functionality
- User Guidance: Clear error messages with recovery steps
- Logging: Comprehensive error logging for debugging
3. User Experience
- Clear Messages: Error messages explain what went wrong
- Recovery Steps: Provide specific actions to resolve issues
- Progress Integration: Errors integrated with progress display
- Context Preservation: Maintain context across error boundaries
4. System Reliability
- Transaction Safety: Failed operations don't leave system in bad state
- Resource Cleanup: Proper cleanup of locked resources
- Rollback Support: Automatic rollback of failed operations
- State Consistency: Maintain consistent system state
This error handling analysis provides the foundation for implementing robust error handling in apt-ostree that maintains the reliability and user experience standards established by rpm-ostree while adapting to the Debian/Ubuntu ecosystem.