APT-OSTree Monitoring and Logging

Overview

APT-OSTree includes a comprehensive monitoring and logging system that provides:

  • Structured Logging: JSON-formatted logs with timestamps and context
  • Metrics Collection: System, performance, and transaction metrics
  • Health Checks: Automated health monitoring of system components
  • Real-time Monitoring: Background service for continuous monitoring
  • Export Capabilities: Metrics export in JSON format

Architecture

Components

  1. Monitoring Manager (src/monitoring.rs)

    • Core monitoring functionality
    • Metrics collection and storage
    • Health check execution
    • Performance monitoring
  2. Monitoring Service (src/bin/monitoring-service.rs)

    • Background service for continuous monitoring
    • Automated metrics collection
    • Health check scheduling
    • Metrics export
  3. CLI Integration (src/main.rs)

    • Monitoring commands
    • Real-time status display
    • Metrics export

Data Flow

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   CLI Commands  │───▶│ Monitoring Manager│───▶│ Metrics Storage │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │ Monitoring Service│
                       └──────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │   Health Checks  │
                       └──────────────────┘

Configuration

Monitoring Configuration

pub struct MonitoringConfig {
    pub log_level: String,                    // "trace", "debug", "info", "warn", "error"
    pub log_file: Option<String>,             // Optional log file path
    pub structured_logging: bool,             // Enable JSON logging
    pub enable_metrics: bool,                 // Enable metrics collection
    pub metrics_interval: u64,                // Metrics collection interval (seconds)
    pub enable_health_checks: bool,           // Enable health checks
    pub health_check_interval: u64,           // Health check interval (seconds)
    pub enable_performance_monitoring: bool,  // Enable performance monitoring
    pub enable_transaction_monitoring: bool,  // Enable transaction monitoring
    pub enable_system_monitoring: bool,       // Enable system resource monitoring
}
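
As a minimal sketch of how these fields fit together, the snippet below mirrors the struct and fills it with a production-leaning set of values. The helper name `production_config` is hypothetical, and the values are only borrowed from the examples elsewhere in this document (60 and 300 second intervals, the `/var/log/apt-ostree/app.log` path); they are not confirmed defaults.

```rust
/// Mirror of the `MonitoringConfig` struct shown above, repeated here so
/// the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub struct MonitoringConfig {
    pub log_level: String,
    pub log_file: Option<String>,
    pub structured_logging: bool,
    pub enable_metrics: bool,
    pub metrics_interval: u64,
    pub enable_health_checks: bool,
    pub health_check_interval: u64,
    pub enable_performance_monitoring: bool,
    pub enable_transaction_monitoring: bool,
    pub enable_system_monitoring: bool,
}

/// Hypothetical helper (not part of the documented API): everything
/// enabled, JSON logging on, intervals taken from this document's examples.
pub fn production_config() -> MonitoringConfig {
    MonitoringConfig {
        log_level: "info".to_string(),
        log_file: Some("/var/log/apt-ostree/app.log".to_string()),
        structured_logging: true,
        enable_metrics: true,
        metrics_interval: 60,        // seconds
        enable_health_checks: true,
        health_check_interval: 300,  // seconds
        enable_performance_monitoring: true,
        enable_transaction_monitoring: true,
        enable_system_monitoring: true,
    }
}
```

A value built this way would then be passed to `MonitoringManager::new(config)`.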

Environment Variables

# Log level
export RUST_LOG=info

# Monitoring configuration
export APT_OSTREE_MONITORING_ENABLED=1
export APT_OSTREE_METRICS_INTERVAL=60
export APT_OSTREE_HEALTH_CHECK_INTERVAL=300
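
How apt-ostree itself parses these variables is not shown here, so the following is only a sketch of one reasonable interpretation: `1` enables monitoring, and the intervals are whole seconds with a fallback when the value is missing or invalid. The `EnvSettings` type, the helper names, and the fallback values (60 and 300) are assumptions made for illustration.

```rust
use std::env;

/// Hypothetical container for the three variables shown above.
#[derive(Debug, PartialEq)]
pub struct EnvSettings {
    pub monitoring_enabled: bool,
    pub metrics_interval: u64,
    pub health_check_interval: u64,
}

/// Treat exactly "1" as enabled; unset or any other value disables.
/// (Assumption: the real tool may accept other spellings.)
pub fn parse_flag(raw: Option<&str>) -> bool {
    matches!(raw.map(str::trim), Some("1"))
}

/// Parse an interval in seconds, falling back to `default` when the
/// value is missing, not a number, or zero.
pub fn parse_interval(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&secs| secs > 0)
        .unwrap_or(default)
}

/// Read the settings from the process environment.
pub fn settings_from_env() -> EnvSettings {
    let var = |name: &str| env::var(name).ok();
    EnvSettings {
        monitoring_enabled: parse_flag(var("APT_OSTREE_MONITORING_ENABLED").as_deref()),
        metrics_interval: parse_interval(var("APT_OSTREE_METRICS_INTERVAL").as_deref(), 60),
        health_check_interval: parse_interval(
            var("APT_OSTREE_HEALTH_CHECK_INTERVAL").as_deref(),
            300,
        ),
    }
}
```

Keeping the parsing in pure functions that take `Option<&str>` makes the fallback behaviour testable without touching the real environment.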

Usage

CLI Commands

Show Monitoring Status

# Show general monitoring status
apt-ostree monitoring

# Export metrics as JSON
apt-ostree monitoring --export

# Run health checks
apt-ostree monitoring --health

# Show performance metrics
apt-ostree monitoring --performance

Monitoring Service

# Start monitoring service
apt-ostree-monitoring start

# Stop monitoring service
apt-ostree-monitoring stop

# Show service status
apt-ostree-monitoring status

# Run health check cycle
apt-ostree-monitoring health-check

# Export metrics
apt-ostree-monitoring export-metrics

Systemd Service

# Enable and start monitoring service
sudo systemctl enable apt-ostree-monitoring
sudo systemctl start apt-ostree-monitoring

# Check service status
sudo systemctl status apt-ostree-monitoring

# View service logs
sudo journalctl -u apt-ostree-monitoring -f

# Stop service
sudo systemctl stop apt-ostree-monitoring

Metrics

System Metrics

{
  "timestamp": "2024-12-19T10:30:00Z",
  "cpu_usage": 15.5,
  "memory_usage": 8589934592,
  "total_memory": 17179869184,
  "disk_usage": 107374182400,
  "total_disk": 1073741824000,
  "active_transactions": 0,
  "pending_deployments": 1,
  "ostree_repo_size": 5368709120,
  "apt_cache_size": 1073741824,
  "uptime": 86400,
  "load_average": [1.2, 1.1, 1.0]
}
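
The memory and disk fields are raw byte counts, so a consumer usually has to derive percentages itself. A small sketch of that arithmetic follows; the function name `usage_percent` is hypothetical.

```rust
/// Percentage of `total` that is in use, or `None` when `total` is zero
/// (for example, when a metric could not be collected).
pub fn usage_percent(used: u64, total: u64) -> Option<f64> {
    if total == 0 {
        return None;
    }
    Some(used as f64 / total as f64 * 100.0)
}
```

With the sample values above, `memory_usage` / `total_memory` is 8 GiB of 16 GiB (50%), and `disk_usage` / `total_disk` is 100 GiB of 1000 GiB (about 10%).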

Performance Metrics

{
  "timestamp": "2024-12-19T10:30:00Z",
  "operation_type": "package_installation",
  "duration_ms": 1500,
  "success": true,
  "error_message": null,
  "context": {
    "packages_count": "5",
    "total_size": "52428800"
  }
}

Transaction Metrics

{
  "transaction_id": "tx-12345",
  "transaction_type": "install",
  "start_time": "2024-12-19T10:25:00Z",
  "end_time": "2024-12-19T10:26:30Z",
  "duration_ms": 90000,
  "success": true,
  "error_message": null,
  "packages_count": 5,
  "packages_size": 52428800,
  "progress": 1.0
}

Health Checks

Available Health Checks

  1. OSTree Repository Health

    • Repository integrity
    • Commit accessibility
    • Storage space
  2. APT Database Health

    • Database integrity
    • Package cache status
    • Repository connectivity
  3. System Resources

    • Memory availability
    • Disk space
    • CPU usage
  4. Daemon Health

    • Service status
    • D-Bus connectivity
    • Authentication

Health Check Results

{
  "check_name": "ostree_repository",
  "status": "healthy",
  "message": "OSTree repository is healthy",
  "timestamp": "2024-12-19T10:30:00Z",
  "duration_ms": 150,
  "details": {
    "repo_size": "5368709120",
    "commit_count": "1250",
    "integrity_check": "passed"
  }
}
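
When several checks run in one cycle, a single overall status is often more useful than the individual results. The sketch below takes the most severe status across all results. Only the "healthy" status appears in this document; the `Warning` and `Critical` variants, the typed `HealthStatus` enum, and the `overall_status` helper are assumptions for illustration.

```rust
/// Hypothetical severity scale. Variant order matters: the derived `Ord`
/// ranks later variants as more severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthStatus {
    Healthy,
    Warning,
    Critical,
}

/// Simplified stand-in for the result shape shown above.
#[derive(Debug, Clone)]
pub struct HealthCheckResult {
    pub check_name: String,
    pub status: HealthStatus,
    pub message: String,
    pub duration_ms: u64,
}

/// The worst status among `results`, or `None` if no checks ran.
pub fn overall_status(results: &[HealthCheckResult]) -> Option<HealthStatus> {
    results.iter().map(|r| r.status).max()
}
```

Returning `None` for an empty slice keeps "no checks ran" distinct from "all checks passed".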

Logging

Log Levels

  • TRACE: Detailed debugging information
  • DEBUG: Debugging information
  • INFO: General information
  • WARN: Warning messages
  • ERROR: Error messages
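
The levels are ordered from least to most severe, and a configured level acts as a threshold: messages at that level or above are emitted. The sketch below models that rule with a hypothetical `Level` enum; it illustrates the filtering logic only and is not the logging backend apt-ostree uses.

```rust
use std::str::FromStr;

/// Log levels in increasing order of severity. The derived `Ord`
/// follows declaration order, so `Trace < Debug < Info < Warn < Error`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

impl FromStr for Level {
    type Err = String;

    /// Accepts the level names case-insensitively ("info", "INFO", ...).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "trace" => Ok(Level::Trace),
            "debug" => Ok(Level::Debug),
            "info" => Ok(Level::Info),
            "warn" => Ok(Level::Warn),
            "error" => Ok(Level::Error),
            other => Err(format!("unknown log level: {other}")),
        }
    }
}

/// A message is emitted when it is at least as severe as the threshold.
pub fn is_enabled(message: Level, threshold: Level) -> bool {
    message >= threshold
}
```

With a threshold of `info`, WARN and ERROR messages pass while DEBUG and TRACE are dropped.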

Log Format

Standard Format

2024-12-19T10:30:00.123Z  INFO apt_ostree::monitoring: Health check passed: ostree_repository

JSON Format (Structured Logging)

{
  "timestamp": "2024-12-19T10:30:00.123Z",
  "level": "INFO",
  "target": "apt_ostree::monitoring",
  "message": "Health check passed: ostree_repository",
  "fields": {
    "check_name": "ostree_repository",
    "duration_ms": 150
  }
}
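
Both formats carry the same timestamp, level, target, and message; the JSON form adds the structured `fields` object. For tooling that still has to read the standard format, the sketch below splits a line into those four parts. It assumes the exact layout of the sample line above (whitespace-separated timestamp and level, then `target: message`), and the `LogLine` type and `parse_log_line` function are hypothetical.

```rust
/// The four parts of a standard-format log line, borrowed from the input.
#[derive(Debug, PartialEq)]
pub struct LogLine<'a> {
    pub timestamp: &'a str,
    pub level: &'a str,
    pub target: &'a str,
    pub message: &'a str,
}

/// Split a line of the form `<timestamp> <LEVEL> <target>: <message>`.
/// Returns `None` when the line does not follow that layout.
pub fn parse_log_line(line: &str) -> Option<LogLine<'_>> {
    let line = line.trim();
    // Timestamp runs up to the first whitespace character.
    let (timestamp, rest) = line.split_once(char::is_whitespace)?;
    // The level may be preceded by padding spaces.
    let (level, rest) = rest.trim_start().split_once(char::is_whitespace)?;
    // The target ends at the first ": "; module paths use "::" without a
    // trailing space, so they are not split here.
    let (target, message) = rest.trim_start().split_once(": ")?;
    Some(LogLine { timestamp, level, target, message })
}
```

For anything beyond quick inspection, enabling structured logging and parsing the JSON output is likely the more robust option.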

Log Configuration

# Set log level
export RUST_LOG=info

# Enable structured logging
export APT_OSTREE_STRUCTURED_LOGGING=1

# Log to file
export APT_OSTREE_LOG_FILE=/var/log/apt-ostree/app.log

Performance Monitoring

Performance Wrappers

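use std::collections::HashMap;

use apt_ostree::monitoring::PerformanceMonitor;

// Free-form key/value metadata recorded with the metric
let mut context = HashMap::new();
context.insert("packages_count".to_string(), "5".to_string());

// Monitor an operation
let monitor = PerformanceMonitor::new(
    monitoring_manager.clone(),
    "package_installation",
    context
);

// Record success (consumes the monitor)
monitor.success().await?;

// On the error path, call `failure` instead of `success`. Both take
// `self` by value, so only one of them can be called per monitor:
// monitor.failure("Package not found".to_string()).await?;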
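
To illustrate the pattern behind such a wrapper, here is a simplified, synchronous sketch: the timer starts when the value is created, and `success` or `failure` consumes it and records one metric. The names `OperationTimer`, `RecordSink`, and `PerformanceRecord` are hypothetical stand-ins; the real `PerformanceMonitor` is async and reports to the `MonitoringManager`, and its internals are not shown in this document.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::Instant;

/// One recorded operation, shaped like the performance metrics JSON.
#[derive(Debug, Clone)]
pub struct PerformanceRecord {
    pub operation_type: String,
    pub duration_ms: u128,
    pub success: bool,
    pub error_message: Option<String>,
    pub context: HashMap<String, String>,
}

/// Hypothetical in-memory stand-in for the metrics storage.
#[derive(Default)]
pub struct RecordSink {
    records: Mutex<Vec<PerformanceRecord>>,
}

impl RecordSink {
    /// Snapshot of everything recorded so far.
    pub fn records(&self) -> Vec<PerformanceRecord> {
        self.records.lock().unwrap().clone()
    }
}

/// Starts timing on creation; finishing consumes the timer so each
/// operation is recorded exactly once.
pub struct OperationTimer {
    sink: Arc<RecordSink>,
    operation_type: String,
    context: HashMap<String, String>,
    started: Instant,
}

impl OperationTimer {
    pub fn new(sink: Arc<RecordSink>, operation_type: &str, context: HashMap<String, String>) -> Self {
        Self {
            sink,
            operation_type: operation_type.to_string(),
            context,
            started: Instant::now(),
        }
    }

    pub fn success(self) {
        self.finish(true, None);
    }

    pub fn failure(self, error_message: String) {
        self.finish(false, Some(error_message));
    }

    fn finish(self, success: bool, error_message: Option<String>) {
        let record = PerformanceRecord {
            operation_type: self.operation_type,
            duration_ms: self.started.elapsed().as_millis(),
            success,
            error_message,
            context: self.context,
        };
        self.sink.records.lock().unwrap().push(record);
    }
}
```

Taking `self` by value in `success` and `failure` lets the compiler reject code that tries to report the same operation twice.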

Transaction Monitoring

use apt_ostree::monitoring::TransactionMonitor;

// Start transaction monitoring
let monitor = TransactionMonitor::new(
    monitoring_manager.clone(),
    "tx-12345",
    "install",
    5,
    52428800
);

// Update progress
monitor.update_progress(0.5).await?;

// Complete transaction
monitor.success().await?;
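
The progress value is a fraction between 0.0 and 1.0. One way to derive it, sketched below, is from the number of packages processed so far. The helper `progress_fraction` is hypothetical, and treating an empty transaction as complete is a design assumption rather than documented behaviour.

```rust
/// Fraction of packages processed, clamped to the range 0.0..=1.0.
/// A transaction with no packages is treated as already complete.
pub fn progress_fraction(packages_done: u32, packages_total: u32) -> f64 {
    if packages_total == 0 {
        return 1.0;
    }
    (f64::from(packages_done) / f64::from(packages_total)).clamp(0.0, 1.0)
}
```

The result could be passed straight to `monitor.update_progress(...)`; clamping keeps a miscounted `packages_done` from reporting more than 100%.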

Integration

With Package Manager

The monitoring system integrates with the package manager to track:

  • Package installation/removal operations
  • Transaction progress
  • Performance metrics
  • Error tracking

With OSTree Manager

Integration with OSTree manager provides:

  • Commit metadata extraction
  • Repository health monitoring
  • Deployment tracking
  • Rollback monitoring

With Daemon

The monitoring system works with the daemon to provide:

  • Service health monitoring
  • D-Bus communication tracking
  • Authentication monitoring
  • Transaction state tracking

Troubleshooting

Common Issues

Monitoring Service Not Starting

# Check service status
sudo systemctl status apt-ostree-monitoring

# Check logs
sudo journalctl -u apt-ostree-monitoring -f

# Check permissions
ls -la /usr/bin/apt-ostree-monitoring
ls -la /var/log/apt-ostree/

Metrics Not Being Collected

# Export current metrics to confirm data is being collected
apt-ostree monitoring --export

# Verify service is running
sudo systemctl is-active apt-ostree-monitoring

# Check metrics file
cat /var/log/apt-ostree/metrics.json

Health Checks Failing

# Run health checks manually
apt-ostree monitoring --health

# Check specific components
apt-ostree status
ostree log debian/stable/x86_64

Debug Mode

# Enable debug logging
export RUST_LOG=debug

# Run with debug output
apt-ostree-monitoring start

# Check debug logs
sudo journalctl -u apt-ostree-monitoring -p debug

Best Practices

Production Deployment

  1. Enable Structured Logging

    export APT_OSTREE_STRUCTURED_LOGGING=1
    
  2. Configure Log Rotation

    # Add to /etc/logrotate.d/apt-ostree
    /var/log/apt-ostree/*.log {
        daily
        rotate 30
        compress
        delaycompress
        missingok
        notifempty
    }
    
  3. Monitor Metrics Storage

    # Check metrics file size
    du -sh /var/log/apt-ostree/metrics.json
    
    # Archive old metrics
    mv /var/log/apt-ostree/metrics.json /var/log/apt-ostree/metrics-$(date +%Y%m%d).json
    
  4. Set Up Alerts

    # Monitor health check failures
    journalctl -u apt-ostree-monitoring | grep "CRITICAL"
    
    # Monitor high resource usage
    apt-ostree monitoring --performance | grep -E "(cpu_usage|memory_usage)"
    

Development

  1. Use Performance Monitoring

    let monitor = PerformanceMonitor::new(manager, "operation", context);
    // ... perform operation ...
    monitor.success().await?;
    
  2. Add Health Checks

    // Add custom health checks
    async fn check_custom_component(&self) -> HealthCheckResult {
        // Build and return a HealthCheckResult describing the component
        todo!("implement the custom check")
    }
    
  3. Monitor Transactions

    let monitor = TransactionMonitor::new(manager, id, transaction_type, count, size);
    // ... perform transaction ...
    monitor.success().await?;
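
As a sketch of what a custom health check might look like, the helper below times an arbitrary check and turns its outcome into a result with `status`, `message`, `duration_ms`, and `details`, mirroring the JSON shape in the Health Checks section. It is synchronous and self-contained for illustration; `run_check`, the simplified `HealthCheckResult`, and the "unhealthy" status string are assumptions, not the crate's actual types.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Simplified stand-in for the health check result shape.
#[derive(Debug, Clone)]
pub struct HealthCheckResult {
    pub check_name: String,
    pub status: String,
    pub message: String,
    pub duration_ms: u128,
    pub details: HashMap<String, String>,
}

/// Run `check`, measure how long it takes, and wrap the outcome.
/// `Ok(details)` maps to "healthy"; `Err(reason)` maps to "unhealthy"
/// with the reason as the message.
pub fn run_check<F>(check_name: &str, check: F) -> HealthCheckResult
where
    F: FnOnce() -> Result<HashMap<String, String>, String>,
{
    let started = Instant::now();
    let outcome = check();
    let duration_ms = started.elapsed().as_millis();

    match outcome {
        Ok(details) => HealthCheckResult {
            check_name: check_name.to_string(),
            status: "healthy".to_string(),
            message: format!("{check_name} check passed"),
            duration_ms,
            details,
        },
        Err(reason) => HealthCheckResult {
            check_name: check_name.to_string(),
            status: "unhealthy".to_string(),
            message: reason,
            duration_ms,
            details: HashMap::new(),
        },
    }
}
```

Separating the timing and result construction from the check itself keeps each custom check down to a closure that returns its details or an error string.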
    

API Reference

MonitoringManager

impl MonitoringManager {
    pub fn new(config: MonitoringConfig) -> AptOstreeResult<Self>
    pub fn init_logging(&self) -> AptOstreeResult<()>
    pub async fn record_system_metrics(&self) -> AptOstreeResult<()>
    pub async fn record_performance_metrics(&self, ...) -> AptOstreeResult<()>
    pub async fn start_transaction_monitoring(&self, ...) -> AptOstreeResult<()>
    pub async fn run_health_checks(&self) -> AptOstreeResult<Vec<HealthCheckResult>>
    pub async fn get_statistics(&self) -> AptOstreeResult<MonitoringStatistics>
    pub async fn export_metrics(&self) -> AptOstreeResult<String>
}

PerformanceMonitor

impl PerformanceMonitor {
    pub fn new(manager: Arc<MonitoringManager>, operation: &str, context: HashMap<String, String>) -> Self
    pub async fn success(self) -> AptOstreeResult<()>
    pub async fn failure(self, error_message: String) -> AptOstreeResult<()>
}

TransactionMonitor

impl TransactionMonitor {
    pub fn new(manager: Arc<MonitoringManager>, id: &str, transaction_type: &str, count: u32, size: u64) -> Self
    pub async fn update_progress(&self, progress: f64) -> AptOstreeResult<()>
    pub async fn success(self) -> AptOstreeResult<()>
    pub async fn failure(self, error_message: String) -> AptOstreeResult<()>
}

Conclusion

The APT-OSTree monitoring and logging system combines structured logs, system, performance, and transaction metrics, and scheduled health checks to give visibility into how the system is operating. Together these support early detection of problems, faster troubleshooting, and performance tuning of APT-OSTree deployments.

For more information, see: