# APT-OSTree Monitoring and Logging

## Overview

APT-OSTree includes a comprehensive monitoring and logging system that provides:

- **Structured Logging**: JSON-formatted logs with timestamps and context
- **Metrics Collection**: System, performance, and transaction metrics
- **Health Checks**: Automated health monitoring of system components
- **Real-time Monitoring**: Background service for continuous monitoring
- **Export Capabilities**: Metrics export in JSON format

## Architecture

### Components

1. **Monitoring Manager** (`src/monitoring.rs`)
   - Core monitoring functionality
   - Metrics collection and storage
   - Health check execution
   - Performance monitoring

2. **Monitoring Service** (`src/bin/monitoring-service.rs`)
   - Background service for continuous monitoring
   - Automated metrics collection
   - Health check scheduling
   - Metrics export

3. **CLI Integration** (`src/main.rs`)
   - Monitoring commands
   - Real-time status display
   - Metrics export

### Data Flow

```
┌─────────────────┐    ┌────────────────────┐    ┌─────────────────┐
│ CLI Commands    │───▶│ Monitoring Manager │───▶│ Metrics Storage │
└─────────────────┘    └────────────────────┘    └─────────────────┘
                                 │
                                 ▼
                       ┌────────────────────┐
                       │ Monitoring Service │
                       └────────────────────┘
                                 │
                                 ▼
                       ┌────────────────────┐
                       │ Health Checks      │
                       └────────────────────┘
```

## Configuration

### Monitoring Configuration

```rust
pub struct MonitoringConfig {
    pub log_level: String,                   // "trace", "debug", "info", "warn", "error"
    pub log_file: Option<String>,            // Optional log file path
    pub structured_logging: bool,            // Enable JSON logging
    pub enable_metrics: bool,                // Enable metrics collection
    pub metrics_interval: u64,               // Metrics collection interval (seconds)
    pub enable_health_checks: bool,          // Enable health checks
    pub health_check_interval: u64,          // Health check interval (seconds)
    pub enable_performance_monitoring: bool, // Enable performance monitoring
    pub enable_transaction_monitoring: bool, // Enable transaction monitoring
    pub enable_system_monitoring: bool,      // Enable system resource monitoring
}
```

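As an illustration of how this struct might be populated, the sketch below adds a `Default` implementation and a small helper. The default values are assumptions chosen to match the environment variable examples in this document (60 and 300 seconds); they are not confirmed defaults of the crate, and `debug_config` is a hypothetical helper.

```rust
/// Mirrors the `MonitoringConfig` struct shown above.
#[derive(Debug, Clone, PartialEq)]
pub struct MonitoringConfig {
    pub log_level: String,
    pub log_file: Option<String>,
    pub structured_logging: bool,
    pub enable_metrics: bool,
    pub metrics_interval: u64,
    pub enable_health_checks: bool,
    pub health_check_interval: u64,
    pub enable_performance_monitoring: bool,
    pub enable_transaction_monitoring: bool,
    pub enable_system_monitoring: bool,
}

impl Default for MonitoringConfig {
    // Assumed defaults for illustration only; the real crate may differ.
    fn default() -> Self {
        Self {
            log_level: "info".to_string(),
            log_file: None,
            structured_logging: false,
            enable_metrics: true,
            metrics_interval: 60,
            enable_health_checks: true,
            health_check_interval: 300,
            enable_performance_monitoring: true,
            enable_transaction_monitoring: true,
            enable_system_monitoring: true,
        }
    }
}

/// Hypothetical helper: a config for verbose local debugging.
pub fn debug_config() -> MonitoringConfig {
    MonitoringConfig {
        log_level: "debug".to_string(),
        structured_logging: true,
        // Every other field keeps its (assumed) default value.
        ..MonitoringConfig::default()
    }
}
```

Struct update syntax (`..MonitoringConfig::default()`) keeps call sites short when only one or two fields differ from the defaults.
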
### Environment Variables

```bash
# Log level
export RUST_LOG=info

# Monitoring configuration
export APT_OSTREE_MONITORING_ENABLED=1
export APT_OSTREE_METRICS_INTERVAL=60
export APT_OSTREE_HEALTH_CHECK_INTERVAL=300
```

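The variables above are plain strings, so they have to be parsed and validated before use. The following is a minimal sketch of one way to do that with only the standard library. The function names, the fallback value of 60, and the rule that `1` or `true` means "enabled" are all assumptions for illustration, not the crate's documented behavior.

```rust
use std::env;

/// Parse an interval in seconds, falling back to `default` when the value
/// is missing, not a number, or zero.
pub fn parse_interval(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&secs| secs > 0)
        .unwrap_or(default)
}

/// Treat "1" and "true" (any case) as enabled; this convention is an assumption.
pub fn parse_flag(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|s| s.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true")
    )
}

/// Read the metrics interval from the environment (assumed fallback: 60 s).
pub fn metrics_interval_from_env() -> u64 {
    let raw = env::var("APT_OSTREE_METRICS_INTERVAL").ok();
    parse_interval(raw.as_deref(), 60)
}
```

Keeping the parsing functions separate from the `env::var` call makes them easy to test without touching the process environment.
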
## Usage

### CLI Commands

#### Show Monitoring Status

```bash
# Show general monitoring status
apt-ostree monitoring

# Export metrics as JSON
apt-ostree monitoring --export

# Run health checks
apt-ostree monitoring --health

# Show performance metrics
apt-ostree monitoring --performance
```

#### Monitoring Service

```bash
# Start monitoring service
apt-ostree-monitoring start

# Stop monitoring service
apt-ostree-monitoring stop

# Show service status
apt-ostree-monitoring status

# Run health check cycle
apt-ostree-monitoring health-check

# Export metrics
apt-ostree-monitoring export-metrics
```

### Systemd Service

```bash
# Enable and start monitoring service
sudo systemctl enable apt-ostree-monitoring
sudo systemctl start apt-ostree-monitoring

# Check service status
sudo systemctl status apt-ostree-monitoring

# View service logs
sudo journalctl -u apt-ostree-monitoring -f

# Stop service
sudo systemctl stop apt-ostree-monitoring
```

## Metrics

### System Metrics

```json
{
  "timestamp": "2024-12-19T10:30:00Z",
  "cpu_usage": 15.5,
  "memory_usage": 8589934592,
  "total_memory": 17179869184,
  "disk_usage": 107374182400,
  "total_disk": 1073741824000,
  "active_transactions": 0,
  "pending_deployments": 1,
  "ostree_repo_size": 5368709120,
  "apt_cache_size": 1073741824,
  "uptime": 86400,
  "load_average": [1.2, 1.1, 1.0]
}
```

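The memory and disk fields in the sample appear to be raw byte counts, so a consumer has to derive percentages itself. The sketch below shows one way to do that and to flag resources above a threshold. `SystemMetrics` here is a reduced stand-in holding only four of the fields above, and `resources_over` is a hypothetical alert rule, not part of the documented API.

```rust
/// Subset of the system metrics fields shown above.
pub struct SystemMetrics {
    pub memory_usage: u64,
    pub total_memory: u64,
    pub disk_usage: u64,
    pub total_disk: u64,
}

/// Percentage of `total` that is in use; returns 0.0 when `total` is zero.
pub fn usage_percent(used: u64, total: u64) -> f64 {
    if total == 0 {
        0.0
    } else {
        used as f64 / total as f64 * 100.0
    }
}

/// Hypothetical alert rule: names of resources above `threshold` percent.
pub fn resources_over(metrics: &SystemMetrics, threshold: f64) -> Vec<&'static str> {
    let mut over = Vec::new();
    if usage_percent(metrics.memory_usage, metrics.total_memory) > threshold {
        over.push("memory");
    }
    if usage_percent(metrics.disk_usage, metrics.total_disk) > threshold {
        over.push("disk");
    }
    over
}
```

With the sample values, memory usage works out to 50% (8 GiB of 16 GiB) and disk usage to 10% (100 GiB of 1000 GiB), so a 40% threshold would flag only memory.
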
### Performance Metrics

```json
{
  "timestamp": "2024-12-19T10:30:00Z",
  "operation_type": "package_installation",
  "duration_ms": 1500,
  "success": true,
  "error_message": null,
  "context": {
    "packages_count": "5",
    "total_size": "52428800"
  }
}
```

### Transaction Metrics

```json
{
  "transaction_id": "tx-12345",
  "transaction_type": "install",
  "start_time": "2024-12-19T10:25:00Z",
  "end_time": "2024-12-19T10:26:30Z",
  "duration_ms": 90000,
  "success": true,
  "error_message": null,
  "packages_count": 5,
  "packages_size": 52428800,
  "progress": 1.0
}
```

## Health Checks

### Available Health Checks

1. **OSTree Repository Health**
   - Repository integrity
   - Commit accessibility
   - Storage space

2. **APT Database Health**
   - Database integrity
   - Package cache status
   - Repository connectivity

3. **System Resources**
   - Memory availability
   - Disk space
   - CPU usage

4. **Daemon Health**
   - Service status
   - D-Bus connectivity
   - Authentication

### Health Check Results

```json
{
  "check_name": "ostree_repository",
  "status": "healthy",
  "message": "OSTree repository is healthy",
  "timestamp": "2024-12-19T10:30:00Z",
  "duration_ms": 150,
  "details": {
    "repo_size": "5368709120",
    "commit_count": "1250",
    "integrity_check": "passed"
  }
}
```

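When several checks run in one cycle, their results usually need to be reduced to a single overall status. The sketch below uses a "worst status wins" rule. Only the `healthy` status appears in the sample above; the `Warning` and `Critical` variants, the reduced `HealthCheckResult` struct, and both function names are assumptions made for this illustration.

```rust
/// Possible check outcomes. Only "healthy" appears in the sample output;
/// `Warning` and `Critical` are assumed names for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum HealthStatus {
    Healthy,
    Warning,
    Critical,
}

/// Reduced stand-in for a health check result (timestamp and details omitted).
#[derive(Debug, Clone)]
pub struct HealthCheckResult {
    pub check_name: String,
    pub status: HealthStatus,
    pub message: String,
}

/// Overall status is the worst individual status; no checks counts as healthy.
pub fn overall_status(results: &[HealthCheckResult]) -> HealthStatus {
    results
        .iter()
        .map(|r| r.status)
        .max()
        .unwrap_or(HealthStatus::Healthy)
}

/// Names of every check that is not healthy.
pub fn failing_checks(results: &[HealthCheckResult]) -> Vec<&str> {
    results
        .iter()
        .filter(|r| r.status != HealthStatus::Healthy)
        .map(|r| r.check_name.as_str())
        .collect()
}
```

Deriving `Ord` on the enum makes the variant order (`Healthy < Warning < Critical`) the severity order, which is what lets `max()` pick the worst result.
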
## Logging

### Log Levels

- **TRACE**: Detailed debugging information
- **DEBUG**: Debugging information
- **INFO**: General information
- **WARN**: Warning messages
- **ERROR**: Error messages

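The levels above form an ordering from most to least verbose, and a configured level acts as a threshold. A minimal sketch of that filtering rule follows. `LogLevel` and `is_enabled` are illustrative names, and the sketch ignores per-target filter directives that `RUST_LOG` may also carry.

```rust
use std::str::FromStr;

/// The five levels listed above, ordered from most to least verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum LogLevel {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

impl FromStr for LogLevel {
    type Err = String;

    // Accepts the level names case-insensitively, e.g. "info" or "INFO".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "trace" => Ok(LogLevel::Trace),
            "debug" => Ok(LogLevel::Debug),
            "info" => Ok(LogLevel::Info),
            "warn" => Ok(LogLevel::Warn),
            "error" => Ok(LogLevel::Error),
            other => Err(format!("unknown log level: {other}")),
        }
    }
}

/// A message is emitted when its level is at least as severe as the threshold.
pub fn is_enabled(message_level: LogLevel, threshold: LogLevel) -> bool {
    message_level >= threshold
}
```

With a threshold of `info`, `WARN` and `ERROR` messages pass while `DEBUG` and `TRACE` messages are dropped.
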
### Log Format

#### Standard Format

```
2024-12-19T10:30:00.123Z INFO apt_ostree::monitoring: Health check passed: ostree_repository
```

#### JSON Format (Structured Logging)

```json
{
  "timestamp": "2024-12-19T10:30:00.123Z",
  "level": "INFO",
  "target": "apt_ostree::monitoring",
  "message": "Health check passed: ostree_repository",
  "fields": {
    "check_name": "ostree_repository",
    "duration_ms": 150
  }
}
```

### Log Configuration

```bash
# Set log level
export RUST_LOG=info

# Enable structured logging
export APT_OSTREE_STRUCTURED_LOGGING=1

# Log to file
export APT_OSTREE_LOG_FILE=/var/log/apt-ostree/app.log
```

## Performance Monitoring

### Performance Wrappers

```rust
use apt_ostree::monitoring::PerformanceMonitor;

// Start monitoring an operation.
// `context` is a HashMap<String, String> of extra details.
let monitor = PerformanceMonitor::new(
    monitoring_manager.clone(),
    "package_installation",
    context,
);

// `success` and `failure` both consume the monitor,
// so call exactly one of them per operation.

// Record success
monitor.success().await?;

// Record failure (instead of `success`)
// monitor.failure("Package not found".to_string()).await?;
```

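The "finish exactly once" behavior comes from methods that take `self` by value. The sketch below reproduces that pattern in a synchronous, standard-library-only form so it can be run in isolation. `OperationTimer` and `PerformanceRecord` are hypothetical names, and the real `PerformanceMonitor` reports to a `MonitoringManager` asynchronously rather than returning a record.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Simplified record, loosely modeled on the performance metrics JSON.
#[derive(Debug)]
pub struct PerformanceRecord {
    pub operation_type: String,
    pub duration: Duration,
    pub success: bool,
    pub error_message: Option<String>,
    pub context: HashMap<String, String>,
}

/// Starts timing when created; `success` or `failure` stops the clock.
pub struct OperationTimer {
    operation_type: String,
    context: HashMap<String, String>,
    started: Instant,
}

impl OperationTimer {
    pub fn start(operation_type: &str, context: HashMap<String, String>) -> Self {
        Self {
            operation_type: operation_type.to_string(),
            context,
            started: Instant::now(),
        }
    }

    // Taking `self` by value means a timer can be finished only once;
    // a second call is rejected at compile time as a use after move.
    pub fn success(self) -> PerformanceRecord {
        self.finish(true, None)
    }

    pub fn failure(self, error_message: String) -> PerformanceRecord {
        self.finish(false, Some(error_message))
    }

    fn finish(self, success: bool, error_message: Option<String>) -> PerformanceRecord {
        PerformanceRecord {
            operation_type: self.operation_type,
            duration: self.started.elapsed(),
            success,
            error_message,
            context: self.context,
        }
    }
}
```

A typical call site would be `OperationTimer::start("package_installation", context).success()`, with `failure(...)` used on the error path instead.
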
### Transaction Monitoring

```rust
use apt_ostree::monitoring::TransactionMonitor;

// Start transaction monitoring
let monitor = TransactionMonitor::new(
    monitoring_manager.clone(),
    "tx-12345",
    "install",
    5,
    52428800,
);

// Update progress
monitor.update_progress(0.5).await?;

// Complete transaction
monitor.success().await?;
```

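Progress values in the examples range from `0.0` to `1.0`. The source does not say how out-of-range values are handled, so the sketch below shows one defensive option: clamp the value and ignore updates after completion. `TransactionState` is a hypothetical in-memory stand-in, not the crate's `TransactionMonitor`.

```rust
/// Minimal in-memory stand-in for a monitored transaction.
#[derive(Debug)]
pub struct TransactionState {
    pub transaction_id: String,
    pub progress: f64,
    pub finished: bool,
}

impl TransactionState {
    pub fn new(transaction_id: &str) -> Self {
        Self {
            transaction_id: transaction_id.to_string(),
            progress: 0.0,
            finished: false,
        }
    }

    /// Store progress clamped to 0.0..=1.0.
    /// NaN values and updates after completion are ignored.
    pub fn update_progress(&mut self, progress: f64) {
        if self.finished || progress.is_nan() {
            return;
        }
        self.progress = progress.clamp(0.0, 1.0);
    }

    /// Mark the transaction as complete with full progress.
    pub fn complete(&mut self) {
        self.progress = 1.0;
        self.finished = true;
    }
}
```

Clamping keeps exported metrics within the expected range even if a caller reports a slightly overshooting value.
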
## Integration

### With Package Manager

The monitoring system integrates with the package manager to track:

- Package installation/removal operations
- Transaction progress
- Performance metrics
- Error tracking

### With OSTree Manager

Integration with OSTree manager provides:

- Commit metadata extraction
- Repository health monitoring
- Deployment tracking
- Rollback monitoring

### With Daemon

The monitoring system works with the daemon to provide:

- Service health monitoring
- D-Bus communication tracking
- Authentication monitoring
- Transaction state tracking

## Troubleshooting

### Common Issues

#### Monitoring Service Not Starting

```bash
# Check service status
sudo systemctl status apt-ostree-monitoring

# Check logs
sudo journalctl -u apt-ostree-monitoring -f

# Check permissions
ls -la /usr/bin/apt-ostree-monitoring
ls -la /var/log/apt-ostree/
```

#### Metrics Not Being Collected

```bash
# Check monitoring configuration
apt-ostree monitoring --export

# Verify service is running
sudo systemctl is-active apt-ostree-monitoring

# Check metrics file
cat /var/log/apt-ostree/metrics.json
```

#### Health Checks Failing

```bash
# Run health checks manually
apt-ostree monitoring --health

# Check specific components
apt-ostree status
ostree log debian/stable/x86_64
```

### Debug Mode

```bash
# Enable debug logging
export RUST_LOG=debug

# Run with debug output
apt-ostree-monitoring start

# Check debug logs (-p filters by priority; journalctl has no --log-level option)
sudo journalctl -u apt-ostree-monitoring -p debug
```

## Best Practices

### Production Deployment

1. **Enable Structured Logging**
   ```bash
   export APT_OSTREE_STRUCTURED_LOGGING=1
   ```

2. **Configure Log Rotation**
   ```bash
   # Add to /etc/logrotate.d/apt-ostree
   /var/log/apt-ostree/*.log {
       daily
       rotate 30
       compress
       delaycompress
       missingok
       notifempty
   }
   ```

3. **Monitor Metrics Storage**
   ```bash
   # Check metrics file size
   du -sh /var/log/apt-ostree/metrics.json

   # Archive old metrics
   mv /var/log/apt-ostree/metrics.json /var/log/apt-ostree/metrics-$(date +%Y%m%d).json
   ```

4. **Set Up Alerts**
   ```bash
   # Monitor health check failures
   journalctl -u apt-ostree-monitoring | grep "CRITICAL"

   # Monitor high resource usage
   apt-ostree monitoring --performance | grep -E "(cpu_usage|memory_usage)"
   ```

### Development

1. **Use Performance Monitoring**
   ```rust
   let monitor = PerformanceMonitor::new(manager, "operation", context);
   // ... perform operation ...
   monitor.success().await?;
   ```

2. **Add Health Checks**
   ```rust
   // Add custom health checks
   async fn check_custom_component(&self) -> HealthCheckResult {
       // Implementation
       todo!("build and return a HealthCheckResult")
   }
   ```

3. **Monitor Transactions**
   ```rust
   // `type` is a reserved keyword in Rust, so the variable is named `transaction_type`.
   let monitor = TransactionMonitor::new(manager, id, transaction_type, count, size);
   // ... perform transaction ...
   monitor.success().await?;
   ```

## API Reference

### MonitoringManager

```rust
impl MonitoringManager {
    pub fn new(config: MonitoringConfig) -> AptOstreeResult<Self>
    pub fn init_logging(&self) -> AptOstreeResult<()>
    pub async fn record_system_metrics(&self) -> AptOstreeResult<()>
    pub async fn record_performance_metrics(&self, ...) -> AptOstreeResult<()>
    pub async fn start_transaction_monitoring(&self, ...) -> AptOstreeResult<()>
    pub async fn run_health_checks(&self) -> AptOstreeResult<Vec<HealthCheckResult>>
    pub async fn get_statistics(&self) -> AptOstreeResult<MonitoringStatistics>
    pub async fn export_metrics(&self) -> AptOstreeResult<String>
}
```

### PerformanceMonitor

```rust
impl PerformanceMonitor {
    pub fn new(manager: Arc<MonitoringManager>, operation: &str, context: HashMap<String, String>) -> Self
    pub async fn success(self) -> AptOstreeResult<()>
    pub async fn failure(self, error_message: String) -> AptOstreeResult<()>
}
```

### TransactionMonitor

```rust
impl TransactionMonitor {
    // The third parameter is written `transaction_type` here because `type` is a reserved keyword in Rust.
    pub fn new(manager: Arc<MonitoringManager>, id: &str, transaction_type: &str, count: u32, size: u64) -> Self
    pub async fn update_progress(&self, progress: f64) -> AptOstreeResult<()>
    pub async fn success(self) -> AptOstreeResult<()>
    pub async fn failure(self, error_message: String) -> AptOstreeResult<()>
}
```

## Conclusion

The APT-OSTree monitoring and logging system provides comprehensive visibility into system operations, performance, and health. It enables proactive monitoring, troubleshooting, and optimization of the APT-OSTree system.

For more information, see:

- [System Administration Guide](system-admin.md)
- [Troubleshooting Guide](troubleshooting.md)
- [API Documentation](api.md)