Major improvements: flexible install dir, configurable compose file name for git, enhanced webhook notifications, cross-platform lock, robust rollback, and updated docs.

- Install dir is now user-confirmable and dynamic
- Added COMPOSE_FILENAME for git stacks
- Webhook payloads now include git context and rollback events
- Lock file age check is cross-platform
- Rollback notifications for success/failure
- Updated TOML example and documentation
- Many robustness and UX improvements

robojerk 2025-06-25 15:15:40 -07:00
parent f0dba7cc0a
commit 70486907aa
18 changed files with 3788 additions and 1767 deletions


@@ -1,342 +1,648 @@
# Troubleshooting Guide
This guide covers common issues and their solutions when using ComposeSync.
This guide helps you diagnose and resolve common issues with ComposeSync.
## Service Issues
## Quick Diagnostics
### Service Status Check
```bash
# Check if ComposeSync is running
sudo systemctl status composesync
# Check recent logs
sudo journalctl -u composesync -n 20
# Check configuration
sudo cat /opt/composesync/config.toml
```
### Basic Health Check
```bash
# Check service status
if systemctl is-active --quiet composesync; then
echo "✓ ComposeSync is running"
else
echo "✗ ComposeSync is not running"
fi
# Check for recent errors
error_count=$(journalctl -u composesync --since "1 hour ago" | grep -c "ERROR")
echo "Errors in last hour: $error_count"
# Check disk usage
usage=$(df -h /opt/composesync/ | tail -1 | awk '{print $5}' | sed 's/%//')
echo "Disk usage: ${usage}%"
```
## Common Issues
### Service Won't Start
**Problem:** The ComposeSync service fails to start.
**Symptoms:**
- `systemctl status composesync` shows failed status
- Service won't start with `systemctl start composesync`
**Solutions:**
1. Check service status:
**Diagnosis:**
```bash
# Check service status
sudo systemctl status composesync
# View detailed logs
sudo journalctl -u composesync -n 50
# Restart the service, then read the newest log lines for configuration errors
sudo systemctl restart composesync
sudo journalctl -u composesync -n 10
```
**Common Causes:**
1. **Configuration Error**
```bash
sudo systemctl status composesync
# Check TOML syntax
sudo cat /opt/composesync/config.toml
# Test configuration
sudo systemctl restart composesync
sudo journalctl -u composesync -n 20 | grep "ERROR"
```
2. Check logs for errors:
2. **Missing Dependencies**
```bash
sudo journalctl -u composesync -n 50
# Check if required tools are installed
which wget
which git
which docker
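docker compose version   # Compose v2 is a Docker CLI plugin, so "which docker-compose" will not find it
# Install missing tools (docker-compose-plugin is packaged in Docker's own apt repository)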
sudo apt update
sudo apt install wget git docker.io docker-compose-plugin
```
3. Verify user permissions:
3. **Permission Issues**
```bash
# Ensure the service user is in the docker group
groups YOUR_USERNAME
```
4. Check file permissions:
```bash
# Ensure the service user owns the ComposeSync directory
# Check file permissions
ls -la /opt/composesync/
sudo chown -R YOUR_USERNAME:docker /opt/composesync/
# Fix permissions
sudo chown -R $USER:docker /opt/composesync/
sudo chmod +x /opt/composesync/update-agent.sh
sudo chmod +x /opt/composesync/config-parser.sh
```
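
The dependency check under "Missing Dependencies" can be folded into one helper that lists whatever is absent. This is a minimal sketch: `missing_tools` is a hypothetical name, not something ComposeSync ships, and it relies on `command -v`, the POSIX way to test whether a command is on the `PATH`.

```bash
# Hypothetical helper (not part of ComposeSync): print each named command
# that cannot be found on PATH and return non-zero if any is missing.
missing_tools() {
    local tool status
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Example: missing_tools wget git docker
```

Because `docker compose` is a plugin rather than a standalone binary, it still needs its own check with `docker compose version`.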
### Service Crashes or Stops Unexpectedly
**Problem:** The service runs but crashes or stops unexpectedly.
**Solutions:**
1. Check for configuration errors:
```bash
sudo journalctl -u composesync -f
- Fix configuration syntax errors
- Install missing dependencies
- Correct file permissions
- Check systemd service file
### Configuration Issues
**Symptoms:**
- Service starts but doesn't process stacks
- Stacks are skipped with errors
- Configuration not loaded properly
**Diagnosis:**
```bash
# Check configuration file
sudo cat /opt/composesync/config.toml
# Check for configuration errors
sudo journalctl -u composesync | grep "Configuration"
# Test configuration loading
sudo systemctl restart composesync
sudo journalctl -u composesync -n 20
```
**Common Issues:**
1. **TOML Syntax Error**
```toml
# Valid TOML for comparison: bracketed section names, quoted strings
[global]
UPDATE_INTERVAL_SECONDS = 3600
KEEP_VERSIONS = 10
[immich]
URL = "https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml"
PATH = "/opt/composesync/stacks/immich"
TOOL = "wget"
```
2. Verify your `.env` file syntax:
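**Fix:** Ensure proper TOML syntax: section names in square brackets, string values in double quotes, and one `KEY = value` pair per line. TOML ignores indentation, so misaligned lines are not the cause.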
2. **Missing Required Fields**
```toml
# Missing required fields
[immich]
URL = "https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml"
# Missing PATH and TOOL
```
**Fix:** Add all required fields (URL, PATH, TOOL).
3. **Invalid Paths**
```toml
[immich]
URL = "https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml"
PATH = "/invalid/path" # Path doesn't exist
TOOL = "wget"
```
**Fix:** Create the directory or use a valid path.
**Solutions:**
- Validate TOML syntax
- Ensure all required fields are present
- Create missing directories
- Check file permissions
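
The "Missing Required Fields" case can be caught before restarting the service. The sketch below is an assumption-laden illustration rather than ComposeSync's own parser: `check_required_fields` is a hypothetical helper, and it only understands the flat layout shown in the examples above (`[section]` headers followed by `KEY = value` lines, with `[global]` exempt from the URL/PATH/TOOL requirement).

```bash
# Hypothetical helper (not part of ComposeSync): report stack sections that
# lack URL, PATH, or TOOL. Exits non-zero if anything is missing.
check_required_fields() {
    awk '
        # Print the required keys that the current section never defined.
        function report(    i, n, keys) {
            if (section == "" || section == "global") return
            n = split("URL PATH TOOL", keys, " ")
            for (i = 1; i <= n; i++)
                if (!((section, keys[i]) in seen)) {
                    printf "%s: missing %s\n", section, keys[i]
                    missing = 1
                }
        }
        /^[ \t]*#/ { next }                       # skip comment lines
        /^[ \t]*\[[^]]+\]/ {                      # new [section] header
            report()
            section = $0
            gsub(/^[ \t]*\[|\].*$/, "", section)
            next
        }
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ {  # KEY = value line
            key = $0
            sub(/[ \t]*=.*/, "", key)
            sub(/^[ \t]+/, "", key)
            seen[section, key] = 1
        }
        END { report(); exit (missing + 0) }
    ' "$1"
}

# Example: check_required_fields /opt/composesync/config.toml
```

This checks only for the presence of keys. It does not validate TOML in general or confirm that a `PATH` value exists on disk.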
### Download Failures
**Symptoms:**
- Stacks are skipped with download errors
- Network connectivity issues
- Invalid URLs
**Diagnosis:**
```bash
# Check network connectivity
ping -c 3 github.com
# Test URL manually
wget -O /tmp/test.yml https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
# Check for download errors
sudo journalctl -u composesync | grep "Failed to download"
```
**Common Issues:**
1. **Network Connectivity**
```bash
# Test basic connectivity
ping -c 3 8.8.8.8
# Test DNS resolution
nslookup github.com
# Check proxy settings
echo $http_proxy
echo $https_proxy
```
2. **Invalid URLs**
```toml
# Well-formed URL that does not serve a compose file
[immich]
URL = "https://invalid-url.com/docker-compose.yml"
PATH = "/opt/composesync/stacks/immich"
TOOL = "wget"
```
3. **Authentication Required**
```toml
# Private repository without authentication
[private-app]
URL = "https://github.com/user/private-repo.git"
PATH = "/opt/composesync/stacks/private"
TOOL = "git"
```
**Solutions:**
- Check network connectivity
- Verify URLs are correct and accessible
- Configure authentication for private repositories
- Check firewall settings
### Docker Compose Failures
**Symptoms:**
- Updates fail with Docker Compose errors
- Services don't start after updates
- Rollback occurs frequently
**Diagnosis:**
```bash
# Check Docker Compose syntax
cd /opt/composesync/stacks/immich
docker compose config
# Check service status
docker compose ps
# View service logs
docker compose logs
```
**Common Issues:**
1. **Invalid Compose File**
```bash
# Test compose file syntax
docker compose -f /opt/composesync/stacks/immich/docker-compose.yml config
# Check for syntax errors
source /opt/composesync/.env
docker compose -f /opt/composesync/stacks/immich/docker-compose.yml config 2>&1 | grep "ERROR"
```
3. Test with dry-run mode:
```env
DRY_RUN=true
```
## Download Issues
### Failed Downloads
**Problem:** ComposeSync fails to download compose files.
**Solutions:**
1. Check network connectivity:
2. **Port Conflicts**
```bash
# Test if the URL is accessible
wget -q --spider https://your-url.com/docker-compose.yml
echo $?
```
2. Verify URLs in your configuration:
```bash
# Check your .env file
grep STACK_.*_URL /opt/composesync/.env
```
3. Check for authentication requirements:
- Some URLs may require authentication
- Consider using Git repositories instead
### Git Repository Issues
**Problem:** Git operations fail.
**Solutions:**
1. Verify repository access:
```bash
# Test git clone manually
git clone --quiet https://github.com/user/repo.git /tmp/test
```
2. Check Git subpath configuration:
```bash
# Ensure the subpath exists in the repository
git ls-tree -r --name-only HEAD | grep docker-compose.yml
```
3. Verify branch/tag exists:
```bash
# List available branches/tags
git ls-remote --heads https://github.com/user/repo.git
git ls-remote --tags https://github.com/user/repo.git
```
## Docker Compose Issues
### Update Failures
**Problem:** Docker Compose updates fail and trigger rollback.
**Solutions:**
1. Check Docker Compose syntax:
```bash
# Validate compose file manually
docker compose -f /path/to/docker-compose.yml config
```
2. Check for port conflicts:
```bash
# Check what's using the ports
# Check for port conflicts
netstat -tulpn | grep :80
netstat -tulpn | grep :443
# Check which services are using ports
sudo lsof -i :80
sudo lsof -i :443
```
3. Verify override file syntax:
3. **Resource Issues**
```bash
# Test with override file
docker compose -f docker-compose.yml -f docker-compose.override.yml config
# Check available disk space
df -h
# Check available memory
free -h
# Check Docker disk usage
docker system df
```
### Rollback Failures
**Problem:** Both the update and rollback fail.
**Solutions:**
1. Check backup files:
- Fix compose file syntax errors
- Resolve port conflicts
- Free up disk space and memory
- Check Docker daemon status
### Permission Issues
**Symptoms:**
- Cannot write to stack directories
- Cannot access Docker socket
- Permission denied errors
**Diagnosis:**
```bash
# Check file permissions
ls -la /opt/composesync/
# Check user groups
groups $USER
# Check Docker socket permissions
ls -la /var/run/docker.sock
# Test Docker access
docker ps
```
**Common Issues:**
1. **User Not in Docker Group**
```bash
# Verify backups exist
ls -la /opt/composesync/stacks/*/backups/
# Check if user is in docker group
groups $USER | grep docker
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in, or run:
newgrp docker
```
2. Manual rollback:
2. **Incorrect File Ownership**
```bash
# Manually restore from backup
cp /opt/composesync/stacks/stackname/backups/backup-*/docker-compose.yml /opt/composesync/stacks/stackname/
```
3. Check Docker daemon:
```bash
# Ensure Docker is running
sudo systemctl status docker
```
## Configuration Issues
### Missing Environment Variables
**Problem:** Required configuration is missing.
**Solutions:**
1. Check your `.env` file:
```bash
# Verify all required variables are set
grep -E "STACK_.*_(NAME|URL|PATH|TOOL)" /opt/composesync/.env
```
2. Check variable syntax:
```bash
# Look for syntax errors
cat -n /opt/composesync/.env
```
### Invalid Paths
**Problem:** Stack paths don't exist or are inaccessible.
**Solutions:**
1. Create missing directories:
```bash
# Create stack directories
sudo mkdir -p /opt/composesync/stacks/stackname
sudo chown YOUR_USERNAME:docker /opt/composesync/stacks/stackname
```
2. Check permissions:
```bash
# Verify directory permissions
# Check ownership
ls -la /opt/composesync/stacks/
# Fix ownership
sudo chown -R $USER:docker /opt/composesync/stacks/
```
## Webhook Issues
### Webhook Notifications Not Sent
**Problem:** Webhook notifications aren't being sent.
3. **Docker Socket Permissions**
```bash
# Check socket permissions
ls -la /var/run/docker.sock
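# Fix socket permissions (last resort: mode 666 lets every local user control Docker, which is root-equivalent; prefer docker group membership)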
sudo chmod 666 /var/run/docker.sock
```
**Solutions:**
1. Check webhook URL:
- Add user to docker group
- Fix file ownership and permissions
- Restart Docker daemon if needed
- Check systemd service user configuration
### Lock File Issues
**Symptoms:**
- Service appears stuck
- Updates not running
- Lock file errors
**Diagnosis:**
```bash
# Check for lock files
ls -la /opt/composesync/.lock
# Check lock file age
stat /opt/composesync/.lock
# Check for stale locks
find /opt/composesync -name "*.lock" -mmin +5
```
**Solutions:**
```bash
# Remove stale lock file (first confirm that no update run is still in progress)
sudo rm /opt/composesync/.lock
# Restart service
sudo systemctl restart composesync
# Check if service starts properly
sudo systemctl status composesync
```
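
Before deleting a lock by hand, it helps to confirm that it really is stale. The following is a sketch of a cross-platform age check, not ComposeSync's actual implementation: the function names are hypothetical, the 300-second default mirrors the `-mmin +5` search above, and the `stat` fallback covers the GNU (`-c %Y`) and BSD/macOS (`-f %m`) variants.

```bash
# Hypothetical helpers (not part of ComposeSync).
# Print the age of a lock file in seconds; fail if it cannot be read.
lock_age_seconds() {
    local lock mtime
    lock=$1
    [ -e "$lock" ] || return 1
    # GNU stat first, then the BSD/macOS form.
    mtime=$(stat -c %Y "$lock" 2>/dev/null) ||
        mtime=$(stat -f %m "$lock" 2>/dev/null) ||
        return 1
    echo $(( $(date +%s) - mtime ))
}

# Return 0 if the lock is older than the threshold (default 300 s),
# 1 if it is fresh, 2 if it does not exist or cannot be read.
is_lock_stale() {
    local age
    age=$(lock_age_seconds "$1") || return 2
    [ "$age" -gt "${2:-300}" ]
}

# Example (assumes the lock path used above):
#   if is_lock_stale /opt/composesync/.lock 300 &&
#      ! pgrep -f update-agent.sh >/dev/null; then
#       sudo rm /opt/composesync/.lock
#   fi
```

The `pgrep` guard in the example reflects the "check for processes" advice: an old lock is only safe to remove when no update run still holds it.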
### Backup Issues
**Symptoms:**
- No backup files created
- Backup cleanup not working
- Disk space issues
**Diagnosis:**
```bash
# Check backup files
find /opt/composesync/stacks -name "*.bak"
# Check backup retention settings
grep "KEEP_VERSIONS" /opt/composesync/config.toml
# Check disk usage
df -h /opt/composesync/
```
**Common Issues:**
1. **No Backups Created**
```bash
# Verify URL is set
grep NOTIFICATION_WEBHOOK_URL /opt/composesync/.env
# Check if backups are being created
ls -la /opt/composesync/stacks/immich/compose-*.bak
# Check backup creation logs
sudo journalctl -u composesync | grep "backup"
```
2. Test webhook manually:
2. **Backup Cleanup Not Working**
```bash
# Test webhook endpoint
# Check backup retention
grep "KEEP_VERSIONS" /opt/composesync/config.toml
# Count backup files
find /opt/composesync/stacks -name "*.bak" | wc -l
# Manual cleanup
find /opt/composesync/stacks -name "*.bak" -mtime +30 -delete
```
**Solutions:**
- Check backup creation logs
- Verify backup retention settings
- Manually clean old backups
- Check disk space availability
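
The manual cleanup above deletes by age, whereas `KEEP_VERSIONS` is a count. A count-based cleanup could look like the sketch below. It assumes the `compose-<timestamp>.yml.bak` naming seen elsewhere in this guide, so that sorting by name also sorts by time; `prune_backups` is a hypothetical helper, not part of ComposeSync.

```bash
# Hypothetical helper (not part of ComposeSync): keep only the newest
# $2 backups in directory $1, assuming timestamped names sort chronologically.
prune_backups() {
    local dir keep
    dir=$1
    keep=$2
    # The glob expands in sorted order, so the oldest backup comes first.
    set -- "$dir"/compose-*.bak
    [ -e "$1" ] || return 0          # nothing matched the pattern
    while [ "$#" -gt "$keep" ]; do
        rm -- "$1"
        shift
    done
}

# Example: prune_backups /opt/composesync/stacks/immich 10
```

Running it against a copy of the stack directory first is a cheap way to confirm that the pattern matches what you expect.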
### Webhook Issues
**Symptoms:**
- Webhook notifications not sent
- Webhook delivery failures
- Missing notifications
**Diagnosis:**
```bash
# Check webhook configuration
grep "NOTIFICATION_WEBHOOK_URL" /opt/composesync/config.toml
# Test webhook manually
curl -X POST -H "Content-Type: application/json" \
-d '{"event": "test", "message": "Test notification"}' \
https://your-webhook-url.com/endpoint
# Check webhook logs
sudo journalctl -u composesync | grep "webhook"
```
**Common Issues:**
1. **Invalid Webhook URL**
```toml
# Webhook URL that points at the wrong or an unreachable endpoint
[global]
NOTIFICATION_WEBHOOK_URL = "https://invalid-webhook.com/endpoint"
```
2. **Network Issues**
```bash
# Test webhook connectivity
curl -I https://your-webhook-url.com/endpoint
# Check DNS resolution
nslookup your-webhook-url.com
```
3. **Authentication Issues**
```bash
# Test webhook with authentication
curl -X POST -H "Content-Type: application/json" \
-d '{"test": "message"}' \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{"event": "test"}' \
https://your-webhook-url.com/endpoint
```
3. Check network connectivity:
```bash
# Test if webhook URL is accessible
wget -q --spider https://your-webhook-url.com/endpoint
```
## Performance Issues
### High Resource Usage
**Problem:** ComposeSync uses too much CPU or memory.
**Solutions:**
1. Increase update intervals:
```env
UPDATE_INTERVAL_SECONDS=7200 # Check every 2 hours instead of 1
```
- Verify webhook URL is correct
- Check network connectivity
- Configure authentication if required
- Test webhook endpoint manually
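
When testing a webhook by hand, the HTTP status code usually points at which of the causes above applies. The mapping below is a rough sketch with hypothetical names. It assumes the status is captured with curl's `-w '%{http_code}'`, which prints `000` when no HTTP response was received at all.

```bash
# Hypothetical helper (not part of ComposeSync): turn an HTTP status code
# from a manual webhook test into a troubleshooting hint.
webhook_hint() {
    case "$1" in
        2??)     echo "delivered" ;;
        000)     echo "no HTTP response: check DNS, network, and TLS" ;;
        401|403) echo "rejected: check the Authorization header or token" ;;
        404)     echo "not found: check NOTIFICATION_WEBHOOK_URL" ;;
        5??)     echo "endpoint error: check the receiving service" ;;
        *)       echo "unexpected status $1" ;;
    esac
}

# Example (placeholder URL, as in the commands above):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#       -H "Content-Type: application/json" -d '{"event": "test"}' \
#       https://your-webhook-url.com/endpoint)
#   webhook_hint "$code"
```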
2. Reduce version history:
```env
KEEP_VERSIONS=5 # Keep fewer versions
```
## Advanced Troubleshooting
3. Use dry-run mode for testing:
```env
DRY_RUN=true
```
### Debug Mode
### Slow Downloads
**Problem:** Downloads are taking too long.
**Solutions:**
1. Check network connectivity:
```bash
# Test download speed
wget -O /dev/null https://your-url.com/docker-compose.yml
```
2. Consider using Git instead of wget:
```env
STACK_1_TOOL=git
```
## Lock File Issues
### Stale Lock Files
**Problem:** Lock files prevent updates.
**Solutions:**
1. Check for stale locks:
```bash
# Look for lock files
find /opt/composesync/stacks/ -name ".lock" -type d
```
2. Remove stale locks manually:
```bash
# Remove lock file (be careful!)
rm -rf /opt/composesync/stacks/stackname/.lock
```
3. Restart the service:
```bash
sudo systemctl restart composesync
```
## Debugging Tips
### Enable Verbose Logging
For detailed debugging, you can temporarily modify the log function:
Enable debug logging for detailed troubleshooting:
```bash
# Edit the update-agent.sh file
sudo nano /opt/composesync/update-agent.sh
# Edit service file to enable debug
sudo systemctl edit composesync
# Add more verbose logging
log() {
local prefix=""
if [ "$DRY_RUN" = "true" ]; then
prefix="[DRY-RUN] "
fi
echo "[$(date '+%Y-%m-%d %H:%M:%S')] ${prefix}$1" | tee -a /tmp/composesync-debug.log
}
# Add debug environment variable (paste these lines into the override file that opens)
[Service]
Environment=DEBUG=true
```
### Test Individual Components
### Manual Testing
1. **Test download function:**
```bash
# Test wget download
wget -q -O /tmp/test.yml https://your-url.com/docker-compose.yml
```
Test components manually:
2. **Test Docker Compose:**
```bash
# Test compose file manually
docker compose -f /path/to/docker-compose.yml config
```
```bash
# Test configuration parser
sudo -u composesync /opt/composesync/config-parser.sh
3. **Test webhook:**
```bash
# Test webhook manually
curl -X POST -H "Content-Type: application/json" \
-d '{"event": "test"}' \
$NOTIFICATION_WEBHOOK_URL
```
# Test update script manually
sudo -u composesync /opt/composesync/update-agent.sh
# Test Docker Compose commands
cd /opt/composesync/stacks/immich
docker compose config
docker compose ps
```
### System Information
Gather system information for troubleshooting:
```bash
# System information
uname -a
lsb_release -a
# Docker information
docker version
docker info
# Disk usage
df -h
du -sh /opt/composesync/
# Memory usage
free -h
# Network connectivity
ping -c 3 github.com
curl -I https://github.com
```
## Recovery Procedures
### Manual Rollback
If automatic rollback fails:
```bash
# List available backups
ls -la /opt/composesync/stacks/immich/compose-*.bak
# Restore from backup
sudo cp /opt/composesync/stacks/immich/compose-20240115102001.yml.bak \
/opt/composesync/stacks/immich/docker-compose.yml
# Apply rollback
cd /opt/composesync/stacks/immich
docker compose up -d
```
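
The restore command above hard-codes one timestamp. To pick the most recent backup instead, something like the following could be used. It is only a sketch: `latest_backup` is a hypothetical helper, and it assumes that backup names embed a sortable timestamp as in `compose-20240115102001.yml.bak`.

```bash
# Hypothetical helper (not part of ComposeSync): print the newest backup in
# directory $1, assuming timestamped names sort chronologically.
latest_backup() {
    local dir
    dir=$1
    set -- "$dir"/compose-*.bak
    [ -e "$1" ] || return 1          # no backups found
    shift $(( $# - 1 ))              # drop everything except the last match
    printf '%s\n' "$1"
}

# Example:
#   backup=$(latest_backup /opt/composesync/stacks/immich) &&
#       sudo cp "$backup" /opt/composesync/stacks/immich/docker-compose.yml
```

Note that the newest backup is not always the right one. If the failed update was itself backed up, an earlier file may be the last known-good version.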
### Service Recovery
If the service is completely broken:
```bash
# Stop service
sudo systemctl stop composesync
# Backup configuration
sudo cp /opt/composesync/config.toml /opt/composesync/config.toml.backup
# Reinstall service
sudo ./install.sh
# Restore configuration
sudo cp /opt/composesync/config.toml.backup /opt/composesync/config.toml
# Start service
sudo systemctl start composesync
```
### Complete Reset
As a last resort:
```bash
# Stop service
sudo systemctl stop composesync
sudo systemctl disable composesync
# Backup important data
sudo cp -r /opt/composesync/stacks /tmp/composesync-backup
# Remove installation
sudo rm -rf /opt/composesync
sudo rm /etc/systemd/system/composesync.service
# Reinstall
sudo ./install.sh
# Restore stacks
sudo cp -r /tmp/composesync-backup/* /opt/composesync/stacks/
# Start service
sudo systemctl start composesync
```
## Getting Help
If you're still experiencing issues:
### Log Collection
1. **Check the logs:**
```bash
sudo journalctl -u composesync -f
```
Collect logs for troubleshooting:
2. **Enable dry-run mode** to test without making changes:
```env
DRY_RUN=true
```
```bash
# Create log archive
sudo journalctl -u composesync > /tmp/composesync-logs.txt
sudo cat /opt/composesync/config.toml > /tmp/composesync-config.txt
sudo systemctl status composesync > /tmp/composesync-status.txt
3. **Verify your configuration** step by step
# Archive logs
tar -czf composesync-debug.tar.gz /tmp/composesync-*.txt
```
4. **Check the documentation** for your specific use case
### Information to Include
5. **Submit an issue** with:
- Your configuration (with sensitive data removed)
- Relevant log output
- Steps to reproduce the issue
- Expected vs actual behavior
When seeking help, include:
1. **System Information:**
- OS version and distribution
- Docker version
- ComposeSync version
2. **Configuration:**
- Relevant parts of config.toml (remove sensitive data)
- Service status output
3. **Logs:**
- Recent error logs
- Service status logs
- Configuration loading logs
4. **Steps to Reproduce:**
- What you were trying to do
- What happened
- What you expected to happen
### Common Solutions Summary
| Issue | Quick Fix | Detailed Fix |
|-------|-----------|--------------|
| Service won't start | Check config syntax | Validate TOML, check permissions |
| Download failures | Test URL manually | Check network, verify URLs |
| Docker failures | Check compose syntax | Fix compose file, resolve conflicts |
| Permission issues | Add user to docker group | Fix ownership, check socket permissions |
| Lock file stuck | Remove .lock file | Restart service, check for processes |
| No backups | Check retention settings | Verify backup creation, check disk space |
| Webhook failures | Test URL manually | Check network, verify authentication |