🔧 Common Issues & Solutions
Common problems and their solutions for Keiko Personal Assistant.
🚀 Startup Problems
Application does not start
Problem: The Keiko service fails to start or crashes immediately.
Symptoms:
systemctl status keiko-api
● keiko-api.service - Keiko Personal Assistant API
Loaded: loaded (/etc/systemd/system/keiko-api.service; enabled)
Active: failed (Result: exit-code)
Diagnosis:
# Check service logs
journalctl -u keiko-api -f
# Check application logs
tail -f /var/log/keiko/app.log
# Validate configuration
./scripts/validate-config.sh
Common causes & solutions:
Missing dependencies
Database connection errors
Port already in use
Permission problems
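To check the "port already in use" cause from the list above, a small probe can be run before restarting the service. This is a minimal sketch: port 8000 is taken from the curl examples in this guide, and the helper name `is_port_in_use` is illustrative, not part of Keiko.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8000 is the port used by the curl examples in this guide.
    print("port 8000 in use:", is_port_in_use(8000))
```

If the probe reports the port as taken while the service is stopped, another process is holding it and has to be stopped or the service moved to a different port.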
Slow startup
Problem: The application starts very slowly (>60 seconds).
Diagnosis:
# Measure startup time
time systemctl start keiko-api
# Enable startup profiling
export KEIKO_PROFILE_STARTUP=true
Solutions:
Optimize the database connection pool
Enable lazy loading
Increase the health-check timeout
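Lazy loading, as named in the list above, means deferring expensive initialisation until the component is first used, so it no longer counts toward startup time. The sketch below shows the general pattern only; `Lazy` and `load_model` are hypothetical names, and how Keiko itself exposes lazy loading is not covered by this guide.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Defer an expensive initialisation until the value is first needed."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Optional[T] = None
        self._ready = False
        self._lock = threading.Lock()

    def get(self) -> T:
        # Double-checked locking so concurrent callers build the value once.
        if not self._ready:
            with self._lock:
                if not self._ready:
                    self._value = self._factory()
                    self._ready = True
        return self._value  # type: ignore[return-value]


def load_model() -> dict:
    # Hypothetical stand-in for a slow component (model, client, large cache).
    return {"loaded": True}


model = Lazy(load_model)   # cheap: nothing is loaded at startup
# model.get()              # first call pays the cost, later calls reuse it
```

The trade-off is that the first request touching the component becomes slower, so this suits components that are not needed by the health check.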
🗄️ Database Problems
Connection pool exhaustion
Problem: "QueuePool limit of size X overflow Y reached"
Symptoms:
Diagnosis:
# Check active connections
psql -d keiko_db -c "SELECT count(*) FROM pg_stat_activity WHERE datname='keiko_db';"
# Connection pool status
curl http://localhost:8000/debug/pool-status
Solutions:
Increase the pool size
Find connection leaks
# debug/connection_tracker.py
import logging

from sqlalchemy import event
from sqlalchemy.engine import Engine


@event.listens_for(Engine, "connect")
def log_connection_opened(dbapi_connection, connection_record):
    # Fires when the pool opens a new DBAPI connection.
    logging.info(f"New connection: {id(dbapi_connection)}")


@event.listens_for(Engine, "close")
def log_connection_closed(dbapi_connection, connection_record):
    # Fires when the pool closes a DBAPI connection.
    logging.info(f"Closed connection: {id(dbapi_connection)}")
Configure connection recycling
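When raising the pool size, the pools of all worker processes together must still fit under PostgreSQL's `max_connections`. The sketch below derives per-worker settings from that budget. The keyword names match SQLAlchemy's `create_engine` parameters. The helper `pool_kwargs`, the even split between `pool_size` and `max_overflow`, and the timeout and recycle values are assumptions for illustration, not Keiko defaults.

```python
def pool_kwargs(max_connections: int, workers: int, reserved: int = 10) -> dict:
    """Split a PostgreSQL connection budget evenly across worker processes.

    max_connections: the server-side limit (PostgreSQL `max_connections`).
    workers:         number of application processes, each with its own pool.
    reserved:        connections kept free for admin tools and migrations.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    per_worker = (max_connections - reserved) // workers
    if per_worker < 2:
        raise ValueError("connection budget too small for this many workers")
    pool_size = max(1, per_worker // 2)      # connections kept open permanently
    max_overflow = per_worker - pool_size    # extra connections allowed under load
    return {
        "pool_size": pool_size,
        "max_overflow": max_overflow,
        "pool_timeout": 30,      # seconds to wait for a free connection
        "pool_recycle": 1800,    # replace connections older than 30 minutes
        "pool_pre_ping": True,   # test a connection before handing it out
    }


print(pool_kwargs(max_connections=100, workers=4))
# Usage (requires SQLAlchemy; the URL is a placeholder):
#   engine = create_engine("postgresql://user:pass@localhost/keiko_db",
#                          **pool_kwargs(100, 4))
```

`pool_recycle` and `pool_pre_ping` cover the connection recycling item: the first replaces old connections, the second discards connections the server has already dropped.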
Slow queries
Problem: Database queries are slow (>1 second).
Diagnosis:
-- Identify slow queries (mean_exec_time is in milliseconds)
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE mean_exec_time > 1000
ORDER BY mean_exec_time DESC;
-- Check active queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
Solutions:
Add missing indexes
Query optimization
Database tuning
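The effect of a missing index can be seen by comparing the query plan before and after creating it. The sketch below uses SQLite from the Python standard library as a stand-in so that it runs without a server; on PostgreSQL the equivalent check is `EXPLAIN` on the slow statement. The `tasks` table, its columns, and the index name are hypothetical.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return "; ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO tasks (status, payload) VALUES (?, ?)",
    [("failed" if i % 50 == 0 else "done", "x") for i in range(1000)],
)

sql = "SELECT * FROM tasks WHERE status = ?"
before = query_plan(conn, sql, ("failed",))   # plan without an index on status
conn.execute("CREATE INDEX idx_tasks_status ON tasks (status)")
after = query_plan(conn, sql, ("failed",))    # plan once the index exists

print("before:", before)
print("after: ", after)
```

The first plan scans the whole table, while the second names the new index. The same before-and-after comparison applies to the statements reported by `pg_stat_statements`.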
🔄 Redis Problems
Redis connection errors
Problem: "Connection refused" or "Redis server went away"
Diagnosis:
# Check Redis status
redis-cli ping
# Check Redis logs
tail -f /var/log/redis/redis-server.log
# Test the connection
redis-cli -h localhost -p 6379 info
Solutions:
Start the Redis service
Check the Redis configuration
Configure the connection pool
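Short Redis outages, such as a restart, are often easier to absorb on the client side by retrying with exponential backoff than by failing the request. The helper below is a generic sketch using only the standard library. `retry_with_backoff` and its defaults are illustrative, and client libraries may raise their own exception classes, which would need to be passed in through `retry_on`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

A call site would wrap the operation that talks to Redis, for example `retry_with_backoff(lambda: client.ping())`, where `client` is whatever Redis client the application already uses.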
Memory problems
Problem: Redis runs out of memory.
Diagnosis:
# Redis memory usage
redis-cli info memory
# Largest keys per data type (sampled)
redis-cli --bigkeys
# Keyspace hit and miss counters
redis-cli info stats | grep keyspace
Solutions:
Configure the memory policy
Set key expiration
Memory monitoring
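For the memory monitoring item, the output of `redis-cli info memory` can be parsed and compared against `maxmemory`. The sketch below works on the text output and needs no Redis client. `used_memory` and `maxmemory` are fields of the INFO output, while the helper names and the 0.8 warning threshold are assumptions for illustration.

```python
from typing import Optional


def parse_info(text: str) -> dict:
    """Parse the `key:value` lines printed by `redis-cli info`."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and section headers
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_usage_ratio(info_text: str) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is configured."""
    fields = parse_info(info_text)
    used = int(fields["used_memory"])
    limit = int(fields.get("maxmemory", "0"))
    if limit == 0:          # maxmemory 0 means "no limit" in Redis
        return None
    return used / limit


sample = "# Memory\r\nused_memory:900\r\nmaxmemory:1000\r\nmaxmemory_policy:noeviction\r\n"
ratio = memory_usage_ratio(sample)
if ratio is not None and ratio > 0.8:   # 0.8 is an arbitrary example threshold
    print(f"warning: Redis memory at {ratio:.0%} of maxmemory")
```

In practice the `sample` string would be replaced by the real command output, for instance captured with `subprocess.run(["redis-cli", "info", "memory"], capture_output=True, text=True).stdout`.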
🤖 Agent Problems
Agent does not start
Problem: The agent cannot be started or activated.
Diagnosis:
# Check agent status (replace {agent_id} with the agent's ID)
curl http://localhost:8000/api/v1/agents/{agent_id}/status
# Check agent logs
grep "agent_id:{agent_id}" /var/log/keiko/app.log
# Validate agent configuration
./scripts/validate-agent-config.sh {agent_id}
Solutions:
Fix configuration errors
Check dependencies
Check resource limits
Task execution errors
Problem: Tasks fail or hang.
Diagnosis:
# Failed tasks
curl "http://localhost:8000/api/v1/tasks?status=failed"
# Hanging tasks (running for more than one hour; assumes created_at is a Unix timestamp in seconds)
curl "http://localhost:8000/api/v1/tasks?status=running" | jq '.[] | select(.created_at < (now - 3600))'
# Task logs
grep "task_id:{task_id}" /var/log/keiko/app.log
Solutions:
Timeout configuration
Improve error handling
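import asyncio
import logging

logger = logging.getLogger(__name__)


async def execute_task_with_error_handling(task):
    # `agent` and `TaskResult` come from the application.
    try:
        return await agent.execute_task(task)
    except asyncio.TimeoutError:
        # Before Python 3.11 this is not the same class as the built-in TimeoutError.
        logger.error(f"Task {task.id} timed out")
        return TaskResult.failure("Task timed out")
    except Exception as e:
        logger.error(f"Task {task.id} failed: {e}", exc_info=True)
        return TaskResult.failure(str(e))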
Resource monitoring
# Task resource monitoring
import asyncio
import logging

import psutil

logger = logging.getLogger(__name__)


async def monitor_task_resources(task_id):
    # Measures the current process; `task_is_running` comes from the application.
    process = psutil.Process()
    while task_is_running(task_id):
        memory_usage = process.memory_info().rss / 1024 / 1024  # MB
        cpu_usage = process.cpu_percent()  # percent since the previous call
        if memory_usage > 1000:  # roughly 1 GB
            logger.warning(f"Task {task_id} high memory usage: {memory_usage:.0f}MB")
        if cpu_usage > 90:
            logger.warning(f"Task {task_id} high CPU usage: {cpu_usage}%")
        await asyncio.sleep(10)
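For the timeout configuration item in the list above, a hanging task can be bounded with `asyncio.wait_for`, which cancels the awaited coroutine once the limit is exceeded. This is a minimal sketch: `run_with_timeout` and `slow_task` are hypothetical names, and where Keiko reads its task timeout from is not specified in this guide.

```python
import asyncio


async def run_with_timeout(coro, timeout_seconds: float):
    """Await `coro`, cancelling it if it runs longer than `timeout_seconds`.

    Returns (True, result) on success and (False, None) on timeout.
    """
    try:
        result = await asyncio.wait_for(coro, timeout=timeout_seconds)
        return True, result
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point.
        return False, None


async def slow_task():
    # Hypothetical stand-in for a task that hangs.
    await asyncio.sleep(10)
    return "done"


async def main():
    ok, result = await run_with_timeout(slow_task(), timeout_seconds=0.1)
    print("finished" if ok else "timed out", result)


if __name__ == "__main__":
    asyncio.run(main())
```

Because the inner coroutine is cancelled, any cleanup the task needs should live in `try`/`finally` blocks inside the task itself.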
🌐 API Problems
500 Internal Server Error
Problem: API endpoints return 500 errors.
Diagnosis:
# Check error logs
tail -f /var/log/keiko/error.log
# API health check
curl http://localhost:8000/health
# Test a specific endpoint
curl -v http://localhost:8000/api/v1/agents
Solutions:
Check exception handling
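# Catch unhandled exceptions (assumes FastAPI; `app` is the existing application instance)
import logging
import uuid

from fastapi import Request
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)


@app.exception_handler(Exception)
async def general_exception_handler(request: Request, exc: Exception):
    # Log the same request_id that is returned, so a report can be matched to the log.
    request_id = str(uuid.uuid4())
    logger.error(f"Unhandled exception [{request_id}]: {exc}", exc_info=True)
    return JSONResponse(
        status_code=500,
        content={"error": "Internal server error", "request_id": request_id},
    )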
Dependency injection problems
Rate-limiting problems
Problem: "Too Many Requests" (429) errors.
Diagnosis:
# Check rate-limit status
curl -I http://localhost:8000/api/v1/agents
# Check Redis rate-limit keys (--scan iterates incrementally instead of blocking the server like KEYS)
redis-cli --scan --pattern "rate_limit:*"
Solutions:
Adjust rate limits
Client-side retry logic
import asyncio

from aiohttp import ClientSession


async def api_request_with_retry(url, max_retries=3):
    # Reuse one session for all attempts.
    async with ClientSession() as session:
        for attempt in range(max_retries):
            async with session.get(url) as response:
                if response.status == 429:
                    # Retry-After may also be an HTTP date; fall back to 60 s then.
                    try:
                        retry_after = int(response.headers.get("Retry-After", 60))
                    except ValueError:
                        retry_after = 60
                    await asyncio.sleep(retry_after)
                    continue
                return await response.json()
    raise RuntimeError("Max retries exceeded")
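The `Retry-After` header can carry either a number of seconds or an HTTP date, so a client that only calls `int()` on it can fail on valid responses. The helper below is a standard-library sketch that handles both forms. `parse_retry_after` is an illustrative name, and the 60-second default mirrors the fallback used in the retry example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], default: float = 60.0,
                      now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header into a non-negative delay in seconds.

    The header may carry either a number of seconds or an HTTP date.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default          # unparseable header: fall back to the default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

The `now` parameter exists only to make the function testable. Production callers would leave it unset.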
📊 Performance Problems
High response times
Problem: API responses are slow (>2 seconds).
Diagnosis:
# Measure response time
curl -w "@curl-format.txt" -o /dev/null -s http://localhost:8000/api/v1/agents
# Check APM metrics
curl -s http://localhost:8000/metrics | grep http_request_duration
Solutions:
Implement caching
Database query optimization
Async optimization
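For the caching item, a time-limited in-process cache is often the smallest change that removes repeated slow lookups. The decorator below is a sketch under stated assumptions: `ttl_cache` and `list_agents` are hypothetical names, only positional and hashable arguments are supported, and the cache is local to one process, so deployments with several workers would need a shared cache such as Redis instead.

```python
import time
from functools import wraps
from typing import Any, Callable, Dict, Tuple


def ttl_cache(ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
    """Cache a function's results per positional arguments for `ttl_seconds`."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        store: Dict[Tuple[Any, ...], Tuple[float, Any]] = {}

        @wraps(func)
        def wrapper(*args: Any) -> Any:
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]                 # still fresh: skip the slow call
            value = func(*args)
            store[args] = (now, value)        # store with the time it was computed
            return value

        return wrapper

    return decorator


@ttl_cache(ttl_seconds=30)
def list_agents(tenant: str) -> list:
    # Hypothetical stand-in for a slow database query.
    time.sleep(0.05)
    return [f"{tenant}-agent-1", f"{tenant}-agent-2"]
```

The TTL bounds how stale a response can be, so it should be chosen per endpoint rather than globally.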
Debugging tools
Use the built-in debugging tools:
- /debug/health: detailed health information
- /debug/metrics: performance metrics
- /debug/config: current configuration
- /debug/logs: recent log entries
Support channels
For persistent problems:
- GitHub Issues: https://github.com/oscharko/keiko-personal-assistant/issues
- Community forum: https://community.keiko.ai
- Enterprise support: support@keiko.ai