# Session Management Documentation

This document describes the session management system implemented in Talk2Me to prevent resource leaks from abandoned sessions.

## Overview

Talk2Me implements a comprehensive session management system that tracks user sessions and associated resources (audio files, temporary files, streams) to ensure proper cleanup and prevent resource exhaustion.

## Features

### 1. Automatic Resource Tracking

All resources created during a user session are automatically tracked:

- Audio files (uploads and generated)
- Temporary files
- Active streams
- Resource metadata (size, creation time, purpose)

### 2. Resource Limits

Per-session limits prevent resource exhaustion:

- Maximum resources per session: 100
- Maximum storage per session: 100MB
- Automatic cleanup of oldest resources when limits are reached

### 3. Session Lifecycle Management

Sessions are automatically managed:

- Created on first request
- Updated on each request
- Cleaned up when idle (15 minutes)
- Removed when expired (1 hour)

### 4. Automatic Cleanup

Background cleanup processes run automatically:

- Idle session cleanup (every minute)
- Expired session cleanup (every minute)
- Orphaned file cleanup (every minute)

## Configuration

Session management can be configured via environment variables or Flask config:

```python
# app.py or config.py
app.config.update({
    'MAX_SESSION_DURATION': 3600,        # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,        # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600,  # 100MB
    'SESSION_CLEANUP_INTERVAL': 60,      # 1 minute
    'SESSION_STORAGE_PATH': '/path/to/sessions'
})
```

## API Endpoints

### Admin Endpoints

All admin endpoints require authentication via `X-Admin-Token` header.

#### GET /admin/sessions

Get information about all active sessions.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions
```

Response:

```json
{
  "sessions": [
    {
      "session_id": "uuid",
      "user_id": null,
      "ip_address": "192.168.1.1",
      "created_at": "2024-01-15T10:00:00",
      "last_activity": "2024-01-15T10:05:00",
      "duration_seconds": 300,
      "idle_seconds": 0,
      "request_count": 5,
      "resource_count": 3,
      "total_bytes_used": 1048576,
      "resources": [...]
    }
  ],
  "stats": {
    "total_sessions_created": 100,
    "total_sessions_cleaned": 50,
    "active_sessions": 5,
    "avg_session_duration": 600,
    "avg_resources_per_session": 4.2
  }
}
```

#### GET /admin/sessions/{session_id}

Get detailed information about a specific session.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/abc123
```

#### POST /admin/sessions/{session_id}/cleanup

Manually clean up a specific session.

```bash
curl -X POST -H "X-Admin-Token: your-token" \
  http://localhost:5005/admin/sessions/abc123/cleanup
```

#### GET /admin/sessions/metrics

Get session management metrics for monitoring.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/metrics
```

Response:

```json
{
  "sessions": {
    "active": 5,
    "total_created": 100,
    "total_cleaned": 95
  },
  "resources": {
    "active": 20,
    "total_cleaned": 380,
    "active_bytes": 10485760,
    "total_bytes_cleaned": 1073741824
  },
  "limits": {
    "max_session_duration": 3600,
    "max_idle_time": 900,
    "max_resources_per_session": 100,
    "max_bytes_per_session": 104857600
  }
}
```

## CLI Commands

Session management can be controlled via Flask CLI commands:

```bash
# List all active sessions
flask sessions-list

# Manual cleanup
flask sessions-cleanup

# Show statistics
flask sessions-stats
```

## Usage Examples

### 1. Monitor Active Sessions

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()

for session in sessions['sessions']:
    print(f"Session {session['session_id']}:")
    print(f"  IP: {session['ip_address']}")
    print(f"  Resources: {session['resource_count']}")
    print(f"  Storage: {session['total_bytes_used'] / 1024 / 1024:.2f} MB")
```

### 2. Clean Up Idle Sessions

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}

# Get all sessions
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()['sessions']

# Find idle sessions
idle_threshold = 300  # 5 minutes

for session in sessions:
    if session['idle_seconds'] > idle_threshold:
        # Clean up the idle session
        cleanup_url = f'http://localhost:5005/admin/sessions/{session["session_id"]}/cleanup'
        requests.post(cleanup_url, headers=headers)
        print(f"Cleaned up idle session {session['session_id']}")
```

### 3. Monitor Resource Usage

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}

# Get metrics
response = requests.get('http://localhost:5005/admin/sessions/metrics', headers=headers)
metrics = response.json()

print(f"Active sessions: {metrics['sessions']['active']}")
print(f"Active resources: {metrics['resources']['active']}")
print(f"Storage used: {metrics['resources']['active_bytes'] / 1024 / 1024:.2f} MB")
print(f"Total cleaned: {metrics['resources']['total_bytes_cleaned'] / 1024 / 1024 / 1024:.2f} GB")
```

## Resource Types

The session manager tracks different types of resources:

### 1. Audio Files

- Uploaded audio files for transcription
- Generated audio files from TTS
- Automatically cleaned up after session ends

### 2. Temporary Files

- Processing intermediates
- Cache files
- Automatically cleaned up after use

### 3. Streams

- WebSocket connections
- Server-sent event streams
- Closed when session ends

## Best Practices

### 1. Session Configuration

```python
# Development
app.config.update({
    'MAX_SESSION_DURATION': 7200,       # 2 hours
    'MAX_SESSION_IDLE_TIME': 1800,      # 30 minutes
    'MAX_RESOURCES_PER_SESSION': 200,
    'MAX_BYTES_PER_SESSION': 209715200  # 200MB
})

# Production
app.config.update({
    'MAX_SESSION_DURATION': 3600,       # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,       # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600  # 100MB
})
```

### 2. Monitoring

Set up monitoring for:

- Number of active sessions
- Resource usage per session
- Cleanup frequency
- Failed cleanup attempts

### 3. Alerting

Configure alerts for:

- High number of active sessions (>1000)
- High resource usage (>80% of limits)
- Failed cleanup operations
- Orphaned files detected

## Troubleshooting

### Common Issues

#### 1. Sessions Not Being Cleaned Up

Check cleanup thread status:

```bash
flask sessions-stats
```

Manual cleanup:

```bash
flask sessions-cleanup
```

#### 2. Resource Limits Reached

Check session details:

```bash
curl -H "X-Admin-Token: token" http://localhost:5005/admin/sessions/SESSION_ID
```

Increase limits if needed:

```python
app.config['MAX_RESOURCES_PER_SESSION'] = 200
app.config['MAX_BYTES_PER_SESSION'] = 209715200  # 200MB
```

#### 3. Orphaned Files

Check for orphaned files:

```bash
ls -la /path/to/session/storage/
```

Clean orphaned files:

```bash
flask sessions-cleanup
```

### Debug Logging

Enable debug logging for session management:

```python
import logging

# Enable session manager debug logs
logging.getLogger('session_manager').setLevel(logging.DEBUG)
```

## Security Considerations

1. **Session Hijacking**: Sessions are tied to IP addresses and user agents
2. **Resource Exhaustion**: Strict per-session limits prevent DoS attacks
3. **File System Access**: Session storage uses secure paths and permissions
4. **Admin Access**: All admin endpoints require authentication

## Performance Impact

The session management system has minimal performance impact:

- Memory: ~1KB per session + resource metadata
- CPU: Background cleanup runs every minute
- Disk I/O: Cleanup operations are batched
- Network: No external dependencies

## Integration with Other Systems

### Rate Limiting

Session management integrates with rate limiting:

```python
# Sessions are automatically tracked per IP
# Rate limits apply per session
```

### Secrets Management

Session tokens can be encrypted:

```python
from secrets_manager import encrypt_value

encrypted_session = encrypt_value(session_id)
```

### Monitoring

Export metrics to monitoring systems:

```python
# Prometheus format
@app.route('/metrics')
def prometheus_metrics():
    metrics = app.session_manager.export_metrics()
    # Format as Prometheus metrics
    return format_prometheus(metrics)
```

## Future Enhancements

1. **Session Persistence**: Store sessions in Redis/database
2. **Distributed Sessions**: Support for multi-server deployments
3. **Session Analytics**: Track usage patterns and trends
4. **Resource Quotas**: Per-user resource quotas
5. **Session Replay**: Debug issues by replaying sessions
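
## Appendix: Sketch of a Prometheus Formatter

The Prometheus example under "Integration with Other Systems" calls `format_prometheus`, which this document does not define. The following is a minimal sketch of what such a helper might look like, not the project's actual implementation. It assumes that `export_metrics()` returns a nested dict of numbers shaped like the `/admin/sessions/metrics` response; the `talk2me` metric prefix and the `_flatten` helper are hypothetical names chosen for illustration. For simplicity every value is exposed as a gauge, although cumulative totals such as `total_created` would more properly be counters.

```python
from typing import Any, Dict, Iterator, Tuple, Union

Number = Union[int, float]


def _flatten(metrics: Dict[str, Any], prefix: str) -> Iterator[Tuple[str, Number]]:
    """Yield (metric_name, value) pairs from a nested dict, joining keys with '_'."""
    for key, value in metrics.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            yield from _flatten(value, name)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield name, value
        # Non-numeric leaves (strings, lists, None, booleans) are skipped.


def format_prometheus(metrics: Dict[str, Any], prefix: str = "talk2me") -> str:
    """Render a nested metrics dict in the Prometheus text exposition format.

    Assumption: every numeric leaf is reported as a gauge.
    """
    lines = []
    for name, value in _flatten(metrics, prefix):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {
        "sessions": {"active": 5, "total_created": 100},
        "resources": {"active_bytes": 10485760},
    }
    print(format_prometheus(sample), end="")
```

When returning this string from a Flask view, it is probably worth wrapping it in a `flask.Response` with `mimetype="text/plain"`, since a bare string is served as HTML by default.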