# Session Management Documentation

This document describes the session management system implemented in Talk2Me to prevent resource leaks from abandoned sessions.

## Overview

Talk2Me implements a comprehensive session management system that tracks user sessions and associated resources (audio files, temporary files, streams) to ensure proper cleanup and prevent resource exhaustion.

## Features

### 1. Automatic Resource Tracking

All resources created during a user session are automatically tracked:
- Audio files (uploads and generated)
- Temporary files
- Active streams
- Resource metadata (size, creation time, purpose)

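The tracked metadata can be pictured as one small record per resource. This is a minimal sketch only: `TrackedResource`, `SessionResources`, and their fields are illustrative names assumed for this example, not Talk2Me's actual classes.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TrackedResource:
    """Hypothetical record for one tracked resource (names are assumptions)."""
    resource_id: str
    resource_type: str  # e.g. "audio_file", "temp_file", "stream"
    size_bytes: int = 0
    purpose: str = ""
    created_at: float = field(default_factory=time.time)


@dataclass
class SessionResources:
    """Hypothetical per-session registry keyed by resource id."""
    resources: dict = field(default_factory=dict)

    def add(self, resource: TrackedResource) -> None:
        self.resources[resource.resource_id] = resource

    def total_bytes(self) -> int:
        return sum(r.size_bytes for r in self.resources.values())


registry = SessionResources()
registry.add(TrackedResource("a1", "audio_file", size_bytes=2048, purpose="upload"))
registry.add(TrackedResource("t1", "temp_file", size_bytes=512))
print(len(registry.resources), registry.total_bytes())
```
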
### 2. Resource Limits

Per-session limits prevent resource exhaustion:
- Maximum resources per session: 100
- Maximum storage per session: 100MB
- Automatic cleanup of oldest resources when limits are reached

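Oldest-first cleanup under both limits could look roughly like the sketch below. The `add_with_eviction` function and the use of an `OrderedDict` are assumptions made for illustration; the real session manager may order and delete resources differently.

```python
from collections import OrderedDict

MAX_RESOURCES = 100
MAX_BYTES = 100 * 1024 * 1024  # 100MB


def add_with_eviction(resources, resource_id, size_bytes,
                      max_resources=MAX_RESOURCES, max_bytes=MAX_BYTES):
    """Insert a resource, evicting the oldest entries until both limits hold.

    `resources` is an OrderedDict of resource_id -> size_bytes in insertion
    order, so the first item is always the oldest. Returns the evicted ids.
    The newest resource is never evicted, even if it alone exceeds max_bytes.
    """
    evicted = []
    resources[resource_id] = size_bytes
    while len(resources) > 1 and (
        len(resources) > max_resources or sum(resources.values()) > max_bytes
    ):
        oldest_id, _ = resources.popitem(last=False)
        evicted.append(oldest_id)  # a real implementation would delete the file here
    return evicted


tracked = OrderedDict()
add_with_eviction(tracked, "a", 40, max_resources=3, max_bytes=100)
add_with_eviction(tracked, "b", 40, max_resources=3, max_bytes=100)
evicted = add_with_eviction(tracked, "c", 40, max_resources=3, max_bytes=100)
print(evicted, list(tracked))
```
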
### 3. Session Lifecycle Management

Sessions are automatically managed:
- Created on first request
- Updated on each request
- Cleaned up when idle (15 minutes)
- Removed when expired (1 hour)

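The idle and expiry rules reduce to two timestamp comparisons. The following sketch assumes a hypothetical `session_state` helper and uses the documented thresholds; it is not taken from the Talk2Me source.

```python
import time

MAX_SESSION_DURATION = 3600   # seconds, 1 hour
MAX_SESSION_IDLE_TIME = 900   # seconds, 15 minutes


def session_state(created_at, last_activity, now=None):
    """Classify a session as 'expired', 'idle', or 'active'."""
    now = time.time() if now is None else now
    if now - created_at > MAX_SESSION_DURATION:
        return "expired"
    if now - last_activity > MAX_SESSION_IDLE_TIME:
        return "idle"
    return "active"


now = 10_000.0
active = session_state(created_at=now - 300, last_activity=now - 10, now=now)
idle = session_state(created_at=now - 1200, last_activity=now - 1000, now=now)
expired = session_state(created_at=now - 4000, last_activity=now - 10, now=now)
print(active, idle, expired)
```
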
### 4. Automatic Cleanup

Background cleanup processes run automatically:
- Idle session cleanup (every minute)
- Expired session cleanup (every minute)
- Orphaned file cleanup (every minute)

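A periodic background job of this kind is commonly built on a daemon thread and a `threading.Event`. The `CleanupWorker` class below is a hypothetical sketch of that pattern, with a short interval so the example finishes quickly; it does not reproduce Talk2Me's own cleanup thread.

```python
import threading
import time


class CleanupWorker:
    """Hypothetical worker that calls `cleanup_fn` every `interval` seconds."""

    def __init__(self, cleanup_fn, interval=60.0):
        self._cleanup_fn = cleanup_fn
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called
        while not self._stop.wait(self._interval):
            try:
                self._cleanup_fn()
            except Exception as exc:  # keep the thread alive if one pass fails
                print(f"cleanup failed: {exc}")


runs = []
worker = CleanupWorker(lambda: runs.append(1), interval=0.01)
worker.start()
time.sleep(0.2)
worker.stop()
print(f"cleanup ran {len(runs)} times")
```

Waiting on the event instead of calling `time.sleep` lets `stop()` interrupt the wait immediately rather than after a full interval.
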
## Configuration

Session management can be configured via environment variables or Flask config:

```python
# app.py or config.py
app.config.update({
    'MAX_SESSION_DURATION': 3600,  # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,  # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600,  # 100MB
    'SESSION_CLEANUP_INTERVAL': 60,  # 1 minute
    'SESSION_STORAGE_PATH': '/path/to/sessions'
})
```

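For the environment-variable route, one possible approach is to read each setting with a fallback to the documented default. This sketch assumes the environment variables share the names of the config keys, which this document does not confirm; `load_session_config` is a hypothetical helper.

```python
import os

# Defaults mirror the values documented above.
DEFAULTS = {
    "MAX_SESSION_DURATION": 3600,
    "MAX_SESSION_IDLE_TIME": 900,
    "MAX_RESOURCES_PER_SESSION": 100,
    "MAX_BYTES_PER_SESSION": 104857600,
    "SESSION_CLEANUP_INTERVAL": 60,
}


def load_session_config(environ=None):
    """Return integer settings, letting environment variables override defaults."""
    environ = os.environ if environ is None else environ
    config = {}
    for key, default in DEFAULTS.items():
        raw = environ.get(key)  # assumed: env var name equals the config key
        config[key] = int(raw) if raw not in (None, "") else default
    return config


config = load_session_config({"MAX_SESSION_IDLE_TIME": "1800"})
print(config["MAX_SESSION_IDLE_TIME"], config["MAX_SESSION_DURATION"])
```

The resulting dictionary could then be passed to `app.config.update(...)`.
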
## API Endpoints

### Admin Endpoints

All admin endpoints require authentication via the `X-Admin-Token` header.

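A header check of this kind can be sketched as follows. `is_admin_request` is a hypothetical helper and the way the expected token is stored is an assumption; only the `X-Admin-Token` header name comes from this document.

```python
import hmac


def is_admin_request(headers, expected_token):
    """Return True when the X-Admin-Token header matches the expected token."""
    supplied = headers.get("X-Admin-Token", "")
    if not expected_token or not supplied:
        return False
    # compare_digest avoids leaking the match position through timing
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


ok = is_admin_request({"X-Admin-Token": "s3cret"}, "s3cret")
wrong = is_admin_request({"X-Admin-Token": "wrong"}, "s3cret")
missing = is_admin_request({}, "s3cret")
print(ok, wrong, missing)
```
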
#### GET /admin/sessions
Get information about all active sessions.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions
```

Response:
```json
{
  "sessions": [
    {
      "session_id": "uuid",
      "user_id": null,
      "ip_address": "192.168.1.1",
      "created_at": "2024-01-15T10:00:00",
      "last_activity": "2024-01-15T10:05:00",
      "duration_seconds": 300,
      "idle_seconds": 0,
      "request_count": 5,
      "resource_count": 3,
      "total_bytes_used": 1048576,
      "resources": [...]
    }
  ],
  "stats": {
    "total_sessions_created": 100,
    "total_sessions_cleaned": 50,
    "active_sessions": 5,
    "avg_session_duration": 600,
    "avg_resources_per_session": 4.2
  }
}
```

#### GET /admin/sessions/{session_id}
Get detailed information about a specific session.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/abc123
```

#### POST /admin/sessions/{session_id}/cleanup
Manually clean up a specific session.

```bash
curl -X POST -H "X-Admin-Token: your-token" \
  http://localhost:5005/admin/sessions/abc123/cleanup
```

#### GET /admin/sessions/metrics
Get session management metrics for monitoring.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/metrics
```

Response:
```json
{
  "sessions": {
    "active": 5,
    "total_created": 100,
    "total_cleaned": 95
  },
  "resources": {
    "active": 20,
    "total_cleaned": 380,
    "active_bytes": 10485760,
    "total_bytes_cleaned": 1073741824
  },
  "limits": {
    "max_session_duration": 3600,
    "max_idle_time": 900,
    "max_resources_per_session": 100,
    "max_bytes_per_session": 104857600
  }
}
```

## CLI Commands

Session management can be controlled via Flask CLI commands:

```bash
# List all active sessions
flask sessions-list

# Manual cleanup
flask sessions-cleanup

# Show statistics
flask sessions-stats
```

## Usage Examples

### 1. Monitor Active Sessions

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()

for session in sessions['sessions']:
    print(f"Session {session['session_id']}:")
    print(f"  IP: {session['ip_address']}")
    print(f"  Resources: {session['resource_count']}")
    print(f"  Storage: {session['total_bytes_used'] / 1024 / 1024:.2f} MB")
```

### 2. Clean Up Idle Sessions

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}

# Get all sessions
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()['sessions']

# Find idle sessions
idle_threshold = 300  # 5 minutes
for session in sessions:
    if session['idle_seconds'] > idle_threshold:
        # Clean up the idle session
        cleanup_url = f'http://localhost:5005/admin/sessions/{session["session_id"]}/cleanup'
        requests.post(cleanup_url, headers=headers)
        print(f"Cleaned up idle session {session['session_id']}")
```

### 3. Monitor Resource Usage

```python
import requests

headers = {'X-Admin-Token': 'your-admin-token'}

# Get metrics
response = requests.get('http://localhost:5005/admin/sessions/metrics', headers=headers)
metrics = response.json()

print(f"Active sessions: {metrics['sessions']['active']}")
print(f"Active resources: {metrics['resources']['active']}")
print(f"Storage used: {metrics['resources']['active_bytes'] / 1024 / 1024:.2f} MB")
print(f"Total cleaned: {metrics['resources']['total_bytes_cleaned'] / 1024 / 1024 / 1024:.2f} GB")
```

## Resource Types

The session manager tracks different types of resources:

### 1. Audio Files
- Uploaded audio files for transcription
- Generated audio files from TTS
- Automatically cleaned up after session ends

### 2. Temporary Files
- Processing intermediates
- Cache files
- Automatically cleaned up after use

### 3. Streams
- WebSocket connections
- Server-sent event streams
- Closed when session ends

## Best Practices

### 1. Session Configuration

```python
# Development
app.config.update({
    'MAX_SESSION_DURATION': 7200,  # 2 hours
    'MAX_SESSION_IDLE_TIME': 1800,  # 30 minutes
    'MAX_RESOURCES_PER_SESSION': 200,
    'MAX_BYTES_PER_SESSION': 209715200  # 200MB
})

# Production
app.config.update({
    'MAX_SESSION_DURATION': 3600,  # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,  # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600  # 100MB
})
```

### 2. Monitoring

Set up monitoring for:
- Number of active sessions
- Resource usage per session
- Cleanup frequency
- Failed cleanup attempts

### 3. Alerting

Configure alerts for:
- High number of active sessions (>1000)
- High resource usage (>80% of limits)
- Failed cleanup operations
- Orphaned files detected

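The first two alert conditions can be evaluated from the metrics response shown earlier. The sketch below is an assumption-laden illustration: `evaluate_alerts` is a hypothetical helper, and because the metrics endpoint reports only aggregate bytes, it approximates per-session usage with the average across active sessions.

```python
def evaluate_alerts(metrics, max_active_sessions=1000, usage_ratio=0.8):
    """Return alert messages for a dict shaped like the /admin/sessions/metrics response."""
    alerts = []
    active = metrics["sessions"]["active"]
    if active > max_active_sessions:
        alerts.append(f"high active session count: {active}")

    # Approximation: average bytes per active session against the per-session limit
    if active > 0:
        avg_bytes = metrics["resources"]["active_bytes"] / active
        limit = metrics["limits"]["max_bytes_per_session"]
        if avg_bytes > usage_ratio * limit:
            alerts.append(f"average session storage above {usage_ratio:.0%} of limit")
    return alerts


sample = {
    "sessions": {"active": 5},
    "resources": {"active_bytes": 10485760},
    "limits": {"max_bytes_per_session": 104857600},
}
busy = {**sample, "sessions": {"active": 1200}}
print(evaluate_alerts(sample))
print(evaluate_alerts(busy))
```
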
## Troubleshooting

### Common Issues

#### 1. Sessions Not Being Cleaned Up

Check cleanup thread status:
```bash
flask sessions-stats
```

Manual cleanup:
```bash
flask sessions-cleanup
```

#### 2. Resource Limits Reached

Check session details:
```bash
curl -H "X-Admin-Token: token" http://localhost:5005/admin/sessions/SESSION_ID
```

Increase limits if needed:
```python
app.config['MAX_RESOURCES_PER_SESSION'] = 200
app.config['MAX_BYTES_PER_SESSION'] = 209715200  # 200MB
```

#### 3. Orphaned Files

Check for orphaned files:
```bash
ls -la /path/to/session/storage/
```

Clean orphaned files:
```bash
flask sessions-cleanup
```

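To list orphaned files programmatically rather than by eye, one option is to compare the storage directory against the paths the session manager reports as tracked. `find_orphaned_files` below is a hypothetical helper written for this illustration; how Talk2Me itself detects orphans is not described here.

```python
import os
import tempfile


def find_orphaned_files(storage_dir, tracked_paths):
    """Return files under storage_dir that no active session tracks."""
    tracked = {os.path.realpath(p) for p in tracked_paths}
    orphans = []
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            path = os.path.realpath(os.path.join(root, name))
            if path not in tracked:
                orphans.append(path)
    return sorted(orphans)


# Demonstration in a throwaway directory: one tracked file, one stray file
with tempfile.TemporaryDirectory() as storage:
    kept = os.path.join(storage, "kept.wav")
    stray = os.path.join(storage, "stray.tmp")
    for p in (kept, stray):
        open(p, "w").close()
    orphans = find_orphaned_files(storage, [kept])
    print([os.path.basename(p) for p in orphans])
```
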
### Debug Logging

Enable debug logging for session management:

```python
import logging

# Enable session manager debug logs
logging.getLogger('session_manager').setLevel(logging.DEBUG)
```

## Security Considerations

1. **Session Hijacking**: Sessions are tied to IP addresses and user agents
2. **Resource Exhaustion**: Strict per-session limits prevent DoS attacks
3. **File System Access**: Session storage uses secure paths and permissions
4. **Admin Access**: All admin endpoints require authentication

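Items 1 and 3 can be illustrated with two small helpers. Both `client_fingerprint` / `matches_session` and `safe_session_path` are hypothetical names, and the hashing and path-containment checks are common techniques assumed for this sketch rather than Talk2Me's confirmed implementation.

```python
import hashlib
import hmac
import os


def client_fingerprint(ip_address, user_agent):
    """Hash the IP address and user agent into one opaque value."""
    raw = f"{ip_address}|{user_agent}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def matches_session(stored_fingerprint, ip_address, user_agent):
    """Return True when the request comes from the client that opened the session."""
    current = client_fingerprint(ip_address, user_agent)
    return hmac.compare_digest(stored_fingerprint, current)


def safe_session_path(base_dir, session_id, filename):
    """Resolve a file path and reject anything that escapes base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, session_id, filename))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes session storage")
    return candidate


stored = client_fingerprint("192.168.1.1", "Mozilla/5.0")
print(matches_session(stored, "192.168.1.1", "Mozilla/5.0"))  # same client
print(matches_session(stored, "10.0.0.7", "Mozilla/5.0"))     # different IP

try:
    safe_session_path("/tmp/sessions", "abc123", "../../etc/passwd")
except ValueError as exc:
    print(f"rejected: {exc}")
```
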
## Performance Impact

The session management system has minimal performance impact:
- Memory: ~1KB per session + resource metadata
- CPU: Background cleanup runs every minute
- Disk I/O: Cleanup operations are batched
- Network: No external dependencies

## Integration with Other Systems

### Rate Limiting

Session management integrates with rate limiting:
```python
# Sessions are automatically tracked per IP
# Rate limits apply per session
```

### Secrets Management

Session tokens can be encrypted:
```python
from secrets_manager import encrypt_value
encrypted_session = encrypt_value(session_id)
```

### Monitoring

Export metrics to monitoring systems:
```python
# Prometheus format
@app.route('/metrics')
def prometheus_metrics():
    metrics = app.session_manager.export_metrics()
    # Format as Prometheus metrics
    # (format_prometheus is not defined in this document; supply your own formatter)
    return format_prometheus(metrics)
```

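The `format_prometheus` helper is not shown in this document. One possible sketch follows, assuming `export_metrics()` returns a dict shaped like the `/admin/sessions/metrics` response; the `talk2me` metric prefix is also an assumption. For simplicity every value is emitted as a gauge, although the cumulative totals would more properly be counters.

```python
def format_prometheus(metrics, prefix="talk2me"):
    """Render the nested 'sessions' and 'resources' values as Prometheus text lines."""
    lines = []
    for group in ("sessions", "resources"):
        for name, value in metrics.get(group, {}).items():
            metric = f"{prefix}_{group}_{name}"
            lines.append(f"# TYPE {metric} gauge")
            lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


sample_metrics = {
    "sessions": {"active": 5, "total_created": 100},
    "resources": {"active_bytes": 10485760},
}
output = format_prometheus(sample_metrics)
print(output)
```
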
## Future Enhancements

1. **Session Persistence**: Store sessions in Redis/database
2. **Distributed Sessions**: Support for multi-server deployments
3. **Session Analytics**: Track usage patterns and trends
4. **Resource Quotas**: Per-user resource quotas
5. **Session Replay**: Debug issues by replaying sessions