# Request Size Limits Documentation

This document describes the request size limiting system implemented in Talk2Me to prevent memory exhaustion from large uploads.

## Overview

Talk2Me implements comprehensive request size limiting to protect against:

- Memory exhaustion from large file uploads
- Denial of Service (DoS) attacks using oversized requests
- Buffer overflow attempts
- Resource starvation from unbounded requests

## Default Limits

### Global Limits

- **Maximum Content Length**: 50MB - Absolute maximum for any request
- **Maximum Audio File Size**: 25MB - For audio uploads (transcription)
- **Maximum JSON Payload**: 1MB - For API requests
- **Maximum Image Size**: 10MB - For future image processing features
- **Maximum Chunk Size**: 1MB - For streaming uploads

## Features

### 1. Multi-Layer Protection

The system implements multiple layers of size checking:

- Flask's built-in `MAX_CONTENT_LENGTH` configuration
- Pre-request validation before data is loaded into memory
- File-type specific limits
- Endpoint-specific limits
- Streaming request monitoring

### 2. File Type Detection

Limits are detected and enforced automatically based on file extension or content type:

- Audio files: `.wav`, `.mp3`, `.ogg`, `.webm`, `.m4a`, `.flac`, `.aac`
- Image files: `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`
- JSON payloads: Content-Type header detection

### 3. Graceful Error Handling

When limits are exceeded, the system:

- Returns a 413 (Request Entity Too Large) status code
- Provides clear error messages with size information
- Includes both actual and allowed sizes
- Formats sizes in human-readable form

## Configuration

### Environment Variables

```bash
# Set limits via environment variables (in bytes)
export MAX_CONTENT_LENGTH=52428800  # 50MB
export MAX_AUDIO_SIZE=26214400      # 25MB
export MAX_JSON_SIZE=1048576        # 1MB
export MAX_IMAGE_SIZE=10485760      # 10MB
```

### Flask Configuration

```python
# In config.py or app.py
app.config.update({
    'MAX_CONTENT_LENGTH': 50 * 1024 * 1024,  # 50MB
    'MAX_AUDIO_SIZE': 25 * 1024 * 1024,      # 25MB
    'MAX_JSON_SIZE': 1 * 1024 * 1024,        # 1MB
    'MAX_IMAGE_SIZE': 10 * 1024 * 1024       # 10MB
})
```

### Dynamic Configuration

Size limits can be updated at runtime via the admin API.

## API Endpoints

### GET /admin/size-limits

Get current size limits.

```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/size-limits
```

Response:

```json
{
  "limits": {
    "max_content_length": 52428800,
    "max_audio_size": 26214400,
    "max_json_size": 1048576,
    "max_image_size": 10485760
  },
  "limits_human": {
    "max_content_length": "50.0MB",
    "max_audio_size": "25.0MB",
    "max_json_size": "1.0MB",
    "max_image_size": "10.0MB"
  }
}
```

### POST /admin/size-limits

Update size limits dynamically. Values may be given as a byte count or as a string with a unit.

```bash
curl -X POST -H "X-Admin-Token: your-token" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "30MB", "max_json_size": 2097152}' \
  http://localhost:5005/admin/size-limits
```

Response:

```json
{
  "success": true,
  "old_limits": {...},
  "new_limits": {...},
  "new_limits_human": {
    "max_audio_size": "30.0MB",
    "max_json_size": "2.0MB"
  }
}
```

## Usage Examples

### 1. Endpoint-Specific Limits

```python
# Upload routes must accept POST; Flask routes allow only GET by default
@app.route('/upload', methods=['POST'])
@limit_request_size(max_size=10*1024*1024)  # 10MB limit
def upload():
    # Handle upload
    pass

@app.route('/upload-audio', methods=['POST'])
@limit_request_size(max_audio_size=30*1024*1024)  # 30MB for audio
def upload_audio():
    # Handle audio upload
    pass
```

### 2. Client-Side Validation

```javascript
// Check file size before upload
const MAX_AUDIO_SIZE = 25 * 1024 * 1024; // 25MB

function validateAudioFile(file) {
    if (file.size > MAX_AUDIO_SIZE) {
        alert(`Audio file too large. Maximum size is ${MAX_AUDIO_SIZE / 1024 / 1024}MB`);
        return false;
    }
    return true;
}
```

### 3. Chunked Uploads (Future Enhancement)

```javascript
// For files larger than limits, use chunked upload
async function uploadLargeFile(file, chunkSize = 1024 * 1024) {
    const chunks = Math.ceil(file.size / chunkSize);

    for (let i = 0; i < chunks; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, file.size);
        const chunk = file.slice(start, end);

        await uploadChunk(chunk, i, chunks);
    }
}
```

## Error Responses

### 413 Request Entity Too Large

When a request exceeds size limits:

```json
{
  "error": "Request too large",
  "max_size": 52428800,
  "your_size": 75000000,
  "max_size_mb": 50.0
}
```

### File-Specific Errors

For audio files:

```json
{
  "error": "Audio file too large",
  "max_size": 26214400,
  "your_size": 35000000,
  "max_size_mb": 25.0
}
```

For JSON payloads:

```json
{
  "error": "JSON payload too large",
  "max_size": 1048576,
  "your_size": 2000000,
  "max_size_kb": 1024.0
}
```

## Best Practices

### 1. Client-Side Validation

Always validate file sizes on the client side as well. This improves the user experience but does not replace server-side enforcement:

```javascript
// Add to static/js/app.js
const SIZE_LIMITS = {
    audio: 25 * 1024 * 1024, // 25MB
    json: 1 * 1024 * 1024,   // 1MB
};

function checkFileSize(file, type) {
    const limit = SIZE_LIMITS[type];
    if (file.size > limit) {
        showError(`File too large. Maximum size: ${formatSize(limit)}`);
        return false;
    }
    return true;
}
```

### 2. Progressive Enhancement

For better UX with large files:

- Show upload progress
- Implement resumable uploads
- Compress audio client-side when possible
- Use appropriate audio formats (WebM/Opus for smaller sizes)

### 3. Server Configuration

Configure your web server (Nginx/Apache) to also enforce limits:

**Nginx:**

```nginx
client_max_body_size 50M;
client_body_buffer_size 1M;
```

**Apache:**

```apache
LimitRequestBody 52428800
```

### 4. Monitoring

Monitor size limit violations:

- Track 413 errors in logs
- Alert on repeated violations from the same IP
- Adjust limits based on usage patterns

## Security Considerations

1. **Memory Protection**: Pre-flight size checks prevent loading large files into memory
2. **DoS Prevention**: Limits prevent attackers from exhausting server resources
3. **Bandwidth Protection**: Prevents bandwidth exhaustion from large uploads
4. **Storage Protection**: Works with session management to limit total storage per user

## Integration with Other Systems

### Rate Limiting

Size limits work in conjunction with rate limiting:

- Large requests count more against rate limits
- Repeated size violations can trigger IP blocking

### Session Management

Size limits are enforced per session:

- Total storage per session is limited
- Large files count against session resource limits

### Monitoring

Size limit violations are tracked in:

- Application logs
- Health check endpoints
- Admin monitoring dashboards

## Troubleshooting

### Common Issues

#### 1. Legitimate Large Files Rejected

If users need to upload larger files:

```bash
# Increase limit for audio files to 50MB
curl -X POST -H "X-Admin-Token: token" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "50MB"}' \
  http://localhost:5005/admin/size-limits
```

#### 2. Chunked Transfer Encoding

For requests without a Content-Length header:

- The system monitors the stream
- Terminates the connection if the size limit is exceeded
- May require special handling for some clients

#### 3. Load Balancer Limits

Ensure your load balancer also enforces appropriate limits:

- AWS ALB: Configure request size limits
- Cloudflare: Set upload size limits
- Nginx: Configure `client_max_body_size`

## Performance Impact

The size limiting system has minimal performance impact:

- Pre-flight checks are O(1) operations
- No buffering of large requests
- Early termination of oversized requests
- Efficient memory usage

## Future Enhancements

1. **Chunked Upload Support**: Native support for resumable uploads
2. **Compression Detection**: Automatic handling of compressed uploads
3. **Dynamic Limits**: Per-user or per-tier size limits
4. **Bandwidth Throttling**: Rate limit large uploads
5. **Storage Quotas**: Long-term storage limits per user
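
## Appendix: Size Helper Sketch

The admin API described earlier accepts limits either as byte counts or as strings such as `"30MB"`, and its responses report sizes both in bytes and in human-readable form. The following is a minimal, standard-library-only sketch of how such parsing, formatting, and a pre-request `Content-Length` check might look. The names `parse_size`, `format_size`, and `check_content_length` are hypothetical and are not taken from Talk2Me's code. The sketch assumes binary units (1MB = 1,048,576 bytes), which is consistent with the byte values shown in this document.

```python
import re

# Binary units: 50MB == 52428800 bytes, matching the values in this document.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(value):
    """Convert an int byte count or a string such as "30MB" into bytes."""
    if isinstance(value, bool):
        raise ValueError(f"Unrecognized size: {value!r}")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("Size must not be negative")
        return value
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMG]?B)\s*", str(value), re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"Unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])


def format_size(num_bytes):
    """Render a byte count as a human-readable string such as "25.0MB"."""
    for unit in ("GB", "MB", "KB"):
        if num_bytes >= _UNITS[unit]:
            return f"{num_bytes / _UNITS[unit]:.1f}{unit}"
    return f"{num_bytes}B"


def check_content_length(content_length, max_size):
    """Return None if the declared size is allowed, else a 413-style error body.

    A missing Content-Length (None) passes this check; such requests would
    need stream monitoring instead, as noted under Chunked Transfer Encoding.
    """
    if content_length is None or content_length <= max_size:
        return None
    return {
        "error": "Request too large",
        "max_size": max_size,
        "your_size": content_length,
        "max_size_mb": round(max_size / _UNITS["MB"], 1),
    }
```

In a Flask application, a check like `check_content_length(request.content_length, limit)` could run in a `before_request` hook, so an oversized request is rejected with a 413 response before its body is read. That wiring is likewise an assumption rather than a description of the actual implementation.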