talk2me/REQUEST_SIZE_LIMITS.md
Adolfo Delorenzo aec2d3b0aa Add request size limits - Prevents memory exhaustion from large uploads
This comprehensive request size limiting system prevents memory exhaustion and DoS attacks from oversized requests.

Key features:
- Global request size limit: 50MB (configurable)
- Type-specific limits: 25MB for audio, 1MB for JSON, 10MB for images
- Multi-layer validation before loading data into memory
- File type detection based on extensions
- Endpoint-specific limit enforcement
- Dynamic configuration via admin API
- Clear error messages with size information

Implementation details:
- RequestSizeLimiter middleware with Flask integration
- Pre-request validation using Content-Length header
- File size checking for multipart uploads
- JSON payload size validation
- Custom decorator for route-specific limits
- StreamSizeLimiter for chunked transfers
- Integration with Flask's MAX_CONTENT_LENGTH

Admin features:
- GET /admin/size-limits - View current limits
- POST /admin/size-limits - Update limits dynamically
- Human-readable size formatting in responses
- Size limit info in health check endpoints

Security benefits:
- Prevents memory exhaustion attacks
- Blocks oversized uploads before processing
- Protects against buffer overflow attempts
- Works with rate limiting for comprehensive protection

This addresses the critical security issue of unbounded request sizes that could lead to memory exhaustion or system crashes.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 00:58:14 -06:00


# Request Size Limits Documentation
This document describes the request size limiting system implemented in Talk2Me to prevent memory exhaustion from large uploads.
## Overview
Talk2Me implements comprehensive request size limiting to protect against:
- Memory exhaustion from large file uploads
- Denial of Service (DoS) attacks using oversized requests
- Buffer overflow attempts
- Resource starvation from unbounded requests
## Default Limits
### Global Limits
- **Maximum Content Length**: 50MB - Absolute maximum for any request
- **Maximum Audio File Size**: 25MB - For audio uploads (transcription)
- **Maximum JSON Payload**: 1MB - For API requests
- **Maximum Image Size**: 10MB - For future image processing features
- **Maximum Chunk Size**: 1MB - For streaming uploads
## Features
### 1. Multi-Layer Protection
The system implements multiple layers of size checking:
- Flask's built-in `MAX_CONTENT_LENGTH` configuration
- Pre-request validation before data is loaded into memory
- File-type specific limits
- Endpoint-specific limits
- Streaming request monitoring
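
The pre-request layer can be illustrated with a small, framework-independent check on the `Content-Length` header. This is a minimal sketch, not the Talk2Me implementation: the function name `precheck_content_length` and its return shape are assumptions made for illustration. In a Flask app, a check like this would typically run in a `before_request` hook.

```python
from typing import Mapping, Optional, Tuple

MAX_CONTENT_LENGTH = 50 * 1024 * 1024  # 50MB, the documented global default


def precheck_content_length(
    headers: Mapping[str, str],
    max_bytes: int = MAX_CONTENT_LENGTH,
) -> Tuple[bool, Optional[int]]:
    """Return (allowed, declared_size) using only the Content-Length header.

    Hypothetical helper. A missing header yields (True, None): the size is
    unknown up front, so the body has to be checked while it is streamed.
    """
    raw = headers.get("Content-Length")
    if raw is None:
        return True, None
    try:
        declared = int(raw)
    except ValueError:
        return False, None  # malformed header: reject rather than guess
    if declared < 0:
        return False, None
    return declared <= max_bytes, declared


if __name__ == "__main__":
    print(precheck_content_length({"Content-Length": "75000000"}))
```

Because only a header is inspected, the check costs the same regardless of body size, and the body is never read when the declared size is already over the limit.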
### 2. File Type Detection
Automatic detection and enforcement based on file extensions:
- Audio files: `.wav`, `.mp3`, `.ogg`, `.webm`, `.m4a`, `.flac`, `.aac`
- Image files: `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`
- JSON payloads: Content-Type header detection
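
As an illustration of the extension-based detection described above, the sketch below maps a filename or Content-Type to a category and then to its default limit. The names `classify_upload` and `limit_for` are hypothetical and are not taken from the Talk2Me code. The extension sets and limits mirror the values listed in this document.

```python
import os
from typing import Optional

AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".webm", ".m4a", ".flac", ".aac"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}

# Documented defaults, in bytes
LIMITS = {
    "audio": 25 * 1024 * 1024,
    "image": 10 * 1024 * 1024,
    "json": 1 * 1024 * 1024,
}
GLOBAL_LIMIT = 50 * 1024 * 1024


def classify_upload(filename: Optional[str],
                    content_type: Optional[str] = None) -> Optional[str]:
    """Return 'audio', 'image', 'json', or None for an unrecognised upload."""
    if filename:
        ext = os.path.splitext(filename)[1].lower()
        if ext in AUDIO_EXTENSIONS:
            return "audio"
        if ext in IMAGE_EXTENSIONS:
            return "image"
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing
        media_type = content_type.split(";")[0].strip().lower()
        if media_type == "application/json":
            return "json"
    return None


def limit_for(kind: Optional[str]) -> int:
    """Return the limit for a category, falling back to the global limit."""
    return LIMITS.get(kind, GLOBAL_LIMIT)
```

Extensions are supplied by the client, so a check like this selects which limit applies but does not prove what the file actually contains.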
### 3. Graceful Error Handling
When limits are exceeded:
- Returns a 413 (Request Entity Too Large) status code
- Provides clear error messages with size information
- Includes both actual and allowed sizes
- Human-readable size formatting
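
A possible shape for the error body and the human-readable formatting is sketched below. `format_size` and `too_large_payload` are hypothetical helper names. The field names follow the error examples in this document, and the output format (`50.0MB`) follows the admin API responses.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count with one decimal place, e.g. 52428800 -> '50.0MB'."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}GB"


def too_large_payload(error: str, max_size: int, actual_size: int) -> dict:
    """Build the JSON body returned alongside a 413 response."""
    return {
        "error": error,
        "max_size": max_size,
        "your_size": actual_size,
        "max_size_mb": round(max_size / (1024 * 1024), 1),
    }
```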
## Configuration
### Environment Variables
```bash
# Set limits via environment variables (in bytes)
export MAX_CONTENT_LENGTH=52428800 # 50MB
export MAX_AUDIO_SIZE=26214400 # 25MB
export MAX_JSON_SIZE=1048576 # 1MB
export MAX_IMAGE_SIZE=10485760 # 10MB
```
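One way these variables could be read at startup is sketched below, assuming each value is a positive integer number of bytes. The `load_limits` function is hypothetical. It falls back to the documented defaults and rejects malformed values instead of silently ignoring them.

```python
import os
from typing import Dict, Mapping

DEFAULT_LIMITS: Dict[str, int] = {
    "MAX_CONTENT_LENGTH": 50 * 1024 * 1024,  # 50MB
    "MAX_AUDIO_SIZE": 25 * 1024 * 1024,      # 25MB
    "MAX_JSON_SIZE": 1 * 1024 * 1024,        # 1MB
    "MAX_IMAGE_SIZE": 10 * 1024 * 1024,      # 10MB
}


def load_limits(environ: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Read size limits (in bytes) from the environment, with defaults."""
    limits: Dict[str, int] = {}
    for name, default in DEFAULT_LIMITS.items():
        raw = environ.get(name)
        if raw is None:
            limits[name] = default
            continue
        try:
            value = int(raw)
        except ValueError:
            raise ValueError(
                f"{name} must be an integer number of bytes, got {raw!r}"
            ) from None
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
        limits[name] = value
    return limits
```

The resulting dictionary uses the same keys as the Flask configuration, so it could be passed to `app.config.update(...)` directly.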
### Flask Configuration
```python
# In config.py or app.py
app.config.update({
    'MAX_CONTENT_LENGTH': 50 * 1024 * 1024,  # 50MB
    'MAX_AUDIO_SIZE': 25 * 1024 * 1024,      # 25MB
    'MAX_JSON_SIZE': 1 * 1024 * 1024,        # 1MB
    'MAX_IMAGE_SIZE': 10 * 1024 * 1024       # 10MB
})
```
### Dynamic Configuration
Size limits can be updated at runtime via the admin API (see `POST /admin/size-limits` below).
## API Endpoints
### GET /admin/size-limits
Get current size limits.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/size-limits
```
Response:
```json
{
  "limits": {
    "max_content_length": 52428800,
    "max_audio_size": 26214400,
    "max_json_size": 1048576,
    "max_image_size": 10485760
  },
  "limits_human": {
    "max_content_length": "50.0MB",
    "max_audio_size": "25.0MB",
    "max_json_size": "1.0MB",
    "max_image_size": "10.0MB"
  }
}
```
### POST /admin/size-limits
Update size limits dynamically.
```bash
curl -X POST -H "X-Admin-Token: your-token" \
-H "Content-Type: application/json" \
-d '{"max_audio_size": "30MB", "max_json_size": 2097152}' \
http://localhost:5005/admin/size-limits
```
Response:
```json
{
  "success": true,
  "old_limits": {...},
  "new_limits": {...},
  "new_limits_human": {
    "max_audio_size": "30.0MB",
    "max_json_size": "2.0MB"
  }
}
```
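The request above mixes a string with a unit (`"30MB"`) and a plain byte count (`2097152`). The sketch below shows one way such values could be normalised to bytes. `parse_size` is a hypothetical helper, and the set of accepted units is an assumption, since this document only shows the `MB` form.

```python
import re
from typing import Union

_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)?\s*$", re.IGNORECASE)


def parse_size(value: Union[int, str]) -> int:
    """Convert 2097152 or '30MB' into a positive number of bytes."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise ValueError(f"unsupported size value: {value!r}")
    if isinstance(value, int):
        result = value
    else:
        match = _SIZE_RE.match(value)
        if match is None:
            raise ValueError(f"cannot parse size: {value!r}")
        number, unit = match.groups()
        # A bare number without a unit is treated as bytes
        result = int(float(number) * _UNITS[(unit or "B").upper()])
    if result <= 0:
        raise ValueError(f"size must be positive: {value!r}")
    return result
```

Rejecting zero and negative values keeps a mistaken admin request from disabling uploads entirely.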
## Usage Examples
### 1. Endpoint-Specific Limits
```python
# Uploads arrive as POST; Flask routes accept only GET unless methods is set
@app.route('/upload', methods=['POST'])
@limit_request_size(max_size=10*1024*1024)  # 10MB limit
def upload():
    # Handle upload
    pass


@app.route('/upload-audio', methods=['POST'])
@limit_request_size(max_audio_size=30*1024*1024)  # 30MB for audio
def upload_audio():
    # Handle audio upload
    pass
```
### 2. Client-Side Validation
```javascript
// Check file size before upload
const MAX_AUDIO_SIZE = 25 * 1024 * 1024; // 25MB

function validateAudioFile(file) {
    if (file.size > MAX_AUDIO_SIZE) {
        alert(`Audio file too large. Maximum size is ${MAX_AUDIO_SIZE / 1024 / 1024}MB`);
        return false;
    }
    return true;
}
```
### 3. Chunked Uploads (Future Enhancement)
```javascript
// For files larger than limits, use chunked upload
async function uploadLargeFile(file, chunkSize = 1024 * 1024) {
    const chunks = Math.ceil(file.size / chunkSize);
    for (let i = 0; i < chunks; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, file.size);
        const chunk = file.slice(start, end);
        await uploadChunk(chunk, i, chunks);
    }
}
```
## Error Responses
### 413 Request Entity Too Large
When a request exceeds size limits:
```json
{
  "error": "Request too large",
  "max_size": 52428800,
  "your_size": 75000000,
  "max_size_mb": 50.0
}
```
### File-Specific Errors
For audio files:
```json
{
  "error": "Audio file too large",
  "max_size": 26214400,
  "your_size": 35000000,
  "max_size_mb": 25.0
}
```
For JSON payloads:
```json
{
  "error": "JSON payload too large",
  "max_size": 1048576,
  "your_size": 2000000,
  "max_size_kb": 1024.0
}
```
## Best Practices
### 1. Client-Side Validation
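Validate file sizes on the client side as well. This gives users immediate feedback, but it does not replace server-side enforcement, because clients can bypass it: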
```javascript
// Add to static/js/app.js
// showError and formatSize are assumed to be defined elsewhere in the app
const SIZE_LIMITS = {
    audio: 25 * 1024 * 1024, // 25MB
    json: 1 * 1024 * 1024,   // 1MB
};

function checkFileSize(file, type) {
    const limit = SIZE_LIMITS[type];
    if (file.size > limit) {
        showError(`File too large. Maximum size: ${formatSize(limit)}`);
        return false;
    }
    return true;
}
```
### 2. Progressive Enhancement
For better UX with large files:
- Show upload progress
- Implement resumable uploads
- Compress audio client-side when possible
- Use appropriate audio formats (WebM/Opus for smaller sizes)
### 3. Server Configuration
Configure your web server (Nginx/Apache) to also enforce limits:
**Nginx:**
```nginx
client_max_body_size 50M;
client_body_buffer_size 1M;
```
**Apache:**
```apache
LimitRequestBody 52428800
```
### 4. Monitoring
Monitor size limit violations:
- Track 413 errors in logs
- Alert on repeated violations from same IP
- Adjust limits based on usage patterns
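
Alerting on repeated violations from one IP could be built on a sliding-window counter like the one sketched below. The `ViolationTracker` class, its threshold, and its window length are all hypothetical choices for illustration. Talk2Me's actual monitoring may work differently.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class ViolationTracker:
    """Count 413 responses per IP inside a sliding time window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, ip: str, now: Optional[float] = None) -> bool:
        """Record one violation; return True when the IP should be flagged."""
        now = time.monotonic() if now is None else now
        events = self._events[ip]
        events.append(now)
        # Discard violations that fell out of the window
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold
```

A caller would invoke `record()` wherever a 413 is returned and raise an alert, or hand the IP to the rate limiter, when it returns `True`.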
## Security Considerations
1. **Memory Protection**: Pre-flight size checks prevent loading large files into memory
2. **DoS Prevention**: Limits prevent attackers from exhausting server resources
3. **Bandwidth Protection**: Prevents bandwidth exhaustion from large uploads
4. **Storage Protection**: Works with session management to limit total storage per user
## Integration with Other Systems
### Rate Limiting
Size limits work in conjunction with rate limiting:
- Large requests count more against rate limits
- Repeated size violations can trigger IP blocking
### Session Management
Size limits are enforced per session:
- Total storage per session is limited
- Large files count against session resource limits
### Monitoring
Size limit violations are tracked in:
- Application logs
- Health check endpoints
- Admin monitoring dashboards
## Troubleshooting
### Common Issues
#### 1. Legitimate Large Files Rejected
If users need to upload larger files:
```bash
# Increase limit for audio files to 50MB
curl -X POST -H "X-Admin-Token: token" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "50MB"}' \
  http://localhost:5005/admin/size-limits
```
#### 2. Chunked Transfer Encoding
For requests without a Content-Length header:
- The system monitors the stream as it is read
- Terminates the connection if the size limit is exceeded
- May require special handling for some clients
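
When no `Content-Length` is available, the limit has to be enforced while the body is read. The sketch below wraps a file-like object and raises once the running total passes the limit. `LimitedStream` and `StreamTooLarge` are hypothetical names and are not the project's own stream limiter.

```python
import io
from typing import BinaryIO


class StreamTooLarge(Exception):
    """Raised when more than the allowed number of bytes has been read."""


class LimitedStream:
    """Wrap a binary stream and enforce a cumulative size limit on reads."""

    def __init__(self, stream: BinaryIO, max_bytes: int):
        self._stream = stream
        self._max_bytes = max_bytes
        self.bytes_read = 0

    def read(self, size: int = -1) -> bytes:
        # Never request more than one byte past the remaining allowance,
        # so an oversized body is detected without buffering all of it.
        budget = self._max_bytes - self.bytes_read + 1
        if size is None or size < 0 or size > budget:
            size = budget
        data = self._stream.read(size)
        self.bytes_read += len(data)
        if self.bytes_read > self._max_bytes:
            raise StreamTooLarge(
                f"stream exceeded {self._max_bytes} bytes"
            )
        return data


if __name__ == "__main__":
    body = LimitedStream(io.BytesIO(b"x" * 2048), max_bytes=1024)
    try:
        while body.read(512):
            pass
    except StreamTooLarge as exc:
        print(exc)
```

The caller would translate `StreamTooLarge` into a 413 response and stop reading, which matches the early-termination behaviour described above.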
#### 3. Load Balancer Limits
Ensure your load balancer also enforces appropriate limits:
- AWS ALB: Configure request size limits
- Cloudflare: Set upload size limits
- Nginx: Configure client_max_body_size
## Performance Impact
The size limiting system has minimal performance impact:
- Pre-flight checks are O(1) operations
- No buffering of large requests
- Early termination of oversized requests
- Efficient memory usage
## Future Enhancements
1. **Chunked Upload Support**: Native support for resumable uploads
2. **Compression Detection**: Automatic handling of compressed uploads
3. **Dynamic Limits**: Per-user or per-tier size limits
4. **Bandwidth Throttling**: Rate limit large uploads
5. **Storage Quotas**: Long-term storage limits per user