talk2me/REQUEST_SIZE_LIMITS.md
Adolfo Delorenzo aec2d3b0aa Add request size limits - Prevents memory exhaustion from large uploads
2025-06-03 00:58:14 -06:00

Request Size Limits Documentation

This document describes the request size limiting system implemented in Talk2Me to prevent memory exhaustion from large uploads.

Overview

Talk2Me implements comprehensive request size limiting to protect against:

  • Memory exhaustion from large file uploads
  • Denial of Service (DoS) attacks using oversized requests
  • Buffer overflow attempts
  • Resource starvation from unbounded requests

Default Limits

Global Limits

  • Maximum Content Length: 50MB - Absolute maximum for any request
  • Maximum Audio File Size: 25MB - For audio uploads (transcription)
  • Maximum JSON Payload: 1MB - For API requests
  • Maximum Image Size: 10MB - For future image processing features
  • Maximum Chunk Size: 1MB - For streaming uploads

Features

1. Multi-Layer Protection

The system implements multiple layers of size checking:

  • Flask's built-in MAX_CONTENT_LENGTH configuration
  • Pre-request validation before data is loaded into memory
  • File-type specific limits
  • Endpoint-specific limits
  • Streaming request monitoring
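The pre-request layer can be sketched as a pure function that looks only at the headers, so nothing is read from the request body. This is a minimal illustration rather than the project's actual middleware: the function name `precheck_content_length` and the choice to reject malformed headers are assumptions.

```python
from typing import Mapping, Optional, Tuple

MAX_CONTENT_LENGTH = 50 * 1024 * 1024  # 50MB global limit from this document


def precheck_content_length(
    headers: Mapping[str, str], limit: int = MAX_CONTENT_LENGTH
) -> Tuple[bool, Optional[int]]:
    """Return (allowed, declared_size) using only the request headers.

    Hypothetical helper. A missing Content-Length returns (True, None):
    the size is unknown, so a later layer (stream monitoring) must enforce
    the limit. A malformed or negative value is rejected outright.
    """
    raw = headers.get("Content-Length")
    if raw is None:
        return True, None
    try:
        declared = int(raw)
    except ValueError:
        return False, None
    if declared < 0:
        return False, None
    return declared <= limit, declared


if __name__ == "__main__":
    print(precheck_content_length({"Content-Length": "75000000"}))
```

In a Flask application a check like this would typically run in a `before_request` hook against `request.headers` and answer with a 413 when it fails. A plain `dict` is case-sensitive, unlike real HTTP header containers, so the key must match exactly in this sketch.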

2. File Type Detection

Automatic detection and enforcement based on file extensions:

  • Audio files: .wav, .mp3, .ogg, .webm, .m4a, .flac, .aac
  • Image files: .jpg, .jpeg, .png, .gif, .webp, .bmp
  • JSON payloads: Content-Type header detection
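Extension-based detection along these lines could look like the sketch below. The extension sets and limits come from this document; the function names `classify_upload` and `limit_for` are illustrative assumptions, not the project's API.

```python
import os
from typing import Optional

AUDIO_EXTENSIONS = {".wav", ".mp3", ".ogg", ".webm", ".m4a", ".flac", ".aac"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}

GLOBAL_LIMIT = 50 * 1024 * 1024
TYPE_LIMITS = {
    "audio": 25 * 1024 * 1024,
    "image": 10 * 1024 * 1024,
    "json": 1 * 1024 * 1024,
}


def classify_upload(
    filename: Optional[str], content_type: Optional[str] = None
) -> Optional[str]:
    """Return "audio", "image", "json", or None for an unrecognized upload."""
    if filename:
        # Compare the extension case-insensitively.
        ext = os.path.splitext(filename)[1].lower()
        if ext in AUDIO_EXTENSIONS:
            return "audio"
        if ext in IMAGE_EXTENSIONS:
            return "image"
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type.split(";")[0].strip().lower()
        if media_type == "application/json":
            return "json"
    return None


def limit_for(filename: Optional[str], content_type: Optional[str] = None) -> int:
    """Pick the type-specific limit, falling back to the global limit."""
    kind = classify_upload(filename, content_type)
    return TYPE_LIMITS.get(kind, GLOBAL_LIMIT)
```

Because a file extension is supplied by the client, this kind of detection selects which limit applies but says nothing about the file's real contents.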

3. Graceful Error Handling

When limits are exceeded:

  • Returns 413 (Request Entity Too Large) status code
  • Provides clear error messages with size information
  • Includes both actual and allowed sizes
  • Human-readable size formatting
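As a rough illustration of how such an error body might be assembled, the sketch below formats byte counts and builds a payload shaped like the examples under Error Responses. `format_size` and `too_large_payload` are hypothetical names chosen for this example.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count with one decimal place, e.g. 52428800 -> "50.0MB"."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}GB"


def too_large_payload(actual: int, limit: int, label: str = "Request") -> dict:
    """Build the JSON body to send with a 413 response (assumed shape)."""
    return {
        "error": f"{label} too large",
        "max_size": limit,
        "your_size": actual,
        "max_size_mb": round(limit / (1024 * 1024), 2),
    }


if __name__ == "__main__":
    print(format_size(26214400))
    print(too_large_payload(35_000_000, 26214400, label="Audio file"))
```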

Configuration

Environment Variables

# Set limits via environment variables (in bytes)
export MAX_CONTENT_LENGTH=52428800      # 50MB
export MAX_AUDIO_SIZE=26214400          # 25MB
export MAX_JSON_SIZE=1048576            # 1MB
export MAX_IMAGE_SIZE=10485760          # 10MB

Flask Configuration

# In config.py or app.py
app.config.update({
    'MAX_CONTENT_LENGTH': 50 * 1024 * 1024,    # 50MB
    'MAX_AUDIO_SIZE': 25 * 1024 * 1024,        # 25MB
    'MAX_JSON_SIZE': 1 * 1024 * 1024,          # 1MB
    'MAX_IMAGE_SIZE': 10 * 1024 * 1024         # 10MB
})
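One possible way to connect the environment variables to the Flask configuration is to read each variable with a fallback to the documented default. This is a sketch under that assumption; `load_limits` is a hypothetical helper, and how Talk2Me actually loads its settings may differ.

```python
import os
from typing import Dict, Mapping

DEFAULT_LIMITS: Dict[str, int] = {
    "MAX_CONTENT_LENGTH": 50 * 1024 * 1024,  # 50MB
    "MAX_AUDIO_SIZE": 25 * 1024 * 1024,      # 25MB
    "MAX_JSON_SIZE": 1 * 1024 * 1024,        # 1MB
    "MAX_IMAGE_SIZE": 10 * 1024 * 1024,      # 10MB
}


def load_limits(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Read limits (in bytes) from the environment, keeping defaults for
    variables that are unset, non-numeric, or not positive."""
    limits: Dict[str, int] = {}
    for name, default in DEFAULT_LIMITS.items():
        raw = env.get(name)
        try:
            value = int(raw) if raw is not None else default
        except ValueError:
            value = default
        limits[name] = value if value > 0 else default
    return limits


if __name__ == "__main__":
    print(load_limits())
```

The resulting dictionary uses the same keys as the configuration above, so it could be passed straight to `app.config.update(...)`.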

Dynamic Configuration

Size limits can be updated at runtime via the admin API (see API Endpoints below).

API Endpoints

GET /admin/size-limits

Get current size limits.

curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/size-limits

Response:

{
  "limits": {
    "max_content_length": 52428800,
    "max_audio_size": 26214400,
    "max_json_size": 1048576,
    "max_image_size": 10485760
  },
  "limits_human": {
    "max_content_length": "50.0MB",
    "max_audio_size": "25.0MB",
    "max_json_size": "1.0MB",
    "max_image_size": "10.0MB"
  }
}

POST /admin/size-limits

Update size limits dynamically.

curl -X POST -H "X-Admin-Token: your-token" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "30MB", "max_json_size": 2097152}' \
  http://localhost:5005/admin/size-limits

Response:

{
  "success": true,
  "old_limits": {...},
  "new_limits": {...},
  "new_limits_human": {
    "max_audio_size": "30.0MB",
    "max_json_size": "2.0MB"
  }
}
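The request above mixes a human-readable string ("30MB") with a raw byte count. A parser that accepts both forms might look like the following sketch. `parse_size` and the set of accepted units are assumptions made for illustration; the server's real parsing rules are not shown in this document.

```python
import re
from typing import Union

_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
_SIZE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMG]?B)?\s*$", re.IGNORECASE)


def parse_size(value: Union[int, str]) -> int:
    """Convert 2097152 or "30MB" into a positive byte count."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise ValueError(f"Unsupported size value: {value!r}")
    if isinstance(value, int):
        result = value
    else:
        match = _SIZE_PATTERN.match(value)
        if match is None:
            raise ValueError(f"Unrecognized size: {value!r}")
        number, unit = match.groups()
        # A bare number without a unit is treated as bytes.
        result = int(float(number) * _UNITS[(unit or "B").upper()])
    if result <= 0:
        raise ValueError("Size must be positive")
    return result


if __name__ == "__main__":
    print(parse_size("30MB"), parse_size(2097152))
```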

Usage Examples

1. Endpoint-Specific Limits

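# limit_request_size is the project's route decorator; import it from
# wherever it is defined in your codebase.
@app.route('/upload', methods=['POST'])  # Flask routes accept only GET unless told otherwise
@limit_request_size(max_size=10 * 1024 * 1024)  # 10MB limit
def upload():
    # Handle upload
    pass

@app.route('/upload-audio', methods=['POST'])
@limit_request_size(max_audio_size=30 * 1024 * 1024)  # 30MB for audio
def upload_audio():
    # Handle audio upload
    pass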

2. Client-Side Validation

// Check file size before upload
const MAX_AUDIO_SIZE = 25 * 1024 * 1024; // 25MB

function validateAudioFile(file) {
    if (file.size > MAX_AUDIO_SIZE) {
        alert(`Audio file too large. Maximum size is ${MAX_AUDIO_SIZE / 1024 / 1024}MB`);
        return false;
    }
    return true;
}

3. Chunked Uploads (Future Enhancement)

// For files larger than limits, use chunked upload.
// uploadChunk(chunk, index, total) is a placeholder: it must be implemented
// once the server supports chunked uploads.
async function uploadLargeFile(file, chunkSize = 1024 * 1024) {
    const chunks = Math.ceil(file.size / chunkSize);

    for (let i = 0; i < chunks; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, file.size);
        const chunk = file.slice(start, end);

        // Send chunks sequentially so they arrive in order
        await uploadChunk(chunk, i, chunks);
    }
}

Error Responses

413 Request Entity Too Large

When a request exceeds size limits:

{
  "error": "Request too large",
  "max_size": 52428800,
  "your_size": 75000000,
  "max_size_mb": 50.0
}

File-Specific Errors

For audio files:

{
  "error": "Audio file too large",
  "max_size": 26214400,
  "your_size": 35000000,
  "max_size_mb": 25.0
}

For JSON payloads:

{
  "error": "JSON payload too large",
  "max_size": 1048576,
  "your_size": 2000000,
  "max_size_kb": 1024.0
}

Best Practices

1. Client-Side Validation

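Always validate file sizes on the client side as well. This gives users immediate feedback, but it does not replace server-side enforcement, since client checks can be bypassed:

// Add to static/js/app.js
const SIZE_LIMITS = {
    audio: 25 * 1024 * 1024,  // 25MB
    json: 1 * 1024 * 1024,    // 1MB
};

// Format a byte count as megabytes, e.g. "25.0MB"
function formatSize(bytes) {
    return `${(bytes / 1024 / 1024).toFixed(1)}MB`;
}

function checkFileSize(file, type) {
    const limit = SIZE_LIMITS[type];
    // Unknown types have no client-side limit; the server still applies its own
    if (limit !== undefined && file.size > limit) {
        // showError is assumed to be your app's existing error display helper
        showError(`File too large. Maximum size: ${formatSize(limit)}`);
        return false;
    }
    return true;
}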

2. Progressive Enhancement

For better UX with large files:

  • Show upload progress
  • Implement resumable uploads
  • Compress audio client-side when possible
  • Use appropriate audio formats (WebM/Opus for smaller sizes)

3. Server Configuration

Configure your web server (Nginx/Apache) to also enforce limits:

Nginx:

client_max_body_size 50M;
client_body_buffer_size 1M;

Apache:

LimitRequestBody 52428800

4. Monitoring

Monitor size limit violations:

  • Track 413 errors in logs
  • Alert on repeated violations from same IP
  • Adjust limits based on usage patterns
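Alerting on repeated violations from one IP can be approximated with a sliding-window counter, as in the sketch below. The class name `ViolationTracker`, the threshold of 5, and the 300-second window are illustrative assumptions and do not reflect values used by Talk2Me.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class ViolationTracker:
    """Count 413 responses per IP within a sliding time window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 300.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, ip: str, now: Optional[float] = None) -> bool:
        """Record one violation; return True if the IP reached the threshold."""
        now = time.monotonic() if now is None else now
        events = self._events[ip]
        events.append(now)
        # Drop violations that have fallen out of the window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold


if __name__ == "__main__":
    tracker = ViolationTracker(threshold=3, window_seconds=60)
    for _ in range(3):
        alert = tracker.record("203.0.113.7")
    print("alert:", alert)
```

The `now` parameter exists only to make the behaviour easy to test; in normal use it is left unset. The state lives in process memory, so a multi-worker deployment would need a shared store instead.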

Security Considerations

  1. Memory Protection: Pre-flight size checks prevent loading large files into memory
  2. DoS Prevention: Limits prevent attackers from exhausting server resources
  3. Bandwidth Protection: Prevents bandwidth exhaustion from large uploads
  4. Storage Protection: Works with session management to limit total storage per user

Integration with Other Systems

Rate Limiting

Size limits work in conjunction with rate limiting:

  • Large requests count more against rate limits
  • Repeated size violations can trigger IP blocking

Session Management

Size limits are enforced per session:

  • Total storage per session is limited
  • Large files count against session resource limits

Monitoring

Size limit violations are tracked in:

  • Application logs
  • Health check endpoints
  • Admin monitoring dashboards

Troubleshooting

Common Issues

1. Legitimate Large Files Rejected

If users need to upload larger files:

# Increase limit for audio files to 50MB
curl -X POST -H "X-Admin-Token: token" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "50MB"}' \
  http://localhost:5005/admin/size-limits

2. Chunked Transfer Encoding

For requests without a Content-Length header:

  • The system monitors the stream as it is read
  • Terminates the connection if the size limit is exceeded
  • May require special handling for some clients
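Stream monitoring of this kind can be sketched as a wrapper that counts bytes as they are read and stops as soon as the allowance is exceeded. The names `SizeLimitedStream` and `StreamTooLarge` are hypothetical; this is an illustration of the idea, not the project's implementation.

```python
import io
from typing import BinaryIO, Optional


class StreamTooLarge(Exception):
    """Raised when more than the allowed number of bytes has been read."""


class SizeLimitedStream:
    """Wrap a binary stream and enforce a cap on the total bytes read."""

    def __init__(self, stream: BinaryIO, max_bytes: int) -> None:
        self._stream = stream
        self._max_bytes = max_bytes
        self.bytes_read = 0

    def read(self, size: Optional[int] = -1) -> bytes:
        remaining = self._max_bytes - self.bytes_read
        # Never request more than one byte past the allowance, so an
        # oversized body is detected without buffering all of it.
        if size is None or size < 0 or size > remaining + 1:
            size = remaining + 1
        data = self._stream.read(size)
        self.bytes_read += len(data)
        if self.bytes_read > self._max_bytes:
            raise StreamTooLarge(
                f"Stream exceeded {self._max_bytes} bytes"
            )
        return data


if __name__ == "__main__":
    limited = SizeLimitedStream(io.BytesIO(b"x" * 32), max_bytes=16)
    try:
        while limited.read(8):
            pass
    except StreamTooLarge as exc:
        print(exc)
```

In a WSGI application the wrapped object would typically be the request's input stream (`environ["wsgi.input"]`), and the exception would be translated into a 413 response.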

3. Load Balancer Limits

Ensure your load balancer also enforces appropriate limits:

  • AWS ALB: Check the request size quotas that apply to your target type
  • Cloudflare: Check the maximum upload size allowed by your plan
  • Nginx: Configure client_max_body_size

Performance Impact

The size limiting system has minimal performance impact:

  • Pre-flight checks are O(1) operations
  • No buffering of large requests
  • Early termination of oversized requests
  • Efficient memory usage

Future Enhancements

  1. Chunked Upload Support: Native support for resumable uploads
  2. Compression Detection: Automatic handling of compressed uploads
  3. Dynamic Limits: Per-user or per-tier size limits
  4. Bandwidth Throttling: Rate limit large uploads
  5. Storage Quotas: Long-term storage limits per user