diff --git a/CONNECTION_RETRY.md b/CONNECTION_RETRY.md deleted file mode 100644 index 648592b..0000000 --- a/CONNECTION_RETRY.md +++ /dev/null @@ -1,173 +0,0 @@ -# Connection Retry Logic Documentation - -This document explains the connection retry and network interruption handling features in Talk2Me. - -## Overview - -Talk2Me implements robust connection retry logic to handle network interruptions gracefully. When a connection is lost or a request fails due to network issues, the application automatically queues requests and retries them when the connection is restored. - -## Features - -### 1. Automatic Connection Monitoring -- Monitors browser online/offline events -- Periodic health checks to the server (every 5 seconds when offline) -- Visual connection status indicator -- Automatic detection when returning from sleep/hibernation - -### 2. Request Queuing -- Failed requests are automatically queued during network interruptions -- Requests maintain their priority and are processed in order -- Queue persists across connection failures -- Visual indication of queued requests - -### 3. Exponential Backoff Retry -- Failed requests are retried with exponential backoff -- Initial retry delay: 1 second -- Maximum retry delay: 30 seconds -- Backoff multiplier: 2x -- Maximum retries: 3 attempts - -### 4. Connection Status UI -- Real-time connection status indicator (bottom-right corner) -- Offline banner with retry button -- Queue status showing pending requests by type -- Temporary status messages for important events - -## User Experience - -### When Connection is Lost - -1. **Visual Indicators**: - - Connection status shows "Offline" or "Connection error" - - Red banner appears at top of screen - - Queued request count is displayed - -2. **Request Handling**: - - New requests are automatically queued - - User sees "Connection error - queued" message - - Requests will be sent when connection returns - -3. 
**Manual Retry**: - - Users can click "Retry" button in offline banner - - Forces immediate connection check - -### When Connection is Restored - -1. **Automatic Recovery**: - - Connection status changes to "Connecting..." - - Queued requests are processed automatically - - Success message shown briefly - -2. **Request Processing**: - - Queued requests maintain their order - - Higher priority requests (transcription) processed first - - Progress indicators show processing status - -## Configuration - -The connection retry logic can be configured programmatically: - -```javascript -// In app.ts or initialization code -connectionManager.configure({ - maxRetries: 3, // Maximum retry attempts - initialDelay: 1000, // Initial retry delay (ms) - maxDelay: 30000, // Maximum retry delay (ms) - backoffMultiplier: 2, // Exponential backoff multiplier - timeout: 10000, // Request timeout (ms) - onlineCheckInterval: 5000 // Health check interval (ms) -}); -``` - -## Request Priority - -Requests are prioritized as follows: -1. **Transcription** (Priority: 8) - Highest priority -2. **Translation** (Priority: 5) - Normal priority -3. **TTS/Audio** (Priority: 3) - Lower priority - -## Error Types - -### Retryable Errors -- Network errors -- Connection timeouts -- Server errors (5xx) -- CORS errors (in some cases) - -### Non-Retryable Errors -- Client errors (4xx) -- Authentication errors -- Rate limit errors -- Invalid request errors - -## Best Practices - -1. **For Users**: - - Wait for queued requests to complete before closing the app - - Use the manual retry button if automatic recovery fails - - Check the connection status indicator for current state - -2. **For Developers**: - - All fetch requests should go through RequestQueueManager - - Use appropriate request priorities - - Handle both online and offline scenarios in UI - - Provide clear feedback about connection status - -## Technical Implementation - -### Key Components - -1. 
**ConnectionManager** (`connectionManager.ts`): - - Monitors connection state - - Implements retry logic with exponential backoff - - Provides connection state subscriptions - -2. **RequestQueueManager** (`requestQueue.ts`): - - Queues failed requests - - Integrates with ConnectionManager - - Handles request prioritization - -3. **ConnectionUI** (`connectionUI.ts`): - - Displays connection status - - Shows offline banner - - Updates queue information - -### Integration Example - -```typescript -// Automatic integration through RequestQueueManager -const queue = RequestQueueManager.getInstance(); -const data = await queue.enqueue( - 'translate', // Request type - async () => { - // Your fetch request - const response = await fetch('/api/translate', options); - return response.json(); - }, - 5 // Priority (1-10, higher = more important) -); -``` - -## Troubleshooting - -### Connection Not Detected -- Check browser permissions for network status -- Ensure health endpoint (/health) is accessible -- Verify no firewall/proxy blocking - -### Requests Not Retrying -- Check browser console for errors -- Verify request type is retryable -- Check if max retries exceeded - -### Queue Not Processing -- Manually trigger retry with button -- Check if requests are timing out -- Verify server is responding - -## Future Enhancements - -- Persistent queue storage (survive page refresh) -- Configurable retry strategies per request type -- Network speed detection and adaptation -- Progressive web app offline mode \ No newline at end of file diff --git a/CORS_CONFIG.md b/CORS_CONFIG.md deleted file mode 100644 index 3741fe8..0000000 --- a/CORS_CONFIG.md +++ /dev/null @@ -1,152 +0,0 @@ -# CORS Configuration Guide - -This document explains how to configure Cross-Origin Resource Sharing (CORS) for the Talk2Me application. - -## Overview - -CORS is configured using Flask-CORS to enable secure cross-origin usage of the API endpoints. 
This allows the Talk2Me application to be embedded in other websites or accessed from different domains while maintaining security. - -## Environment Variables - -### `CORS_ORIGINS` - -Controls which domains are allowed to access the API endpoints. - -- **Default**: `*` (allows all origins - use only for development) -- **Production Example**: `https://yourdomain.com,https://app.yourdomain.com` -- **Format**: Comma-separated list of allowed origins - -```bash -# Development (allows all origins) -export CORS_ORIGINS="*" - -# Production (restrict to specific domains) -export CORS_ORIGINS="https://talk2me.example.com,https://app.example.com" -``` - -### `ADMIN_CORS_ORIGINS` - -Controls which domains can access admin endpoints (more restrictive). - -- **Default**: `http://localhost:*` (allows all localhost ports) -- **Production Example**: `https://admin.yourdomain.com` -- **Format**: Comma-separated list of allowed admin origins - -```bash -# Development -export ADMIN_CORS_ORIGINS="http://localhost:*" - -# Production -export ADMIN_CORS_ORIGINS="https://admin.talk2me.example.com" -``` - -## Configuration Details - -The CORS configuration includes: - -- **Allowed Methods**: GET, POST, OPTIONS -- **Allowed Headers**: Content-Type, Authorization, X-Requested-With, X-Admin-Token -- **Exposed Headers**: Content-Range, X-Content-Range -- **Credentials Support**: Enabled (supports cookies and authorization headers) -- **Max Age**: 3600 seconds (preflight requests cached for 1 hour) - -## Endpoints - -All endpoints have CORS enabled with the following configuration: - -### Regular API Endpoints -- `/api/*` -- `/transcribe` -- `/translate` -- `/translate/stream` -- `/speak` -- `/get_audio/*` -- `/check_tts_server` -- `/update_tts_config` -- `/health/*` - -### Admin Endpoints (More Restrictive) -- `/admin/*` - Uses `ADMIN_CORS_ORIGINS` instead of general `CORS_ORIGINS` - -## Security Best Practices - -1. **Never use `*` in production** - Always specify exact allowed origins -2. 
**Use HTTPS** - Always use HTTPS URLs in production CORS origins -3. **Separate admin origins** - Keep admin endpoints on a separate, more restrictive origin list -4. **Review regularly** - Periodically review and update allowed origins - -## Example Configurations - -### Local Development -```bash -export CORS_ORIGINS="*" -export ADMIN_CORS_ORIGINS="http://localhost:*" -``` - -### Staging Environment -```bash -export CORS_ORIGINS="https://staging.talk2me.com,https://staging-app.talk2me.com" -export ADMIN_CORS_ORIGINS="https://staging-admin.talk2me.com" -``` - -### Production Environment -```bash -export CORS_ORIGINS="https://talk2me.com,https://app.talk2me.com" -export ADMIN_CORS_ORIGINS="https://admin.talk2me.com" -``` - -### Mobile App Integration -```bash -# Include mobile app schemes if needed -export CORS_ORIGINS="https://talk2me.com,https://app.talk2me.com,capacitor://localhost,ionic://localhost" -``` - -## Testing CORS Configuration - -You can test CORS configuration using curl: - -```bash -# Test preflight request -curl -X OPTIONS https://your-api.com/api/transcribe \ - -H "Origin: https://allowed-origin.com" \ - -H "Access-Control-Request-Method: POST" \ - -H "Access-Control-Request-Headers: Content-Type" \ - -v - -# Test actual request -curl -X POST https://your-api.com/api/transcribe \ - -H "Origin: https://allowed-origin.com" \ - -H "Content-Type: application/json" \ - -d '{"test": "data"}' \ - -v -``` - -## Troubleshooting - -### CORS Errors in Browser Console - -If you see CORS errors: - -1. Check that the origin is included in `CORS_ORIGINS` -2. Ensure the URL protocol matches (http vs https) -3. Check for trailing slashes in origins -4. Verify environment variables are set correctly - -### Common Issues - -1. **"No 'Access-Control-Allow-Origin' header"** - - Origin not in allowed list - - Check `CORS_ORIGINS` environment variable - -2. 
**"CORS policy: The request client is not a secure context"** - - Using HTTP instead of HTTPS - - Update to use HTTPS in production - -3. **"CORS policy: Credentials flag is true, but Access-Control-Allow-Credentials is not 'true'"** - - This should not occur with current configuration - - Check that `supports_credentials` is True in CORS config - -## Additional Resources - -- [MDN CORS Documentation](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) -- [Flask-CORS Documentation](https://flask-cors.readthedocs.io/) \ No newline at end of file diff --git a/ERROR_LOGGING.md b/ERROR_LOGGING.md deleted file mode 100644 index f77cd2a..0000000 --- a/ERROR_LOGGING.md +++ /dev/null @@ -1,460 +0,0 @@ -# Error Logging Documentation - -This document describes the comprehensive error logging system implemented in Talk2Me for debugging production issues. - -## Overview - -Talk2Me implements a structured logging system that provides: -- JSON-formatted structured logs for easy parsing -- Multiple log streams (app, errors, access, security, performance) -- Automatic log rotation to prevent disk space issues -- Request tracing with unique IDs -- Performance metrics collection -- Security event tracking -- Error deduplication and frequency tracking - -## Log Types - -### 1. Application Logs (`logs/talk2me.log`) -General application logs including info, warnings, and debug messages. - -```json -{ - "timestamp": "2024-01-15T10:30:45.123Z", - "level": "INFO", - "logger": "talk2me", - "message": "Whisper model loaded successfully", - "app": "talk2me", - "environment": "production", - "hostname": "server-1", - "thread": "MainThread", - "process": 12345 -} -``` - -### 2. Error Logs (`logs/errors.log`) -Dedicated error logging with full exception details and stack traces. 
- -```json -{ - "timestamp": "2024-01-15T10:31:00.456Z", - "level": "ERROR", - "logger": "talk2me.errors", - "message": "Error in transcribe: File too large", - "exception": { - "type": "ValueError", - "message": "Audio file exceeds maximum size", - "traceback": ["...full stack trace..."] - }, - "request_id": "1234567890-abcdef", - "endpoint": "transcribe", - "method": "POST", - "path": "/transcribe", - "ip": "192.168.1.100" -} -``` - -### 3. Access Logs (`logs/access.log`) -HTTP request/response logging for traffic analysis. - -```json -{ - "timestamp": "2024-01-15T10:32:00.789Z", - "level": "INFO", - "message": "request_complete", - "request_id": "1234567890-abcdef", - "method": "POST", - "path": "/transcribe", - "status": 200, - "duration_ms": 1250, - "content_length": 4096, - "ip": "192.168.1.100", - "user_agent": "Mozilla/5.0..." -} -``` - -### 4. Security Logs (`logs/security.log`) -Security-related events and suspicious activities. - -```json -{ - "timestamp": "2024-01-15T10:33:00.123Z", - "level": "WARNING", - "message": "Security event: rate_limit_exceeded", - "event": "rate_limit_exceeded", - "severity": "warning", - "ip": "192.168.1.100", - "endpoint": "/transcribe", - "attempts": 15, - "blocked": true -} -``` - -### 5. Performance Logs (`logs/performance.log`) -Performance metrics and slow request tracking. 
- -```json -{ - "timestamp": "2024-01-15T10:34:00.456Z", - "level": "INFO", - "message": "Performance metric: transcribe_audio", - "metric": "transcribe_audio", - "duration_ms": 2500, - "function": "transcribe", - "module": "app", - "request_id": "1234567890-abcdef" -} -``` - -## Configuration - -### Environment Variables - -```bash -# Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL) -export LOG_LEVEL=INFO - -# Log file paths -export LOG_FILE=logs/talk2me.log -export ERROR_LOG_FILE=logs/errors.log - -# Log rotation settings -export LOG_MAX_BYTES=52428800 # 50MB -export LOG_BACKUP_COUNT=10 # Keep 10 backup files - -# Environment -export FLASK_ENV=production -``` - -### Flask Configuration - -```python -app.config.update({ - 'LOG_LEVEL': 'INFO', - 'LOG_FILE': 'logs/talk2me.log', - 'ERROR_LOG_FILE': 'logs/errors.log', - 'LOG_MAX_BYTES': 50 * 1024 * 1024, - 'LOG_BACKUP_COUNT': 10 -}) -``` - -## Admin API Endpoints - -### GET /admin/logs/errors -View recent error logs and error frequency statistics. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/errors -``` - -Response: -```json -{ - "error_summary": { - "abc123def456": { - "count_last_hour": 5, - "last_seen": 1705320000 - } - }, - "recent_errors": [...], - "total_errors_logged": 150 -} -``` - -### GET /admin/logs/performance -View performance metrics and slow requests. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/performance -``` - -Response: -```json -{ - "performance_metrics": { - "transcribe_audio": { - "avg_ms": 850.5, - "max_ms": 3200, - "min_ms": 125, - "count": 1024 - } - }, - "slow_requests": [ - { - "metric": "transcribe_audio", - "duration_ms": 3200, - "timestamp": "2024-01-15T10:35:00Z" - } - ] -} -``` - -### GET /admin/logs/security -View security events and suspicious activities. 
- -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/security -``` - -Response: -```json -{ - "security_events": [...], - "event_summary": { - "rate_limit_exceeded": 25, - "suspicious_error": 3, - "high_error_rate": 1 - }, - "total_events": 29 -} -``` - -## Usage Patterns - -### 1. Logging Errors with Context - -```python -from error_logger import log_exception - -try: - # Some operation - process_audio(file) -except Exception as e: - log_exception( - e, - message="Failed to process audio", - user_id=user.id, - file_size=file.size, - file_type=file.content_type - ) -``` - -### 2. Performance Monitoring - -```python -from error_logger import log_performance - -@log_performance('expensive_operation') -def process_large_file(file): - # This will automatically log execution time - return processed_data -``` - -### 3. Security Event Logging - -```python -app.error_logger.log_security( - 'unauthorized_access', - severity='warning', - ip=request.remote_addr, - attempted_resource='/admin', - user_agent=request.headers.get('User-Agent') -) -``` - -### 4. Request Context - -```python -from error_logger import log_context - -with log_context(user_id=user.id, feature='translation'): - # All logs within this context will include user_id and feature - translate_text(text) -``` - -## Log Analysis - -### Finding Specific Errors - -```bash -# Find all authentication errors -grep '"error_type":"AuthenticationError"' logs/errors.log | jq . - -# Find errors from specific IP -grep '"ip":"192.168.1.100"' logs/errors.log | jq . - -# Find errors in last hour -grep "$(date -u -d '1 hour ago' +%Y-%m-%dT%H)" logs/errors.log | jq . 
-``` - -### Performance Analysis - -```bash -# Find slow requests (>2000ms) -jq 'select(.extra_fields.duration_ms > 2000)' logs/performance.log - -# Calculate average response time for endpoint -jq 'select(.extra_fields.metric == "transcribe_audio") | .extra_fields.duration_ms' logs/performance.log | awk '{sum+=$1; count++} END {print sum/count}' -``` - -### Security Monitoring - -```bash -# Count security events by type -jq '.extra_fields.event' logs/security.log | sort | uniq -c - -# Find all blocked IPs -jq 'select(.extra_fields.blocked == true) | .extra_fields.ip' logs/security.log | sort -u -``` - -## Log Rotation - -Logs are automatically rotated based on size or time: - -- **Application/Error logs**: Rotate at 50MB, keep 10 backups -- **Access logs**: Daily rotation, keep 30 days -- **Performance logs**: Hourly rotation, keep 7 days -- **Security logs**: Rotate at 50MB, keep 10 backups - -Rotated logs are named with numeric suffixes: -- `talk2me.log` (current) -- `talk2me.log.1` (most recent backup) -- `talk2me.log.2` (older backup) -- etc. - -## Best Practices - -### 1. Structured Logging - -Always include relevant context: -```python -logger.info("User action completed", extra={ - 'extra_fields': { - 'user_id': user.id, - 'action': 'upload_audio', - 'file_size': file.size, - 'duration_ms': processing_time - } -}) -``` - -### 2. Error Handling - -Log errors at appropriate levels: -```python -try: - result = risky_operation() -except ValidationError as e: - logger.warning(f"Validation failed: {e}") # Expected errors -except Exception as e: - logger.error(f"Unexpected error: {e}", exc_info=True) # Unexpected errors -``` - -### 3. Performance Tracking - -Track key operations: -```python -start = time.time() -result = expensive_operation() -duration = (time.time() - start) * 1000 - -app.error_logger.log_performance( - 'expensive_operation', - value=duration, - input_size=len(data), - output_size=len(result) -) -``` - -### 4. 
Security Awareness - -Log security-relevant events: -```python -if failed_attempts > 3: - app.error_logger.log_security( - 'multiple_failed_attempts', - severity='warning', - ip=request.remote_addr, - attempts=failed_attempts - ) -``` - -## Monitoring Integration - -### Prometheus Metrics - -Export log metrics for Prometheus: -```python -@app.route('/metrics') -def prometheus_metrics(): - error_summary = app.error_logger.get_error_summary() - # Format as Prometheus metrics - return format_prometheus_metrics(error_summary) -``` - -### ELK Stack - -Ship logs to Elasticsearch: -```yaml -filebeat.inputs: -- type: log - paths: - - /app/logs/*.log - json.keys_under_root: true - json.add_error_key: true -``` - -### CloudWatch - -For AWS deployments: -```python -# Install boto3 and watchtower -import watchtower -cloudwatch_handler = watchtower.CloudWatchLogHandler() -logger.addHandler(cloudwatch_handler) -``` - -## Troubleshooting - -### Common Issues - -#### 1. Logs Not Being Written - -Check permissions: -```bash -ls -la logs/ -# Should show write permissions for app user -``` - -Create logs directory: -```bash -mkdir -p logs -chmod 755 logs -``` - -#### 2. Disk Space Issues - -Monitor log sizes: -```bash -du -sh logs/* -``` - -Force rotation: -```bash -# Manually rotate logs -mv logs/talk2me.log logs/talk2me.log.backup -# App will create new log file -``` - -#### 3. Performance Impact - -If logging impacts performance: -- Increase LOG_LEVEL to WARNING or ERROR -- Reduce backup count -- Use asynchronous logging (future enhancement) - -## Security Considerations - -1. **Log Sanitization**: Sensitive data is automatically masked -2. **Access Control**: Admin endpoints require authentication -3. **Log Retention**: Old logs are automatically deleted -4. **Encryption**: Consider encrypting logs at rest in production -5. **Audit Trail**: All log access is itself logged - -## Future Enhancements - -1. **Centralized Logging**: Ship logs to centralized service -2. 
**Real-time Alerts**: Trigger alerts on error patterns -3. **Log Analytics**: Built-in log analysis dashboard -4. **Correlation IDs**: Track requests across microservices -5. **Async Logging**: Reduce performance impact \ No newline at end of file diff --git a/GPU_SUPPORT.md b/GPU_SUPPORT.md deleted file mode 100644 index f67ad6d..0000000 --- a/GPU_SUPPORT.md +++ /dev/null @@ -1,68 +0,0 @@ -# GPU Support for Talk2Me - -## Current GPU Support Status - -### ✅ NVIDIA GPUs (Full Support) -- **Requirements**: CUDA 11.x or 12.x -- **Optimizations**: - - TensorFloat-32 (TF32) for Ampere GPUs (RTX 30xx, A100) - - cuDNN auto-tuning - - Half-precision (FP16) inference - - CUDA kernel pre-caching - - Memory pre-allocation - -### ⚠️ AMD GPUs (Limited Support) -- **Requirements**: ROCm 5.x installation -- **Status**: Falls back to CPU unless ROCm is properly configured -- **To enable AMD GPU**: - ```bash - # Install PyTorch with ROCm support - pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6 - ``` -- **Limitations**: - - No cuDNN optimizations - - May have compatibility issues - - Performance varies by GPU model - -### ✅ Apple Silicon (M1/M2/M3) -- **Requirements**: macOS 12.3+ -- **Status**: Uses Metal Performance Shaders (MPS) -- **Optimizations**: - - Native Metal acceleration - - Unified memory architecture benefits - - No FP16 (not well supported on MPS yet) - -### 📊 Performance Comparison - -| GPU Type | First Transcription | Subsequent | Notes | -|----------|-------------------|------------|-------| -| NVIDIA RTX 3080 | ~2s | ~0.5s | Full optimizations | -| AMD RX 6800 XT | ~3-4s | ~1-2s | With ROCm | -| Apple M2 | ~2.5s | ~1s | MPS acceleration | -| CPU (i7-12700K) | ~5-10s | ~5-10s | No acceleration | - -## Checking Your GPU Status - -Run the app and check the logs: -``` -INFO: NVIDIA GPU detected - using CUDA acceleration -INFO: GPU memory allocated: 542.00 MB -INFO: Whisper model loaded and optimized for NVIDIA GPU -``` - 
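The log lines above reflect a device choice made at startup. As a minimal sketch of how such a check might look, assuming PyTorch is the backend: the function name `detect_device` is hypothetical and is not taken from the Talk2Me source. It relies only on `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, the same calls used in the troubleshooting commands in this document.

```python
def detect_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports.

    Hypothetical helper for illustration; not the application's actual code.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate with.
        return "cpu"

    # ROCm builds of PyTorch also report through torch.cuda, so this
    # branch covers both NVIDIA and correctly configured AMD GPUs.
    if torch.cuda.is_available():
        return "cuda"

    # Older PyTorch releases have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"Selected device: {detect_device()}")
```

Running this in the same Python environment as the app shows which backend PyTorch can actually see, which helps separate driver problems from application problems.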
-## Troubleshooting - -### AMD GPU Not Detected -1. Install ROCm-compatible PyTorch -2. Set environment variable: `export HSA_OVERRIDE_GFX_VERSION=10.3.0` -3. Check with: `rocm-smi` - -### NVIDIA GPU Not Used -1. Check CUDA installation: `nvidia-smi` -2. Verify PyTorch CUDA: `python -c "import torch; print(torch.cuda.is_available())"` -3. Install CUDA toolkit if needed - -### Apple Silicon Not Accelerated -1. Update macOS to 12.3+ -2. Update PyTorch: `pip install --upgrade torch` -3. Check MPS: `python -c "import torch; print(torch.backends.mps.is_available())"` \ No newline at end of file diff --git a/MEMORY_MANAGEMENT.md b/MEMORY_MANAGEMENT.md deleted file mode 100644 index ffdce8d..0000000 --- a/MEMORY_MANAGEMENT.md +++ /dev/null @@ -1,285 +0,0 @@ -# Memory Management Documentation - -This document describes the comprehensive memory management system implemented in Talk2Me to prevent memory leaks and crashes after extended use. - -## Overview - -Talk2Me implements a dual-layer memory management system: -1. **Backend (Python)**: Manages GPU memory, Whisper model, and temporary files -2. **Frontend (JavaScript)**: Manages audio blobs, object URLs, and Web Audio contexts - -## Memory Leak Issues Addressed - -### Backend Memory Leaks - -1. **GPU Memory Fragmentation** - - Whisper model accumulates GPU memory over time - - Solution: Periodic GPU cache clearing and model reloading - -2. **Temporary File Accumulation** - - Audio files not cleaned up quickly enough under load - - Solution: Aggressive cleanup with tracking and periodic sweeps - -3. **Session Resource Leaks** - - Long-lived sessions accumulate resources - - Solution: Integration with session manager for resource limits - -### Frontend Memory Leaks - -1. **Audio Blob Leaks** - - MediaRecorder chunks kept in memory - - Solution: SafeMediaRecorder wrapper with automatic cleanup - -2. **Object URL Leaks** - - URLs created but not revoked - - Solution: Centralized tracking and automatic revocation - -3. 
**AudioContext Leaks** - - Contexts created but never closed - - Solution: MemoryManager tracks and closes contexts - -4. **MediaStream Leaks** - - Microphone streams not properly stopped - - Solution: Automatic track stopping and stream cleanup - -## Backend Memory Management - -### MemoryManager Class - -The `MemoryManager` monitors and manages memory usage: - -```python -memory_manager = MemoryManager(app, { - 'memory_threshold_mb': 4096, # 4GB process memory limit - 'gpu_memory_threshold_mb': 2048, # 2GB GPU memory limit - 'cleanup_interval': 30 # Check every 30 seconds -}) -``` - -### Features - -1. **Automatic Monitoring** - - Background thread checks memory usage - - Triggers cleanup when thresholds exceeded - - Logs statistics every 5 minutes - -2. **GPU Memory Management** - - Clears CUDA cache after each operation - - Reloads Whisper model if fragmentation detected - - Tracks reload count and timing - -3. **Temporary File Cleanup** - - Tracks all temporary files - - Age-based cleanup (5 minutes normal, 1 minute aggressive) - - Cleanup on process exit - -4. **Context Managers** - ```python - with AudioProcessingContext(memory_manager) as ctx: - # Process audio - ctx.add_temp_file(temp_path) - # Files automatically cleaned up - ``` - -### Admin Endpoints - -- `GET /admin/memory` - View current memory statistics -- `POST /admin/memory/cleanup` - Trigger manual cleanup - -## Frontend Memory Management - -### MemoryManager Class - -Centralized tracking of all browser resources: - -```typescript -const memoryManager = MemoryManager.getInstance(); - -// Register resources -memoryManager.registerAudioContext(context); -memoryManager.registerObjectURL(url); -memoryManager.registerMediaStream(stream); -``` - -### SafeMediaRecorder - -Wrapper for MediaRecorder with automatic cleanup: - -```typescript -const recorder = new SafeMediaRecorder(); -await recorder.start(constraints); -// Recording... 
-const blob = await recorder.stop(); // Automatically cleans up -``` - -### AudioBlobHandler - -Safe handling of audio blobs and object URLs: - -```typescript -const handler = new AudioBlobHandler(blob); -const url = handler.getObjectURL(); // Tracked automatically -// Use URL... -handler.cleanup(); // Revokes URL and clears references -``` - -## Memory Thresholds - -### Backend Thresholds - -| Resource | Default Limit | Configurable Via | -|----------|--------------|------------------| -| Process Memory | 4096 MB | MEMORY_THRESHOLD_MB | -| GPU Memory | 2048 MB | GPU_MEMORY_THRESHOLD_MB | -| Temp File Age | 300 seconds | Built-in | -| Model Reload Interval | 300 seconds | Built-in | - -### Frontend Thresholds - -| Resource | Cleanup Trigger | -|----------|----------------| -| Closed AudioContexts | Every 30 seconds | -| Stopped MediaStreams | Every 30 seconds | -| Orphaned Object URLs | On navigation/unload | - -## Best Practices - -### Backend - -1. **Use Context Managers** - ```python - @with_memory_management - def process_audio(): - # Automatic cleanup - ``` - -2. **Register Temporary Files** - ```python - register_temp_file(path) - ctx.add_temp_file(path) - ``` - -3. **Clear GPU Memory** - ```python - torch.cuda.empty_cache() - torch.cuda.synchronize() - ``` - -### Frontend - -1. **Use Safe Wrappers** - ```typescript - // Don't use raw MediaRecorder - const recorder = new SafeMediaRecorder(); - ``` - -2. **Clean Up Handlers** - ```typescript - if (audioHandler) { - audioHandler.cleanup(); - } - ``` - -3. 
**Register All Resources** - ```typescript - const context = new AudioContext(); - memoryManager.registerAudioContext(context); - ``` - -## Monitoring - -### Backend Monitoring - -```bash -# View memory stats -curl -H "X-Admin-Token: token" http://localhost:5005/admin/memory - -# Response -{ - "memory": { - "process_mb": 850.5, - "system_percent": 45.2, - "gpu_mb": 1250.0, - "gpu_percent": 61.0 - }, - "temp_files": { - "count": 5, - "size_mb": 12.5 - }, - "model": { - "reload_count": 2, - "last_reload": "2024-01-15T10:30:00" - } -} -``` - -### Frontend Monitoring - -```javascript -// Get memory stats -const stats = memoryManager.getStats(); -console.log('Active contexts:', stats.audioContexts); -console.log('Object URLs:', stats.objectURLs); -``` - -## Troubleshooting - -### High Memory Usage - -1. **Check Current Usage** - ```bash - curl -H "X-Admin-Token: token" http://localhost:5005/admin/memory - ``` - -2. **Trigger Manual Cleanup** - ```bash - curl -X POST -H "X-Admin-Token: token" \ - http://localhost:5005/admin/memory/cleanup - ``` - -3. **Check Logs** - ```bash - grep "Memory" logs/talk2me.log - grep "GPU memory" logs/talk2me.log - ``` - -### Memory Leak Symptoms - -1. **Backend** - - Process memory continuously increasing - - GPU memory not returning to baseline - - Temp files accumulating in upload folder - - Slower transcription over time - -2. **Frontend** - - Browser tab memory increasing - - Page becoming unresponsive - - Audio playback issues - - Console errors about contexts - -### Debug Mode - -Enable debug logging: -```python -# Backend -app.config['DEBUG_MEMORY'] = True - -# Frontend (in console) -localStorage.setItem('DEBUG_MEMORY', 'true'); -``` - -## Performance Impact - -Memory management adds minimal overhead: -- Backend: ~30ms per cleanup cycle -- Frontend: <5ms per resource registration -- Cleanup operations are non-blocking -- Model reloading takes ~2-3 seconds (rare) - -## Future Enhancements - -1. 
**Predictive Cleanup**: Clean resources based on usage patterns -2. **Memory Pooling**: Reuse audio buffers and contexts -3. **Distributed Memory**: Share memory stats across instances -4. **Alert System**: Notify admins of memory issues -5. **Auto-scaling**: Scale resources based on memory pressure \ No newline at end of file diff --git a/PRODUCTION_DEPLOYMENT.md b/PRODUCTION_DEPLOYMENT.md deleted file mode 100644 index 2de2481..0000000 --- a/PRODUCTION_DEPLOYMENT.md +++ /dev/null @@ -1,435 +0,0 @@ -# Production Deployment Guide - -This guide covers deploying Talk2Me in a production environment using a proper WSGI server. - -## Overview - -The Flask development server is not suitable for production use. This guide covers: -- Gunicorn as the WSGI server -- Nginx as a reverse proxy -- Docker for containerization -- Systemd for process management -- Security best practices - -## Quick Start with Docker - -### 1. Using Docker Compose - -```bash -# Clone the repository -git clone https://github.com/your-repo/talk2me.git -cd talk2me - -# Create .env file with production settings -cat > .env < backup.sql - -# Redis -redis-cli BGSAVE -``` - -### Application Backup - -```bash -# Backup application and logs -tar -czf talk2me-backup.tar.gz \ - /opt/talk2me \ - /var/log/talk2me \ - /etc/systemd/system/talk2me.service \ - /etc/nginx/sites-available/talk2me -``` - -## Troubleshooting - -### Service Won't Start - -```bash -# Check service status -systemctl status talk2me - -# Check logs -journalctl -u talk2me -n 100 - -# Test configuration -sudo -u talk2me /opt/talk2me/venv/bin/gunicorn --check-config wsgi:application -``` - -### High Memory Usage - -```bash -# Trigger cleanup -curl -X POST -H "X-Admin-Token: token" http://localhost:5005/admin/memory/cleanup - -# Restart workers -systemctl reload talk2me -``` - -### Slow Response Times - -1. Check worker count -2. Enable async workers -3. Check GPU availability -4. 
Review nginx buffering settings - -## Performance Optimization - -### 1. Enable GPU - -Ensure CUDA/ROCm is properly installed: - -```bash -# Check GPU -nvidia-smi # or rocm-smi - -# Set in environment -export CUDA_VISIBLE_DEVICES=0 -``` - -### 2. Optimize Workers - -```python -# For CPU-heavy workloads -workers = cpu_count() -threads = 1 - -# For I/O-heavy workloads -workers = cpu_count() * 2 -threads = 4 -``` - -### 3. Enable Caching - -Use Redis for caching translations: - -```python -CACHE_TYPE = 'redis' -CACHE_REDIS_URL = 'redis://localhost:6379/0' -``` - -## Maintenance - -### Regular Tasks - -1. **Log Rotation**: Configured automatically -2. **Database Cleanup**: Run weekly -3. **Model Updates**: Check for Whisper updates -4. **Security Updates**: Keep dependencies updated - -### Update Procedure - -```bash -# Backup first -./backup.sh - -# Update code -git pull - -# Update dependencies -sudo -u talk2me /opt/talk2me/venv/bin/pip install -r requirements-prod.txt - -# Restart service -sudo systemctl restart talk2me -``` - -## Rollback - -If deployment fails: - -```bash -# Stop service -sudo systemctl stop talk2me - -# Restore backup -tar -xzf talk2me-backup.tar.gz -C / - -# Restart service -sudo systemctl start talk2me -``` \ No newline at end of file diff --git a/RATE_LIMITING.md b/RATE_LIMITING.md deleted file mode 100644 index 9c15e0c..0000000 --- a/RATE_LIMITING.md +++ /dev/null @@ -1,235 +0,0 @@ -# Rate Limiting Documentation - -This document describes the rate limiting implementation in Talk2Me to protect against DoS attacks and resource exhaustion. 
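The token bucket that this document describes (a per-client burst size plus a fixed token refresh rate) can be sketched as follows. This is an illustration under stated assumptions, not the code in `rate_limiter.py`: the class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are all hypothetical. The example values (burst size 3, 0.167 tokens per second) are the `/transcribe` settings listed in this document. The sliding-window counters and IP blocking are not covered.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket; names here are assumptions, not Talk2Me's API."""

    def __init__(
        self,
        burst_size: int,
        refresh_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = float(burst_size)   # maximum tokens the bucket holds
        self.refresh_rate = refresh_rate    # tokens added per second
        self.clock = clock                  # injectable so tests can fake time
        self.tokens = float(burst_size)     # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refresh_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(burst_size=3, refresh_rate=0.167)
    # The first three calls drain the burst; the fourth is denied.
    print([bucket.allow() for _ in range(4)])
```

Passing the clock in as a parameter keeps the refill arithmetic deterministic under test. A real limiter would also need one bucket per client and some form of locking if requests are handled concurrently.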
- -## Overview - -Talk2Me implements a comprehensive rate limiting system with: -- Token bucket algorithm with sliding window -- Per-endpoint configurable limits -- IP-based blocking (temporary and permanent) -- Global request limits -- Concurrent request throttling -- Request size validation - -## Rate Limits by Endpoint - -### Transcription (`/transcribe`) -- **Per Minute**: 10 requests -- **Per Hour**: 100 requests -- **Burst Size**: 3 requests -- **Max Request Size**: 10MB -- **Token Refresh**: 1 token per 6 seconds - -### Translation (`/translate`) -- **Per Minute**: 20 requests -- **Per Hour**: 300 requests -- **Burst Size**: 5 requests -- **Max Request Size**: 100KB -- **Token Refresh**: 1 token per 3 seconds - -### Streaming Translation (`/translate/stream`) -- **Per Minute**: 10 requests -- **Per Hour**: 150 requests -- **Burst Size**: 3 requests -- **Max Request Size**: 100KB -- **Token Refresh**: 1 token per 6 seconds - -### Text-to-Speech (`/speak`) -- **Per Minute**: 15 requests -- **Per Hour**: 200 requests -- **Burst Size**: 3 requests -- **Max Request Size**: 50KB -- **Token Refresh**: 1 token per 4 seconds - -### API Endpoints -- Push notifications, error logging: Various limits (see code) - -## Global Limits - -- **Total Requests Per Minute**: 1,000 (across all endpoints) -- **Total Requests Per Hour**: 10,000 -- **Concurrent Requests**: 50 maximum - -## Rate Limiting Headers - -Successful responses include: -``` -X-RateLimit-Limit: 20 -X-RateLimit-Remaining: 15 -X-RateLimit-Reset: 1234567890 -``` - -Rate limited responses (429) include: -``` -X-RateLimit-Limit: 20 -X-RateLimit-Remaining: 0 -X-RateLimit-Reset: 1234567890 -Retry-After: 60 -``` - -## Client Identification - -Clients are identified by: -- IP address (including X-Forwarded-For support) -- User-Agent string -- Combined hash for uniqueness - -## Automatic Blocking - -IPs are temporarily blocked for 1 hour if: -- They exceed 100 requests per minute -- They repeatedly hit rate limits -- 
They exhibit suspicious patterns - -## Configuration - -### Environment Variables - -```bash -# No direct environment variables for rate limiting -# Configured in code - can be extended to use env vars -``` - -### Programmatic Configuration - -Rate limits can be adjusted in `rate_limiter.py`: - -```python -self.endpoint_limits = { - '/transcribe': { - 'requests_per_minute': 10, - 'requests_per_hour': 100, - 'burst_size': 3, - 'token_refresh_rate': 0.167, - 'max_request_size': 10 * 1024 * 1024 # 10MB - } -} -``` - -## Admin Endpoints - -### Get Rate Limit Configuration -```bash -curl -H "X-Admin-Token: your-admin-token" \ - http://localhost:5005/admin/rate-limits -``` - -### Get Rate Limit Statistics -```bash -# Global stats -curl -H "X-Admin-Token: your-admin-token" \ - http://localhost:5005/admin/rate-limits/stats - -# Client-specific stats -curl -H "X-Admin-Token: your-admin-token" \ - http://localhost:5005/admin/rate-limits/stats?client_id=abc123 -``` - -### Block IP Address -```bash -# Temporary block (1 hour) -curl -X POST -H "X-Admin-Token: your-admin-token" \ - -H "Content-Type: application/json" \ - -d '{"ip": "192.168.1.100", "duration": 3600}' \ - http://localhost:5005/admin/block-ip - -# Permanent block -curl -X POST -H "X-Admin-Token: your-admin-token" \ - -H "Content-Type: application/json" \ - -d '{"ip": "192.168.1.100", "permanent": true}' \ - http://localhost:5005/admin/block-ip -``` - -## Algorithm Details - -### Token Bucket -- Each client gets a bucket with configurable burst size -- Tokens regenerate at a fixed rate -- Requests consume tokens -- Empty bucket = request denied - -### Sliding Window -- Tracks requests in the last minute and hour -- More accurate than fixed windows -- Prevents gaming the system at window boundaries - -## Best Practices - -### For Users -1. Implement exponential backoff when receiving 429 errors -2. Check rate limit headers to avoid hitting limits -3. Cache responses when possible -4. 
Use bulk operations where available - -### For Administrators -1. Monitor rate limit statistics regularly -2. Adjust limits based on usage patterns -3. Use IP blocking sparingly -4. Set up alerts for suspicious activity - -## Error Responses - -### Rate Limited (429) -```json -{ - "error": "Rate limit exceeded (per minute)", - "retry_after": 60 -} -``` - -### Request Too Large (413) -```json -{ - "error": "Request too large" -} -``` - -### IP Blocked (429) -```json -{ - "error": "IP temporarily blocked due to excessive requests" -} -``` - -## Monitoring - -Key metrics to monitor: -- Rate limit hits by endpoint -- Blocked IPs -- Concurrent request peaks -- Request size violations -- Global limit approaches - -## Performance Impact - -- Minimal overhead (~1-2ms per request) -- Memory usage scales with active clients -- Automatic cleanup of old buckets -- Thread-safe implementation - -## Security Considerations - -1. **DoS Protection**: Prevents resource exhaustion -2. **Burst Control**: Limits sudden traffic spikes -3. **Size Validation**: Prevents large payload attacks -4. **IP Blocking**: Stops persistent attackers -5. 
**Global Limits**: Protects overall system capacity - -## Troubleshooting - -### "Rate limit exceeded" errors -- Check client request patterns -- Verify time synchronization -- Look for retry loops -- Check IP blocking status - -### Memory usage increasing -- Verify cleanup thread is running -- Check for client ID explosion -- Monitor bucket count - -### Legitimate users blocked -- Review rate limit settings -- Check for shared IP issues -- Implement IP whitelisting if needed \ No newline at end of file diff --git a/README.md b/README.md index d13507c..d9dd834 100644 --- a/README.md +++ b/README.md @@ -1,9 +1,30 @@ -# Voice Language Translator +# Talk2Me - Real-Time Voice Language Translator -A mobile-friendly web application that translates spoken language between multiple languages using: -- Gemma 3 open-source LLM via Ollama for translation -- OpenAI Whisper for speech-to-text -- OpenAI Edge TTS for text-to-speech +A production-ready, mobile-friendly web application that provides real-time translation of spoken language between multiple languages. 
+ +## Features + +- **Real-time Speech Recognition**: Powered by OpenAI Whisper with GPU acceleration +- **Advanced Translation**: Using Gemma 3 open-source LLM via Ollama +- **Natural Text-to-Speech**: OpenAI Edge TTS for lifelike voice output +- **Progressive Web App**: Full offline support with service workers +- **Multi-Speaker Support**: Track and translate conversations with multiple participants +- **Enterprise Security**: Comprehensive rate limiting, session management, and encrypted secrets +- **Production Ready**: Docker support, load balancing, and extensive monitoring + +## Table of Contents + +- [Supported Languages](#supported-languages) +- [Quick Start](#quick-start) +- [Installation](#installation) +- [Configuration](#configuration) +- [Security Features](#security-features) +- [Production Deployment](#production-deployment) +- [API Documentation](#api-documentation) +- [Development](#development) +- [Monitoring & Operations](#monitoring--operations) +- [Troubleshooting](#troubleshooting) +- [Contributing](#contributing) ## Supported Languages @@ -22,68 +43,135 @@ A mobile-friendly web application that translates spoken language between multip - Turkish - Uzbek -## Setup Instructions +## Quick Start -1. 
Install the required Python packages: - ``` +```bash +# Clone the repository +git clone https://github.com/yourusername/talk2me.git +cd talk2me + +# Install dependencies +pip install -r requirements.txt +npm install + +# Initialize secure configuration +python manage_secrets.py init +python manage_secrets.py set TTS_API_KEY your-api-key-here + +# Ensure Ollama is running with Gemma +ollama pull gemma2:9b +ollama pull gemma3:27b + +# Start the application +python app.py +``` + +Open your browser and navigate to `http://localhost:5005` + +## Installation + +### Prerequisites + +- Python 3.8+ +- Node.js 14+ +- Ollama (for LLM translation) +- OpenAI Edge TTS server +- Optional: NVIDIA GPU with CUDA, AMD GPU with ROCm, or Apple Silicon + +### Detailed Setup + +1. **Install Python dependencies**: + ```bash + python -m venv venv + source venv/bin/activate # On Windows: venv\Scripts\activate pip install -r requirements.txt ``` -2. Configure secrets and environment: +2. **Install Node.js dependencies**: ```bash - # Initialize secure secrets management - python manage_secrets.py init + npm install + npm run build # Build TypeScript files + ``` + +3. **Configure GPU Support** (Optional): + ```bash + # For NVIDIA GPUs + pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 - # Set required secrets - python manage_secrets.py set TTS_API_KEY + # For AMD GPUs (ROCm) + pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2 - # Or use traditional .env file - cp .env.example .env - nano .env + # For Apple Silicon + pip install torch torchvision torchaudio ``` + +4. **Set up Ollama**: + ```bash + # Install Ollama (https://ollama.ai) + curl -fsSL https://ollama.ai/install.sh | sh - **⚠️ Security Note**: Talk2Me includes encrypted secrets management. See [SECURITY.md](SECURITY.md) and [SECRETS_MANAGEMENT.md](SECRETS_MANAGEMENT.md) for details. - -3. 
Make sure you have Ollama installed and the Gemma 3 model loaded: - ``` - ollama pull gemma3 + # Pull required models + ollama pull gemma2:9b # Faster, for streaming + ollama pull gemma3:27b # Better quality ``` -4. Ensure your OpenAI Edge TTS server is running on port 5050. +5. **Configure TTS Server**: + Ensure your OpenAI Edge TTS server is running. Default expected at `http://localhost:5050` -5. Run the application: - ``` - python app.py - ``` +## Configuration -6. Open your browser and navigate to: - ``` - http://localhost:8000 - ``` +### Environment Variables -## Usage +Talk2Me uses encrypted secrets management for sensitive configuration. You can use either the secure secrets system or traditional environment variables. -1. Select your source language from the dropdown menu -2. Press the microphone button and speak -3. Press the button again to stop recording -4. Wait for the transcription to complete -5. Select your target language -6. Press the "Translate" button -7. Use the play buttons to hear the original or translated text +#### Using Secure Secrets Management (Recommended) -## Technical Details +```bash +# Initialize the secrets system +python manage_secrets.py init -- The app uses Flask for the web server -- Audio is processed client-side using the MediaRecorder API -- Whisper for speech recognition with language hints -- Ollama provides access to the Gemma 3 model for translation -- OpenAI Edge TTS delivers natural-sounding speech output +# Set required secrets +python manage_secrets.py set TTS_API_KEY +python manage_secrets.py set TTS_SERVER_URL +python manage_secrets.py set ADMIN_TOKEN -## CORS Configuration +# List all secrets +python manage_secrets.py list -The application supports Cross-Origin Resource Sharing (CORS) for secure cross-origin usage. See [CORS_CONFIG.md](CORS_CONFIG.md) for detailed configuration instructions. 
+# Rotate encryption keys +python manage_secrets.py rotate +``` + +#### Using Environment Variables + +Create a `.env` file: + +```env +# Core Configuration +TTS_API_KEY=your-api-key-here +TTS_SERVER_URL=http://localhost:5050/v1/audio/speech +ADMIN_TOKEN=your-secure-admin-token + +# CORS Configuration +CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com +ADMIN_CORS_ORIGINS=https://admin.yourdomain.com + +# Security Settings +SECRET_KEY=your-secret-key-here +MAX_CONTENT_LENGTH=52428800 # 50MB +SESSION_LIFETIME=3600 # 1 hour +RATE_LIMIT_STORAGE_URL=redis://localhost:6379/0 + +# Performance Tuning +WHISPER_MODEL_SIZE=base +GPU_MEMORY_THRESHOLD_MB=2048 +MEMORY_CLEANUP_INTERVAL=30 +``` + +### Advanced Configuration + +#### CORS Settings -Quick setup: ```bash # Development (allow all origins) export CORS_ORIGINS="*" @@ -93,88 +181,549 @@ export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com" export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com" ``` -## Connection Retry & Offline Support +#### Rate Limiting -Talk2Me handles network interruptions gracefully with automatic retry logic: -- Automatic request queuing during connection loss -- Exponential backoff retry with configurable parameters -- Visual connection status indicators -- Priority-based request processing +Configure per-endpoint rate limits: -See [CONNECTION_RETRY.md](CONNECTION_RETRY.md) for detailed documentation. 
+```python +# In your config or via admin API +RATE_LIMITS = { + 'default': {'requests_per_minute': 30, 'requests_per_hour': 500}, + 'transcribe': {'requests_per_minute': 10, 'requests_per_hour': 100}, + 'translate': {'requests_per_minute': 20, 'requests_per_hour': 300} +} +``` -## Rate Limiting +#### Session Management -Comprehensive rate limiting protects against DoS attacks and resource exhaustion: +```python +SESSION_CONFIG = { + 'max_file_size_mb': 100, + 'max_files_per_session': 100, + 'idle_timeout_minutes': 15, + 'max_lifetime_minutes': 60 +} +``` + +## Security Features + +### 1. Rate Limiting + +Comprehensive DoS protection with: - Token bucket algorithm with sliding window - Per-endpoint configurable limits - Automatic IP blocking for abusive clients -- Global request limits and concurrent request throttling - Request size validation -See [RATE_LIMITING.md](RATE_LIMITING.md) for detailed documentation. +```bash +# Check rate limit status +curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/rate-limits -## Session Management +# Block an IP +curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"ip": "192.168.1.100", "duration": 3600}' \ + http://localhost:5005/admin/block-ip +``` -Advanced session management prevents resource leaks from abandoned sessions: -- Automatic tracking of all session resources (audio files, temp files) -- Per-session resource limits (100 files, 100MB) -- Automatic cleanup of idle sessions (15 minutes) and expired sessions (1 hour) -- Real-time monitoring and metrics -- Manual cleanup capabilities for administrators +### 2. Secrets Management -See [SESSION_MANAGEMENT.md](SESSION_MANAGEMENT.md) for detailed documentation. 
+- AES-128 encryption for sensitive data +- Automatic key rotation +- Audit logging +- Platform-specific secure storage -## Request Size Limits +```bash +# View audit log +python manage_secrets.py audit -Comprehensive request size limiting prevents memory exhaustion: -- Global limit: 50MB for any request -- Audio files: 25MB maximum -- JSON payloads: 1MB maximum -- File type detection and enforcement -- Dynamic configuration via admin API +# Backup secrets +python manage_secrets.py export --output backup.enc -See [REQUEST_SIZE_LIMITS.md](REQUEST_SIZE_LIMITS.md) for detailed documentation. +# Restore from backup +python manage_secrets.py import --input backup.enc +``` -## Error Logging +### 3. Session Management -Production-ready error logging system for debugging and monitoring: -- Structured JSON logs for easy parsing -- Multiple log streams (app, errors, access, security, performance) -- Automatic log rotation to prevent disk exhaustion -- Request tracing with unique IDs -- Performance metrics and slow request tracking -- Admin endpoints for log analysis +- Automatic resource tracking +- Per-session limits (100 files, 100MB) +- Idle session cleanup (15 minutes) +- Real-time monitoring -See [ERROR_LOGGING.md](ERROR_LOGGING.md) for detailed documentation. +```bash +# View active sessions +curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/sessions -## Memory Management +# Clean up specific session +curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \ + http://localhost:5005/admin/sessions/SESSION_ID/cleanup +``` -Comprehensive memory leak prevention for extended use: -- GPU memory management with automatic cleanup -- Whisper model reloading to prevent fragmentation -- Frontend resource tracking (audio blobs, contexts, streams) -- Automatic cleanup of temporary files -- Memory monitoring and manual cleanup endpoints +### 4. Request Size Limits -See [MEMORY_MANAGEMENT.md](MEMORY_MANAGEMENT.md) for detailed documentation. 
+- Global limit: 50MB +- Audio files: 25MB +- JSON payloads: 1MB +- Dynamic configuration + +```bash +# Update size limits +curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"max_audio_size": "30MB"}' \ + http://localhost:5005/admin/size-limits +``` ## Production Deployment -For production use, deploy with a proper WSGI server: -- Gunicorn with optimized worker configuration -- Nginx reverse proxy with caching -- Docker/Docker Compose support -- Systemd service management -- Comprehensive security hardening +### Docker Deployment -Quick start: ```bash +# Build and run with Docker Compose docker-compose up -d + +# Scale web workers +docker-compose up -d --scale web=4 + +# View logs +docker-compose logs -f web ``` -See [PRODUCTION_DEPLOYMENT.md](PRODUCTION_DEPLOYMENT.md) for detailed deployment instructions. +### Docker Compose Configuration -## Mobile Support +```yaml +version: '3.8' +services: + web: + build: . + ports: + - "5005:5005" + environment: + - GUNICORN_WORKERS=4 + - GUNICORN_THREADS=2 + volumes: + - ./logs:/app/logs + - whisper-cache:/root/.cache/whisper + deploy: + resources: + limits: + memory: 4G + reservations: + devices: + - driver: nvidia + count: 1 + capabilities: [gpu] +``` -The interface is fully responsive and designed to work well on mobile devices. 
+### Nginx Configuration + +```nginx +upstream talk2me { + least_conn; + server web1:5005 weight=1 max_fails=3 fail_timeout=30s; + server web2:5005 weight=1 max_fails=3 fail_timeout=30s; +} + +server { + listen 443 ssl http2; + server_name talk2me.yourdomain.com; + + ssl_certificate /etc/ssl/certs/talk2me.crt; + ssl_certificate_key /etc/ssl/private/talk2me.key; + + client_max_body_size 50M; + + location / { + proxy_pass http://talk2me; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header Host $host; + + # WebSocket support + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection "upgrade"; + } + + # Cache static assets + location /static/ { + alias /app/static/; + expires 30d; + add_header Cache-Control "public, immutable"; + } +} +``` + +### Systemd Service + +```ini +[Unit] +Description=Talk2Me Translation Service +After=network.target + +[Service] +Type=notify +User=talk2me +Group=talk2me +WorkingDirectory=/opt/talk2me +Environment="PATH=/opt/talk2me/venv/bin" +ExecStart=/opt/talk2me/venv/bin/gunicorn \ + --config gunicorn_config.py \ + --bind 0.0.0.0:5005 \ + app:app +Restart=always +RestartSec=10 + +[Install] +WantedBy=multi-user.target +``` + +## API Documentation + +### Core Endpoints + +#### Transcribe Audio +```http +POST /transcribe +Content-Type: multipart/form-data + +audio: (binary) +source_lang: auto|language_code +``` + +#### Translate Text +```http +POST /translate +Content-Type: application/json + +{ + "text": "Hello world", + "source_lang": "English", + "target_lang": "Spanish" +} +``` + +#### Streaming Translation +```http +POST /translate/stream +Content-Type: application/json + +{ + "text": "Long text to translate", + "source_lang": "auto", + "target_lang": "French" +} + +Response: Server-Sent Events stream +``` + +#### Text-to-Speech +```http +POST /speak +Content-Type: application/json + +{ + "text": "Hola mundo", + "language": 
"Spanish" +} +``` + +### Admin Endpoints + +All admin endpoints require `X-Admin-Token` header. + +#### Health & Monitoring +- `GET /health` - Basic health check +- `GET /health/detailed` - Component status +- `GET /metrics` - Prometheus metrics +- `GET /admin/memory` - Memory usage stats + +#### Session Management +- `GET /admin/sessions` - List active sessions +- `GET /admin/sessions/:id` - Session details +- `POST /admin/sessions/:id/cleanup` - Manual cleanup + +#### Security Controls +- `GET /admin/rate-limits` - View rate limits +- `POST /admin/block-ip` - Block IP address +- `GET /admin/logs/security` - Security events + +## Development + +### TypeScript Development + +```bash +# Install dependencies +npm install + +# Development mode with auto-compilation +npm run dev + +# Build for production +npm run build + +# Type checking +npm run typecheck +``` + +### Project Structure + +``` +talk2me/ +├── app.py # Main Flask application +├── config.py # Configuration management +├── requirements.txt # Python dependencies +├── package.json # Node.js dependencies +├── tsconfig.json # TypeScript configuration +├── gunicorn_config.py # Production server config +├── docker-compose.yml # Container orchestration +├── static/ +│ ├── js/ +│ │ ├── src/ # TypeScript source files +│ │ └── dist/ # Compiled JavaScript +│ ├── css/ # Stylesheets +│ └── icons/ # PWA icons +├── templates/ # HTML templates +├── logs/ # Application logs +└── tests/ # Test suite +``` + +### Key Components + +1. **Connection Management** (`connectionManager.ts`) + - Automatic retry with exponential backoff + - Request queuing during offline periods + - Connection status monitoring + +2. **Translation Cache** (`translationCache.ts`) + - IndexedDB for offline support + - LRU eviction policy + - Automatic cache size management + +3. **Speaker Management** (`speakerManager.ts`) + - Multi-speaker conversation tracking + - Speaker-specific audio handling + - Conversation export functionality + +4. 
**Error Handling** (`errorBoundary.ts`) + - Global error catching + - Automatic error reporting + - User-friendly error messages + +### Running Tests + +```bash +# Python tests +pytest tests/ -v + +# TypeScript tests +npm test + +# Integration tests +python test_integration.py +``` + +## Monitoring & Operations + +### Logging System + +Talk2Me uses structured JSON logging with multiple streams: + +```bash +logs/ +├── talk2me.log # General application log +├── errors.log # Error-specific log +├── access.log # HTTP access log +├── security.log # Security events +└── performance.log # Performance metrics +``` + +View logs: +```bash +# Recent errors +tail -f logs/errors.log | jq '.' + +# Security events +grep "rate_limit_exceeded" logs/security.log | jq '.' + +# Slow requests +jq 'select(.extra_fields.duration_ms > 1000)' logs/performance.log +``` + +### Memory Management + +Talk2Me includes comprehensive memory leak prevention: + +1. **Backend Memory Management** + - GPU memory monitoring + - Automatic model reloading + - Temporary file cleanup + +2. 
**Frontend Memory Management** + - Audio blob cleanup + - WebRTC resource management + - Event listener cleanup + +Monitor memory: +```bash +# Check memory stats +curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/memory + +# Trigger manual cleanup +curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \ + http://localhost:5005/admin/memory/cleanup +``` + +### Performance Tuning + +#### GPU Optimization + +```python +# config.py or environment +GPU_OPTIMIZATIONS = { + 'enabled': True, + 'fp16': True, # Half precision for 2x speedup + 'batch_size': 1, # Adjust based on GPU memory + 'num_workers': 2, # Parallel data loading + 'pin_memory': True # Faster GPU transfer +} +``` + +#### Whisper Optimization + +```python +TRANSCRIBE_OPTIONS = { + 'beam_size': 1, # Faster inference + 'best_of': 1, # Disable multiple attempts + 'temperature': 0, # Deterministic output + 'compression_ratio_threshold': 2.4, + 'logprob_threshold': -1.0, + 'no_speech_threshold': 0.6 +} +``` + +### Scaling Considerations + +1. **Horizontal Scaling** + - Use Redis for shared rate limiting + - Configure sticky sessions for WebSocket + - Share audio files via object storage + +2. **Vertical Scaling** + - Increase worker processes + - Tune thread pool size + - Allocate more GPU memory + +3. 
**Caching Strategy** + - Cache translations in Redis + - Use CDN for static assets + - Enable HTTP caching headers + +## Troubleshooting + +### Common Issues + +#### GPU Not Detected + +```bash +# Check CUDA availability +python -c "import torch; print(torch.cuda.is_available())" + +# Check GPU memory +nvidia-smi + +# For AMD GPUs +rocm-smi + +# For Apple Silicon +python -c "import torch; print(torch.backends.mps.is_available())" +``` + +#### High Memory Usage + +```bash +# Check for memory leaks +curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/health/storage + +# Manual cleanup +curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \ + http://localhost:5005/admin/cleanup +``` + +#### CORS Issues + +```bash +# Test CORS configuration +curl -X OPTIONS http://localhost:5005/api/transcribe \ + -H "Origin: https://yourdomain.com" \ + -H "Access-Control-Request-Method: POST" +``` + +#### TTS Server Connection + +```bash +# Check TTS server status +curl http://localhost:5005/check_tts_server + +# Update TTS configuration +curl -X POST http://localhost:5005/update_tts_config \ + -H "Content-Type: application/json" \ + -d '{"server_url": "http://localhost:5050/v1/audio/speech", "api_key": "new-key"}' +``` + +### Debug Mode + +Enable debug logging: +```bash +export FLASK_ENV=development +export LOG_LEVEL=DEBUG +python app.py +``` + +### Performance Profiling + +```bash +# Enable performance logging +export ENABLE_PROFILING=true + +# View slow requests +jq 'select(.duration_ms > 1000)' logs/performance.log +``` + +## Contributing + +We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details. + +### Development Setup + +1. Fork the repository +2. Create a feature branch (`git checkout -b feature/amazing-feature`) +3. Make your changes +4. Run tests (`pytest && npm test`) +5. Commit your changes (`git commit -m 'Add amazing feature'`) +6. Push to the branch (`git push origin feature/amazing-feature`) +7. 
Open a Pull Request + +### Code Style + +- Python: Follow PEP 8 +- TypeScript: Use ESLint configuration +- Commit messages: Use conventional commits + +## License + +This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. + +## Acknowledgments + +- OpenAI Whisper team for the amazing speech recognition model +- Ollama team for making LLMs accessible +- All contributors who have helped improve Talk2Me + +## Support + +- **Documentation**: Full docs at [docs.talk2me.app](https://docs.talk2me.app) +- **Issues**: [GitHub Issues](https://github.com/yourusername/talk2me/issues) +- **Discussions**: [GitHub Discussions](https://github.com/yourusername/talk2me/discussions) +- **Security**: Please report security vulnerabilities to security@talk2me.app \ No newline at end of file diff --git a/README_TYPESCRIPT.md b/README_TYPESCRIPT.md deleted file mode 100644 index 1ec408a..0000000 --- a/README_TYPESCRIPT.md +++ /dev/null @@ -1,54 +0,0 @@ -# TypeScript Setup for Talk2Me - -This project now includes TypeScript support for better type safety and developer experience. - -## Installation - -1. Install Node.js dependencies: -```bash -npm install -``` - -2. 
Build TypeScript files: -```bash -npm run build -``` - -## Development - -For development with automatic recompilation: -```bash -npm run watch -# or -npm run dev -``` - -## Project Structure - -- `/static/js/src/` - TypeScript source files - - `app.ts` - Main application logic - - `types.ts` - Type definitions -- `/static/js/dist/` - Compiled JavaScript files (git-ignored) -- `tsconfig.json` - TypeScript configuration -- `package.json` - Node.js dependencies and scripts - -## Available Scripts - -- `npm run build` - Compile TypeScript to JavaScript -- `npm run watch` - Watch for changes and recompile -- `npm run dev` - Same as watch -- `npm run clean` - Remove compiled files -- `npm run type-check` - Type-check without compiling - -## Type Safety Benefits - -The TypeScript implementation provides: -- Compile-time type checking -- Better IDE support with autocomplete -- Explicit interface definitions for API responses -- Safer refactoring -- Self-documenting code - -## Next Steps - -After building, the compiled JavaScript will be in `/static/js/dist/app.js` and will be automatically loaded by the HTML template. \ No newline at end of file diff --git a/REQUEST_SIZE_LIMITS.md b/REQUEST_SIZE_LIMITS.md deleted file mode 100644 index b97481e..0000000 --- a/REQUEST_SIZE_LIMITS.md +++ /dev/null @@ -1,332 +0,0 @@ -# Request Size Limits Documentation - -This document describes the request size limiting system implemented in Talk2Me to prevent memory exhaustion from large uploads. 
- -## Overview - -Talk2Me implements comprehensive request size limiting to protect against: -- Memory exhaustion from large file uploads -- Denial of Service (DoS) attacks using oversized requests -- Buffer overflow attempts -- Resource starvation from unbounded requests - -## Default Limits - -### Global Limits -- **Maximum Content Length**: 50MB - Absolute maximum for any request -- **Maximum Audio File Size**: 25MB - For audio uploads (transcription) -- **Maximum JSON Payload**: 1MB - For API requests -- **Maximum Image Size**: 10MB - For future image processing features -- **Maximum Chunk Size**: 1MB - For streaming uploads - -## Features - -### 1. Multi-Layer Protection - -The system implements multiple layers of size checking: -- Flask's built-in `MAX_CONTENT_LENGTH` configuration -- Pre-request validation before data is loaded into memory -- File-type specific limits -- Endpoint-specific limits -- Streaming request monitoring - -### 2. File Type Detection - -Automatic detection and enforcement based on file extensions: -- Audio files: `.wav`, `.mp3`, `.ogg`, `.webm`, `.m4a`, `.flac`, `.aac` -- Image files: `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp` -- JSON payloads: Content-Type header detection - -### 3. 
Graceful Error Handling - -When limits are exceeded: -- Returns 413 (Request Entity Too Large) status code -- Provides clear error messages with size information -- Includes both actual and allowed sizes -- Human-readable size formatting - -## Configuration - -### Environment Variables - -```bash -# Set limits via environment variables (in bytes) -export MAX_CONTENT_LENGTH=52428800 # 50MB -export MAX_AUDIO_SIZE=26214400 # 25MB -export MAX_JSON_SIZE=1048576 # 1MB -export MAX_IMAGE_SIZE=10485760 # 10MB -``` - -### Flask Configuration - -```python -# In config.py or app.py -app.config.update({ - 'MAX_CONTENT_LENGTH': 50 * 1024 * 1024, # 50MB - 'MAX_AUDIO_SIZE': 25 * 1024 * 1024, # 25MB - 'MAX_JSON_SIZE': 1 * 1024 * 1024, # 1MB - 'MAX_IMAGE_SIZE': 10 * 1024 * 1024 # 10MB -}) -``` - -### Dynamic Configuration - -Size limits can be updated at runtime via admin API. - -## API Endpoints - -### GET /admin/size-limits -Get current size limits. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/size-limits -``` - -Response: -```json -{ - "limits": { - "max_content_length": 52428800, - "max_audio_size": 26214400, - "max_json_size": 1048576, - "max_image_size": 10485760 - }, - "limits_human": { - "max_content_length": "50.0MB", - "max_audio_size": "25.0MB", - "max_json_size": "1.0MB", - "max_image_size": "10.0MB" - } -} -``` - -### POST /admin/size-limits -Update size limits dynamically. - -```bash -curl -X POST -H "X-Admin-Token: your-token" \ - -H "Content-Type: application/json" \ - -d '{"max_audio_size": "30MB", "max_json_size": 2097152}' \ - http://localhost:5005/admin/size-limits -``` - -Response: -```json -{ - "success": true, - "old_limits": {...}, - "new_limits": {...}, - "new_limits_human": { - "max_audio_size": "30.0MB", - "max_json_size": "2.0MB" - } -} -``` - -## Usage Examples - -### 1. 
Endpoint-Specific Limits - -```python -@app.route('/upload', methods=['POST']) # uploads arrive as POST; Flask routes default to GET only -@limit_request_size(max_size=10*1024*1024) # 10MB limit -def upload(): - # Handle upload - pass - -@app.route('/upload-audio', methods=['POST']) -@limit_request_size(max_audio_size=30*1024*1024) # 30MB for audio -def upload_audio(): - # Handle audio upload - pass -``` - -### 2. Client-Side Validation - -```javascript -// Check file size before upload -const MAX_AUDIO_SIZE = 25 * 1024 * 1024; // 25MB - -function validateAudioFile(file) { - if (file.size > MAX_AUDIO_SIZE) { - alert(`Audio file too large. Maximum size is ${MAX_AUDIO_SIZE / 1024 / 1024}MB`); - return false; - } - return true; -} -``` - -### 3. Chunked Uploads (Future Enhancement) - -```javascript -// For files larger than limits, use chunked upload -async function uploadLargeFile(file, chunkSize = 1024 * 1024) { - const chunks = Math.ceil(file.size / chunkSize); - - for (let i = 0; i < chunks; i++) { - const start = i * chunkSize; - const end = Math.min(start + chunkSize, file.size); - const chunk = file.slice(start, end); - - await uploadChunk(chunk, i, chunks); - } -} -``` - -## Error Responses - -### 413 Request Entity Too Large - -When a request exceeds size limits: - -```json -{ - "error": "Request too large", - "max_size": 52428800, - "your_size": 75000000, - "max_size_mb": 50.0 -} -``` - -### File-Specific Errors - -For audio files: -```json -{ - "error": "Audio file too large", - "max_size": 26214400, - "your_size": 35000000, - "max_size_mb": 25.0 -} -``` - -For JSON payloads: -```json -{ - "error": "JSON payload too large", - "max_size": 1048576, - "your_size": 2000000, - "max_size_kb": 1024.0 -} -``` - -## Best Practices - -### 1. 
Client-Side Validation - -Always validate file sizes on the client side: -```javascript -// Add to static/js/app.js -const SIZE_LIMITS = { - audio: 25 * 1024 * 1024, // 25MB - json: 1 * 1024 * 1024, // 1MB -}; - -function checkFileSize(file, type) { - const limit = SIZE_LIMITS[type]; - if (file.size > limit) { - showError(`File too large. Maximum size: ${formatSize(limit)}`); - return false; - } - return true; -} -``` - -### 2. Progressive Enhancement - -For better UX with large files: -- Show upload progress -- Implement resumable uploads -- Compress audio client-side when possible -- Use appropriate audio formats (WebM/Opus for smaller sizes) - -### 3. Server Configuration - -Configure your web server (Nginx/Apache) to also enforce limits: - -**Nginx:** -```nginx -client_max_body_size 50M; -client_body_buffer_size 1M; -``` - -**Apache:** -```apache -LimitRequestBody 52428800 -``` - -### 4. Monitoring - -Monitor size limit violations: -- Track 413 errors in logs -- Alert on repeated violations from same IP -- Adjust limits based on usage patterns - -## Security Considerations - -1. **Memory Protection**: Pre-flight size checks prevent loading large files into memory -2. **DoS Prevention**: Limits prevent attackers from exhausting server resources -3. **Bandwidth Protection**: Prevents bandwidth exhaustion from large uploads -4. **Storage Protection**: Works with session management to limit total storage per user - -## Integration with Other Systems - -### Rate Limiting -Size limits work in conjunction with rate limiting: -- Large requests count more against rate limits -- Repeated size violations can trigger IP blocking - -### Session Management -Size limits are enforced per session: -- Total storage per session is limited -- Large files count against session resource limits - -### Monitoring -Size limit violations are tracked in: -- Application logs -- Health check endpoints -- Admin monitoring dashboards - -## Troubleshooting - -### Common Issues - -#### 1. 
Legitimate Large Files Rejected - -If users need to upload larger files: -```bash -# Increase limit for audio files to 50MB -curl -X POST -H "X-Admin-Token: token" \ - -H "Content-Type: application/json" \ - -d '{"max_audio_size": "50MB"}' \ - http://localhost:5005/admin/size-limits -``` - -#### 2. Chunked Transfer Encoding - -For requests without Content-Length header: -- The system monitors the stream -- Terminates connection if size exceeded -- May require special handling for some clients - -#### 3. Load Balancer Limits - -Ensure your load balancer also enforces appropriate limits: -- AWS ALB: Configure request size limits -- Cloudflare: Set upload size limits -- Nginx: Configure client_max_body_size - -## Performance Impact - -The size limiting system has minimal performance impact: -- Pre-flight checks are O(1) operations -- No buffering of large requests -- Early termination of oversized requests -- Efficient memory usage - -## Future Enhancements - -1. **Chunked Upload Support**: Native support for resumable uploads -2. **Compression Detection**: Automatic handling of compressed uploads -3. **Dynamic Limits**: Per-user or per-tier size limits -4. **Bandwidth Throttling**: Rate limit large uploads -5. **Storage Quotas**: Long-term storage limits per user \ No newline at end of file diff --git a/SECRETS_MANAGEMENT.md b/SECRETS_MANAGEMENT.md deleted file mode 100644 index b4b0ce7..0000000 --- a/SECRETS_MANAGEMENT.md +++ /dev/null @@ -1,411 +0,0 @@ -# Secrets Management Documentation - -This document describes the secure secrets management system implemented in Talk2Me. - -## Overview - -Talk2Me uses a comprehensive secrets management system that provides: -- Encrypted storage of sensitive configuration -- Secret rotation capabilities -- Audit logging -- Integrity verification -- CLI management tools -- Environment variable integration - -## Architecture - -### Components - -1. 
**SecretsManager** (`secrets_manager.py`) - - Handles encryption/decryption using Fernet (AES-128) - - Manages secret lifecycle (create, read, update, delete) - - Provides audit logging - - Supports secret rotation - -2. **Configuration System** (`config.py`) - - Integrates secrets with Flask configuration - - Environment-specific configurations - - Validation and sanitization - -3. **CLI Tool** (`manage_secrets.py`) - - Command-line interface for secret management - - Interactive and scriptable - -### Security Features - -- **Encryption**: AES-128 encryption using cryptography.fernet -- **Key Derivation**: PBKDF2 with SHA256 (100,000 iterations) -- **Master Key**: Stored separately with restricted permissions -- **Audit Trail**: All access and modifications logged -- **Integrity Checks**: Verify secrets haven't been tampered with - -## Quick Start - -### 1. Initialize Secrets - -```bash -python manage_secrets.py init -``` - -This will: -- Generate a master encryption key -- Create initial secrets (Flask secret key, admin token) -- Prompt for required secrets (TTS API key) - -### 2. Set a Secret - -```bash -# Interactive (hidden input) -python manage_secrets.py set TTS_API_KEY - -# Direct (be careful with shell history) -python manage_secrets.py set TTS_API_KEY --value "your-api-key" - -# With metadata -python manage_secrets.py set API_KEY --value "key" --metadata '{"service": "external-api"}' -``` - -### 3. List Secrets - -```bash -python manage_secrets.py list -``` - -Output: -``` -Key Created Last Rotated Has Value -------------------------------------------------------------------------------------- -FLASK_SECRET_KEY 2024-01-15 2024-01-20 ✓ -TTS_API_KEY 2024-01-15 Never ✓ -ADMIN_TOKEN 2024-01-15 2024-01-18 ✓ -``` - -### 4. 
Rotate Secrets - -```bash -# Rotate a specific secret -python manage_secrets.py rotate ADMIN_TOKEN - -# Check which secrets need rotation -python manage_secrets.py check-rotation - -# Schedule automatic rotation -python manage_secrets.py schedule-rotation API_KEY 30 # Every 30 days -``` - -## Configuration - -### Environment Variables - -The secrets manager checks these locations in order: -1. Encrypted secrets storage (`.secrets.json`) -2. `SECRET_` environment variable -3. `` environment variable -4. Default value - -### Master Key - -The master encryption key is loaded from: -1. `MASTER_KEY` environment variable -2. `.master_key` file (default) -3. Auto-generated if neither exists - -**Important**: Protect the master key! -- Set file permissions: `chmod 600 .master_key` -- Back it up securely -- Never commit to version control - -### Flask Integration - -Secrets are automatically loaded into Flask configuration: - -```python -# In app.py -from config import init_app as init_config -from secrets_manager import init_app as init_secrets - -app = Flask(__name__) -init_config(app) -init_secrets(app) - -# Access secrets -api_key = app.config['TTS_API_KEY'] -``` - -## CLI Commands - -### Basic Operations - -```bash -# List all secrets -python manage_secrets.py list - -# Get a secret value (requires confirmation) -python manage_secrets.py get TTS_API_KEY - -# Set a secret -python manage_secrets.py set DATABASE_URL - -# Delete a secret -python manage_secrets.py delete OLD_API_KEY - -# Rotate a secret -python manage_secrets.py rotate ADMIN_TOKEN -``` - -### Advanced Operations - -```bash -# Verify integrity of all secrets -python manage_secrets.py verify - -# Migrate from environment variables -python manage_secrets.py migrate - -# View audit log -python manage_secrets.py audit -python manage_secrets.py audit TTS_API_KEY --limit 50 - -# Schedule rotation -python manage_secrets.py schedule-rotation API_KEY 90 -``` - -## Security Best Practices - -### 1. 
File Permissions - -```bash -# Secure the secrets files -chmod 600 .secrets.json -chmod 600 .master_key -``` - -### 2. Backup Strategy - -- Back up `.master_key` separately from `.secrets.json` -- Store backups in different secure locations -- Test restore procedures regularly - -### 3. Rotation Policy - -Recommended rotation intervals: -- API Keys: 90 days -- Admin Tokens: 30 days -- Database Passwords: 180 days -- Encryption Keys: 365 days - -### 4. Access Control - -- Use environment-specific secrets -- Implement least privilege access -- Audit secret access regularly - -### 5. Git Security - -Ensure these files are in `.gitignore`: -``` -.secrets.json -.master_key -secrets.db -*.key -``` - -## Deployment - -### Development - -```bash -# Use .env file for convenience -cp .env.example .env -# Edit .env with development values - -# Initialize secrets -python manage_secrets.py init -``` - -### Production - -```bash -# Set master key via environment -export MASTER_KEY="your-production-master-key" - -# Or use a key management service -export MASTER_KEY_FILE="/secure/path/to/master.key" - -# Load secrets from secure storage -python manage_secrets.py set TTS_API_KEY --value "$TTS_API_KEY" -python manage_secrets.py set ADMIN_TOKEN --value "$ADMIN_TOKEN" -``` - -### Docker - -```dockerfile -# Dockerfile -FROM python:3.9 - -# Copy encrypted secrets (not the master key!) 
-COPY .secrets.json /app/.secrets.json - -# Master key provided at runtime -ENV MASTER_KEY="" - -# Run with: -# docker run -e MASTER_KEY="$MASTER_KEY" myapp -``` - -### Kubernetes - -```yaml -# secret.yaml -apiVersion: v1 -kind: Secret -metadata: - name: talk2me-master-key -type: Opaque -stringData: - master-key: "your-master-key" - ---- -# deployment.yaml -apiVersion: apps/v1 -kind: Deployment -spec: - template: - spec: - containers: - - name: talk2me - env: - - name: MASTER_KEY - valueFrom: - secretKeyRef: - name: talk2me-master-key - key: master-key -``` - -## Troubleshooting - -### Lost Master Key - -If you lose the master key: -1. You'll need to recreate all secrets -2. Generate new master key: `python manage_secrets.py init` -3. Re-enter all secret values - -### Corrupted Secrets File - -```bash -# Check integrity -python manage_secrets.py verify - -# If corrupted, restore from backup or reinitialize -``` - -### Permission Errors - -```bash -# Fix file permissions -chmod 600 .secrets.json .master_key -chown $USER:$USER .secrets.json .master_key -``` - -## Monitoring - -### Audit Logs - -Review secret access patterns: -```bash -# View all audit entries -python manage_secrets.py audit - -# Check specific secret -python manage_secrets.py audit TTS_API_KEY - -# Export for analysis -python manage_secrets.py audit > audit.log -``` - -### Rotation Monitoring - -```bash -# Check rotation status -python manage_secrets.py check-rotation - -# Set up cron job for automatic checks -0 0 * * * /path/to/python /path/to/manage_secrets.py check-rotation -``` - -## Migration Guide - -### From Environment Variables - -```bash -# Automatic migration -python manage_secrets.py migrate - -# Manual migration -export OLD_API_KEY="your-key" -python manage_secrets.py set API_KEY --value "$OLD_API_KEY" -unset OLD_API_KEY -``` - -### From .env Files - -```python -# migrate_env.py -from dotenv import dotenv_values -from secrets_manager import get_secrets_manager - -env_values = 
dotenv_values('.env') -manager = get_secrets_manager() - -for key, value in env_values.items(): - if key.endswith('_KEY') or key.endswith('_TOKEN'): - manager.set(key, value, {'migrated_from': '.env'}) -``` - -## API Reference - -### Python API - -```python -from secrets_manager import get_secret, set_secret - -# Get a secret -api_key = get_secret('TTS_API_KEY', default='') - -# Set a secret -set_secret('NEW_API_KEY', 'value', metadata={'service': 'external'}) - -# Advanced usage -from secrets_manager import get_secrets_manager - -manager = get_secrets_manager() -manager.rotate('API_KEY') -manager.schedule_rotation('TOKEN', days=30) -``` - -### Flask CLI - -```bash -# Via Flask CLI -flask secrets-list -flask secrets-set -flask secrets-rotate -flask secrets-check-rotation -``` - -## Security Considerations - -1. **Never log secret values** -2. **Use secure random generation for new secrets** -3. **Implement proper access controls** -4. **Regular security audits** -5. **Incident response plan for compromised secrets** - -## Future Enhancements - -- Integration with cloud KMS (AWS, Azure, GCP) -- Hardware security module (HSM) support -- Secret sharing (Shamir's Secret Sharing) -- Time-based access controls -- Automated compliance reporting \ No newline at end of file diff --git a/SECURITY.md b/SECURITY.md deleted file mode 100644 index 8513d3d..0000000 --- a/SECURITY.md +++ /dev/null @@ -1,173 +0,0 @@ -# Security Configuration Guide - -This document outlines security best practices for deploying Talk2Me. - -## Secrets Management - -Talk2Me includes a comprehensive secrets management system with encryption, rotation, and audit logging. 
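
The secrets documentation lists "PBKDF2 with SHA256 (100,000 iterations)" as the key derivation step and Fernet as the cipher. As a rough illustration of how those two pieces can fit together, the sketch below derives a Fernet-format key (32 bytes, URL-safe base64) from a passphrase using only the standard library. The function name `derive_fernet_key`, the 16-byte salt, and the passphrase are assumptions made for this example; whether `secrets_manager.py` derives its key exactly this way is not confirmed by the documentation.

```python
import base64
import hashlib
import os


def derive_fernet_key(passphrase: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and encode it the way Fernet expects.

    Hypothetical helper for illustration; not taken from secrets_manager.py.
    """
    raw = hashlib.pbkdf2_hmac(
        "sha256",                    # hash named in the Security Features list
        passphrase.encode("utf-8"),  # master passphrase (assumed input)
        salt,                        # random salt, stored alongside the ciphertext
        iterations,                  # 100,000 iterations per the documentation
        dklen=32,                    # Fernet keys are 32 raw bytes
    )
    # Fernet keys are the URL-safe base64 encoding of those 32 bytes.
    return base64.urlsafe_b64encode(raw)


# Example: a fresh random salt and an illustrative passphrase.
salt = os.urandom(16)
key = derive_fernet_key("example-master-passphrase", salt)
```

The resulting bytes could be passed to `cryptography.fernet.Fernet(key)`. The salt has to be kept with the encrypted data, since the same passphrase with a different salt yields a different key.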
- -### Quick Start - -```bash -# Initialize secrets management -python manage_secrets.py init - -# Set a secret -python manage_secrets.py set TTS_API_KEY - -# List secrets -python manage_secrets.py list - -# Rotate secrets -python manage_secrets.py rotate ADMIN_TOKEN -``` - -See [SECRETS_MANAGEMENT.md](SECRETS_MANAGEMENT.md) for detailed documentation. - -## Environment Variables - -**NEVER commit sensitive information like API keys, passwords, or secrets to version control.** - -### Required Security Configuration - -1. **TTS_API_KEY** - - Required for TTS server authentication - - Set via environment variable: `export TTS_API_KEY="your-api-key"` - - Or use a `.env` file (see `.env.example`) - -2. **SECRET_KEY** - - Required for Flask session security - - Generate a secure key: `python -c "import secrets; print(secrets.token_hex(32))"` - - Set via: `export SECRET_KEY="your-generated-key"` - -3. **ADMIN_TOKEN** - - Required for admin endpoints - - Generate a secure token: `python -c "import secrets; print(secrets.token_urlsafe(32))"` - - Set via: `export ADMIN_TOKEN="your-admin-token"` - -### Using a .env File (Recommended) - -1. Copy the example file: - ```bash - cp .env.example .env - ``` - -2. Edit `.env` with your actual values: - ```bash - nano .env # or your preferred editor - ``` - -3. Load environment variables: - ```bash - # Using python-dotenv (add to requirements.txt) - pip install python-dotenv - - # Or source manually - source .env - ``` - -### Python-dotenv Integration - -To automatically load `.env` files, add this to the top of `app.py`: - -```python -from dotenv import load_dotenv -load_dotenv() # Load .env file if it exists -``` - -### Production Deployment - -For production deployments: - -1. **Use a secrets management service**: - - AWS Secrets Manager - - HashiCorp Vault - - Azure Key Vault - - Google Secret Manager - -2. 
**Set environment variables securely**: - - Use your platform's environment configuration - - Never expose secrets in logs or error messages - - Rotate keys regularly - -3. **Additional security measures**: - - Use HTTPS only - - Enable CORS restrictions - - Implement rate limiting - - Monitor for suspicious activity - -### Docker Deployment - -When using Docker: - -```dockerfile -# Use build arguments for non-sensitive config -ARG TTS_SERVER_URL=http://localhost:5050/v1/audio/speech - -# Use runtime environment for secrets -ENV TTS_API_KEY="" -``` - -Run with: -```bash -docker run -e TTS_API_KEY="your-key" -e SECRET_KEY="your-secret" talk2me -``` - -### Kubernetes Deployment - -Use Kubernetes secrets: - -```yaml -apiVersion: v1 -kind: Secret -metadata: - name: talk2me-secrets -type: Opaque -stringData: - tts-api-key: "your-api-key" - flask-secret-key: "your-secret-key" - admin-token: "your-admin-token" -``` - -### Rate Limiting - -Talk2Me implements comprehensive rate limiting to prevent abuse: - -1. **Per-Endpoint Limits**: - - Transcription: 10/min, 100/hour - - Translation: 20/min, 300/hour - - TTS: 15/min, 200/hour - -2. **Global Limits**: - - 1,000 requests/minute total - - 50 concurrent requests maximum - -3. **Automatic Protection**: - - IP blocking for excessive requests - - Request size validation - - Burst control - -See [RATE_LIMITING.md](RATE_LIMITING.md) for configuration details. 
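
To make the per-endpoint limits above concrete, here is a minimal in-memory sliding-window sketch that enforces a per-minute and a per-hour cap for each client and endpoint. The class name `SlidingWindowLimiter`, the `LIMITS` table layout, and the `allow()` signature are hypothetical and chosen for this example only; Talk2Me's actual limiter is described in RATE_LIMITING.md and may work differently.

```python
import time
from collections import defaultdict, deque

# Per-endpoint limits from the list above: (requests per minute, requests per hour).
LIMITS = {
    "transcription": (10, 100),
    "translation": (20, 300),
    "tts": (15, 200),
}


class SlidingWindowLimiter:
    """Hypothetical in-memory limiter; a sketch, not Talk2Me's implementation."""

    def __init__(self, limits=LIMITS):
        self.limits = limits
        # (client_ip, endpoint) -> timestamps of accepted requests, oldest first
        self.history = defaultdict(deque)

    def allow(self, client_ip, endpoint, now=None):
        """Return True and record the request if both windows have room."""
        now = time.monotonic() if now is None else now
        per_minute, per_hour = self.limits[endpoint]
        stamps = self.history[(client_ip, endpoint)]

        # Drop timestamps that have left the one-hour window.
        while stamps and now - stamps[0] >= 3600:
            stamps.popleft()

        in_last_minute = sum(1 for t in stamps if now - t < 60)
        if in_last_minute >= per_minute or len(stamps) >= per_hour:
            return False  # rejected requests are not recorded

        stamps.append(now)
        return True
```

In a Flask app this check would typically run in a `before_request` hook and return HTTP 429 when `allow()` is false. The sketch keeps state in process memory, so it would not share counters across multiple workers.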
- -### Security Checklist - -- [ ] All API keys removed from source code -- [ ] Environment variables configured -- [ ] `.env` file added to `.gitignore` -- [ ] Secrets rotated after any potential exposure -- [ ] HTTPS enabled in production -- [ ] CORS properly configured -- [ ] Rate limiting enabled and configured -- [ ] Admin endpoints protected with authentication -- [ ] Error messages don't expose sensitive info -- [ ] Logs sanitized of sensitive data -- [ ] Request size limits enforced -- [ ] IP blocking configured for abuse prevention - -### Reporting Security Issues - -If you discover a security vulnerability, please report it to: -- Create a private security advisory on GitHub -- Or email: security@yourdomain.com - -Do not create public issues for security vulnerabilities. \ No newline at end of file diff --git a/SESSION_MANAGEMENT.md b/SESSION_MANAGEMENT.md deleted file mode 100644 index 1897a01..0000000 --- a/SESSION_MANAGEMENT.md +++ /dev/null @@ -1,366 +0,0 @@ -# Session Management Documentation - -This document describes the session management system implemented in Talk2Me to prevent resource leaks from abandoned sessions. - -## Overview - -Talk2Me implements a comprehensive session management system that tracks user sessions and associated resources (audio files, temporary files, streams) to ensure proper cleanup and prevent resource exhaustion. - -## Features - -### 1. Automatic Resource Tracking - -All resources created during a user session are automatically tracked: -- Audio files (uploads and generated) -- Temporary files -- Active streams -- Resource metadata (size, creation time, purpose) - -### 2. Resource Limits - -Per-session limits prevent resource exhaustion: -- Maximum resources per session: 100 -- Maximum storage per session: 100MB -- Automatic cleanup of oldest resources when limits are reached - -### 3. 
Session Lifecycle Management - -Sessions are automatically managed: -- Created on first request -- Updated on each request -- Cleaned up when idle (15 minutes) -- Removed when expired (1 hour) - -### 4. Automatic Cleanup - -Background cleanup processes run automatically: -- Idle session cleanup (every minute) -- Expired session cleanup (every minute) -- Orphaned file cleanup (every minute) - -## Configuration - -Session management can be configured via environment variables or Flask config: - -```python -# app.py or config.py -app.config.update({ - 'MAX_SESSION_DURATION': 3600, # 1 hour - 'MAX_SESSION_IDLE_TIME': 900, # 15 minutes - 'MAX_RESOURCES_PER_SESSION': 100, - 'MAX_BYTES_PER_SESSION': 104857600, # 100MB - 'SESSION_CLEANUP_INTERVAL': 60, # 1 minute - 'SESSION_STORAGE_PATH': '/path/to/sessions' -}) -``` - -## API Endpoints - -### Admin Endpoints - -All admin endpoints require authentication via `X-Admin-Token` header. - -#### GET /admin/sessions -Get information about all active sessions. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions -``` - -Response: -```json -{ - "sessions": [ - { - "session_id": "uuid", - "user_id": null, - "ip_address": "192.168.1.1", - "created_at": "2024-01-15T10:00:00", - "last_activity": "2024-01-15T10:05:00", - "duration_seconds": 300, - "idle_seconds": 0, - "request_count": 5, - "resource_count": 3, - "total_bytes_used": 1048576, - "resources": [...] - } - ], - "stats": { - "total_sessions_created": 100, - "total_sessions_cleaned": 50, - "active_sessions": 5, - "avg_session_duration": 600, - "avg_resources_per_session": 4.2 - } -} -``` - -#### GET /admin/sessions/{session_id} -Get detailed information about a specific session. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/abc123 -``` - -#### POST /admin/sessions/{session_id}/cleanup -Manually cleanup a specific session. 
- -```bash -curl -X POST -H "X-Admin-Token: your-token" \ - http://localhost:5005/admin/sessions/abc123/cleanup -``` - -#### GET /admin/sessions/metrics -Get session management metrics for monitoring. - -```bash -curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/metrics -``` - -Response: -```json -{ - "sessions": { - "active": 5, - "total_created": 100, - "total_cleaned": 95 - }, - "resources": { - "active": 20, - "total_cleaned": 380, - "active_bytes": 10485760, - "total_bytes_cleaned": 1073741824 - }, - "limits": { - "max_session_duration": 3600, - "max_idle_time": 900, - "max_resources_per_session": 100, - "max_bytes_per_session": 104857600 - } -} -``` - -## CLI Commands - -Session management can be controlled via Flask CLI commands: - -```bash -# List all active sessions -flask sessions-list - -# Manual cleanup -flask sessions-cleanup - -# Show statistics -flask sessions-stats -``` - -## Usage Examples - -### 1. Monitor Active Sessions - -```python -import requests - -headers = {'X-Admin-Token': 'your-admin-token'} -response = requests.get('http://localhost:5005/admin/sessions', headers=headers) -sessions = response.json() - -for session in sessions['sessions']: - print(f"Session {session['session_id']}:") - print(f" IP: {session['ip_address']}") - print(f" Resources: {session['resource_count']}") - print(f" Storage: {session['total_bytes_used'] / 1024 / 1024:.2f} MB") -``` - -### 2. Cleanup Idle Sessions - -```python -# Get all sessions -response = requests.get('http://localhost:5005/admin/sessions', headers=headers) -sessions = response.json()['sessions'] - -# Find idle sessions -idle_threshold = 300 # 5 minutes -for session in sessions: - if session['idle_seconds'] > idle_threshold: - # Cleanup idle session - cleanup_url = f'http://localhost:5005/admin/sessions/{session["session_id"]}/cleanup' - requests.post(cleanup_url, headers=headers) - print(f"Cleaned up idle session {session['session_id']}") -``` - -### 3. 
Monitor Resource Usage - -```python -# Get metrics -response = requests.get('http://localhost:5005/admin/sessions/metrics', headers=headers) -metrics = response.json() - -print(f"Active sessions: {metrics['sessions']['active']}") -print(f"Active resources: {metrics['resources']['active']}") -print(f"Storage used: {metrics['resources']['active_bytes'] / 1024 / 1024:.2f} MB") -print(f"Total cleaned: {metrics['resources']['total_bytes_cleaned'] / 1024 / 1024 / 1024:.2f} GB") -``` - -## Resource Types - -The session manager tracks different types of resources: - -### 1. Audio Files -- Uploaded audio files for transcription -- Generated audio files from TTS -- Automatically cleaned up after session ends - -### 2. Temporary Files -- Processing intermediates -- Cache files -- Automatically cleaned up after use - -### 3. Streams -- WebSocket connections -- Server-sent event streams -- Closed when session ends - -## Best Practices - -### 1. Session Configuration - -```python -# Development -app.config.update({ - 'MAX_SESSION_DURATION': 7200, # 2 hours - 'MAX_SESSION_IDLE_TIME': 1800, # 30 minutes - 'MAX_RESOURCES_PER_SESSION': 200, - 'MAX_BYTES_PER_SESSION': 209715200 # 200MB -}) - -# Production -app.config.update({ - 'MAX_SESSION_DURATION': 3600, # 1 hour - 'MAX_SESSION_IDLE_TIME': 900, # 15 minutes - 'MAX_RESOURCES_PER_SESSION': 100, - 'MAX_BYTES_PER_SESSION': 104857600 # 100MB -}) -``` - -### 2. Monitoring - -Set up monitoring for: -- Number of active sessions -- Resource usage per session -- Cleanup frequency -- Failed cleanup attempts - -### 3. Alerting - -Configure alerts for: -- High number of active sessions (>1000) -- High resource usage (>80% of limits) -- Failed cleanup operations -- Orphaned files detected - -## Troubleshooting - -### Common Issues - -#### 1. Sessions Not Being Cleaned Up - -Check cleanup thread status: -```bash -flask sessions-stats -``` - -Manual cleanup: -```bash -flask sessions-cleanup -``` - -#### 2. 
Resource Limits Reached - -Check session details: -```bash -curl -H "X-Admin-Token: token" http://localhost:5005/admin/sessions/SESSION_ID -``` - -Increase limits if needed: -```python -app.config['MAX_RESOURCES_PER_SESSION'] = 200 -app.config['MAX_BYTES_PER_SESSION'] = 209715200 # 200MB -``` - -#### 3. Orphaned Files - -Check for orphaned files: -```bash -ls -la /path/to/session/storage/ -``` - -Clean orphaned files: -```bash -flask sessions-cleanup -``` - -### Debug Logging - -Enable debug logging for session management: - -```python -import logging - -# Enable session manager debug logs -logging.getLogger('session_manager').setLevel(logging.DEBUG) -``` - -## Security Considerations - -1. **Session Hijacking**: Sessions are tied to IP addresses and user agents -2. **Resource Exhaustion**: Strict per-session limits prevent DoS attacks -3. **File System Access**: Session storage uses secure paths and permissions -4. **Admin Access**: All admin endpoints require authentication - -## Performance Impact - -The session management system has minimal performance impact: -- Memory: ~1KB per session + resource metadata -- CPU: Background cleanup runs every minute -- Disk I/O: Cleanup operations are batched -- Network: No external dependencies - -## Integration with Other Systems - -### Rate Limiting - -Session management integrates with rate limiting: -```python -# Sessions are automatically tracked per IP -# Rate limits apply per session -``` - -### Secrets Management - -Session tokens can be encrypted: -```python -from secrets_manager import encrypt_value -encrypted_session = encrypt_value(session_id) -``` - -### Monitoring - -Export metrics to monitoring systems: -```python -# Prometheus format -@app.route('/metrics') -def prometheus_metrics(): - metrics = app.session_manager.export_metrics() - # Format as Prometheus metrics - return format_prometheus(metrics) -``` - -## Future Enhancements - -1. **Session Persistence**: Store sessions in Redis/database -2. 
**Distributed Sessions**: Support for multi-server deployments -3. **Session Analytics**: Track usage patterns and trends -4. **Resource Quotas**: Per-user resource quotas -5. **Session Replay**: Debug issues by replaying sessions \ No newline at end of file