3 Commits

b5f2b53262 Housekeeping: Remove unnecessary test and temporary files
- Removed test scripts (test_*.py, tts-debug-script.py)
- Removed test output files (tts_test_output.mp3, test-cors.html)
- Removed redundant static/js/app.js (using TypeScript dist/ instead)
- Removed outdated setup-script.sh
- Removed Python cache directory (__pycache__)
- Removed Claude IDE local settings (.claude/)
- Updated .gitignore with better patterns for:
  - Test files
  - Debug scripts
  - Claude IDE settings
  - Standalone compiled JS

This cleanup reduces repository size and removes temporary/debug files
that shouldn't be version controlled.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 09:24:44 -06:00
bcbac5c8b3 Add multi-GPU support for Docker deployments
- Created separate docker-compose override files for different GPU types:
  - docker-compose.nvidia.yml for NVIDIA GPUs
  - docker-compose.amd.yml for AMD GPUs with ROCm
  - docker-compose.apple.yml for Apple Silicon
- Updated README with GPU-specific Docker configurations
- Updated deployment instructions to use appropriate override files
- Added detailed configurations for each GPU type including:
  - Device mappings and drivers
  - Environment variables
  - Platform specifications
  - Memory and resource limits

This allows users to easily deploy Talk2Me with their specific GPU hardware.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 09:16:41 -06:00
e5333d8410 Consolidate all documentation into comprehensive README
- Merged 12 separate documentation files into single README.md
- Organized content with clear table of contents
- Maintained all technical details and examples
- Improved overall documentation structure and flow
- Removed redundant separate documentation files

The new README provides a complete guide covering:
- Installation and configuration
- Security features (rate limiting, secrets, sessions)
- Production deployment with Docker/Nginx
- API documentation
- Development guidelines
- Monitoring and troubleshooting

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 09:10:58 -06:00
25 changed files with 797 additions and 6290 deletions

.gitignore

@@ -67,3 +67,15 @@ vapid_public.pem
.master_key
secrets.db
*.key
# Test files
test_*.py
*_test_output.*
test-*.html
*-debug-script.py
# Claude IDE
.claude/
# Standalone compiled JS (use dist/ instead)
static/js/app.js


@@ -1,173 +0,0 @@
# Connection Retry Logic Documentation
This document explains the connection retry and network interruption handling features in Talk2Me.
## Overview
Talk2Me implements robust connection retry logic to handle network interruptions gracefully. When a connection is lost or a request fails due to network issues, the application automatically queues requests and retries them when the connection is restored.
## Features
### 1. Automatic Connection Monitoring
- Monitors browser online/offline events
- Periodic health checks to the server (every 5 seconds when offline)
- Visual connection status indicator
- Automatic detection when returning from sleep/hibernation
### 2. Request Queuing
- Failed requests are automatically queued during network interruptions
- Requests maintain their priority and are processed in order
- Queue persists across connection failures
- Visual indication of queued requests
### 3. Exponential Backoff Retry
- Failed requests are retried with exponential backoff
- Initial retry delay: 1 second
- Maximum retry delay: 30 seconds
- Backoff multiplier: 2x
- Maximum retries: 3 attempts
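Concretely, the schedule above works out as follows (a minimal sketch of the arithmetic, not the actual client code, which lives in `connectionManager.ts`):

```python
def backoff_delays(initial_ms: int = 1000, max_ms: int = 30000,
                   multiplier: float = 2.0, max_retries: int = 3) -> list[int]:
    """Return the retry delay in ms for each attempt, capped at max_ms."""
    return [min(int(initial_ms * multiplier ** attempt), max_ms)
            for attempt in range(max_retries)]

print(backoff_delays())  # [1000, 2000, 4000]
```

With the defaults, a request is retried after 1s, 2s, and 4s; longer schedules flatten out at the 30-second cap.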
### 4. Connection Status UI
- Real-time connection status indicator (bottom-right corner)
- Offline banner with retry button
- Queue status showing pending requests by type
- Temporary status messages for important events
## User Experience
### When Connection is Lost
1. **Visual Indicators**:
- Connection status shows "Offline" or "Connection error"
- Red banner appears at top of screen
- Queued request count is displayed
2. **Request Handling**:
- New requests are automatically queued
- User sees "Connection error - queued" message
- Requests will be sent when connection returns
3. **Manual Retry**:
- Users can click "Retry" button in offline banner
- Forces immediate connection check
### When Connection is Restored
1. **Automatic Recovery**:
- Connection status changes to "Connecting..."
- Queued requests are processed automatically
- Success message shown briefly
2. **Request Processing**:
- Queued requests maintain their order
- Higher priority requests (transcription) processed first
- Progress indicators show processing status
## Configuration
The connection retry logic can be configured programmatically:
```javascript
// In app.ts or initialization code
connectionManager.configure({
  maxRetries: 3,             // Maximum retry attempts
  initialDelay: 1000,        // Initial retry delay (ms)
  maxDelay: 30000,           // Maximum retry delay (ms)
  backoffMultiplier: 2,      // Exponential backoff multiplier
  timeout: 10000,            // Request timeout (ms)
  onlineCheckInterval: 5000  // Health check interval (ms)
});
```
## Request Priority
Requests are prioritized as follows:
1. **Transcription** (Priority: 8) - Highest priority
2. **Translation** (Priority: 5) - Normal priority
3. **TTS/Audio** (Priority: 3) - Lower priority
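The ordering rule can be sketched with a standard max-priority queue (illustrative Python, not the TypeScript `RequestQueueManager` itself):

```python
import heapq
from itertools import count

class PriorityQueue:
    """Higher numeric priority is served first; FIFO within a priority."""
    def __init__(self):
        self._heap, self._counter = [], count()

    def enqueue(self, priority: int, item):
        # Negate the priority: heapq is a min-heap, we want max-priority first.
        heapq.heappush(self._heap, (-priority, next(self._counter), item))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = PriorityQueue()
q.enqueue(3, "tts")
q.enqueue(8, "transcription")
q.enqueue(5, "translation")
print([q.dequeue() for _ in range(3)])  # ['transcription', 'translation', 'tts']
```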
## Error Types
### Retryable Errors
- Network errors
- Connection timeouts
- Server errors (5xx)
- CORS errors (in some cases)
### Non-Retryable Errors
- Client errors (4xx)
- Authentication errors
- Rate limit errors
- Invalid request errors
## Best Practices
1. **For Users**:
- Wait for queued requests to complete before closing the app
- Use the manual retry button if automatic recovery fails
- Check the connection status indicator for current state
2. **For Developers**:
- All fetch requests should go through RequestQueueManager
- Use appropriate request priorities
- Handle both online and offline scenarios in UI
- Provide clear feedback about connection status
## Technical Implementation
### Key Components
1. **ConnectionManager** (`connectionManager.ts`):
- Monitors connection state
- Implements retry logic with exponential backoff
- Provides connection state subscriptions
2. **RequestQueueManager** (`requestQueue.ts`):
- Queues failed requests
- Integrates with ConnectionManager
- Handles request prioritization
3. **ConnectionUI** (`connectionUI.ts`):
- Displays connection status
- Shows offline banner
- Updates queue information
### Integration Example
```typescript
// Automatic integration through RequestQueueManager
const queue = RequestQueueManager.getInstance();
const data = await queue.enqueue<ResponseType>(
  'translate', // Request type
  async () => {
    // Your fetch request
    const response = await fetch('/api/translate', options);
    return response.json();
  },
  5 // Priority (1-10, higher = more important)
);
```
## Troubleshooting
### Connection Not Detected
- Check browser permissions for network status
- Ensure health endpoint (/health) is accessible
- Verify no firewall/proxy blocking
### Requests Not Retrying
- Check browser console for errors
- Verify request type is retryable
- Check if max retries exceeded
### Queue Not Processing
- Manually trigger retry with button
- Check if requests are timing out
- Verify server is responding
## Future Enhancements
- Persistent queue storage (survive page refresh)
- Configurable retry strategies per request type
- Network speed detection and adaptation
- Progressive web app offline mode


@@ -1,152 +0,0 @@
# CORS Configuration Guide
This document explains how to configure Cross-Origin Resource Sharing (CORS) for the Talk2Me application.
## Overview
CORS is configured using Flask-CORS to enable secure cross-origin usage of the API endpoints. This allows the Talk2Me application to be embedded in other websites or accessed from different domains while maintaining security.
## Environment Variables
### `CORS_ORIGINS`
Controls which domains are allowed to access the API endpoints.
- **Default**: `*` (allows all origins - use only for development)
- **Production Example**: `https://yourdomain.com,https://app.yourdomain.com`
- **Format**: Comma-separated list of allowed origins
```bash
# Development (allows all origins)
export CORS_ORIGINS="*"
# Production (restrict to specific domains)
export CORS_ORIGINS="https://talk2me.example.com,https://app.example.com"
```
### `ADMIN_CORS_ORIGINS`
Controls which domains can access admin endpoints (more restrictive).
- **Default**: `http://localhost:*` (allows all localhost ports)
- **Production Example**: `https://admin.yourdomain.com`
- **Format**: Comma-separated list of allowed admin origins
```bash
# Development
export ADMIN_CORS_ORIGINS="http://localhost:*"
# Production
export ADMIN_CORS_ORIGINS="https://admin.talk2me.example.com"
```
## Configuration Details
The CORS configuration includes:
- **Allowed Methods**: GET, POST, OPTIONS
- **Allowed Headers**: Content-Type, Authorization, X-Requested-With, X-Admin-Token
- **Exposed Headers**: Content-Range, X-Content-Range
- **Credentials Support**: Enabled (supports cookies and authorization headers)
- **Max Age**: 3600 seconds (preflight requests cached for 1 hour)
## Endpoints
All endpoints have CORS enabled with the following configuration:
### Regular API Endpoints
- `/api/*`
- `/transcribe`
- `/translate`
- `/translate/stream`
- `/speak`
- `/get_audio/*`
- `/check_tts_server`
- `/update_tts_config`
- `/health/*`
### Admin Endpoints (More Restrictive)
- `/admin/*` - Uses `ADMIN_CORS_ORIGINS` instead of general `CORS_ORIGINS`
## Security Best Practices
1. **Never use `*` in production** - Always specify exact allowed origins
2. **Use HTTPS** - Always use HTTPS URLs in production CORS origins
3. **Separate admin origins** - Keep admin endpoints on a separate, more restrictive origin list
4. **Review regularly** - Periodically review and update allowed origins
## Example Configurations
### Local Development
```bash
export CORS_ORIGINS="*"
export ADMIN_CORS_ORIGINS="http://localhost:*"
```
### Staging Environment
```bash
export CORS_ORIGINS="https://staging.talk2me.com,https://staging-app.talk2me.com"
export ADMIN_CORS_ORIGINS="https://staging-admin.talk2me.com"
```
### Production Environment
```bash
export CORS_ORIGINS="https://talk2me.com,https://app.talk2me.com"
export ADMIN_CORS_ORIGINS="https://admin.talk2me.com"
```
### Mobile App Integration
```bash
# Include mobile app schemes if needed
export CORS_ORIGINS="https://talk2me.com,https://app.talk2me.com,capacitor://localhost,ionic://localhost"
```
## Testing CORS Configuration
You can test CORS configuration using curl:
```bash
# Test preflight request
curl -X OPTIONS https://your-api.com/api/transcribe \
  -H "Origin: https://allowed-origin.com" \
  -H "Access-Control-Request-Method: POST" \
  -H "Access-Control-Request-Headers: Content-Type" \
  -v

# Test actual request
curl -X POST https://your-api.com/api/transcribe \
  -H "Origin: https://allowed-origin.com" \
  -H "Content-Type: application/json" \
  -d '{"test": "data"}' \
  -v
```
## Troubleshooting
### CORS Errors in Browser Console
If you see CORS errors:
1. Check that the origin is included in `CORS_ORIGINS`
2. Ensure the URL protocol matches (http vs https)
3. Check for trailing slashes in origins
4. Verify environment variables are set correctly
### Common Issues
1. **"No 'Access-Control-Allow-Origin' header"**
- Origin not in allowed list
- Check `CORS_ORIGINS` environment variable
2. **"CORS policy: The request client is not a secure context"**
- Using HTTP instead of HTTPS
- Update to use HTTPS in production
3. **"CORS policy: Credentials flag is true, but Access-Control-Allow-Credentials is not 'true'"**
- This should not occur with current configuration
- Check that `supports_credentials` is True in CORS config
## Additional Resources
- [MDN CORS Documentation](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS)
- [Flask-CORS Documentation](https://flask-cors.readthedocs.io/)


@@ -1,460 +0,0 @@
# Error Logging Documentation
This document describes the comprehensive error logging system implemented in Talk2Me for debugging production issues.
## Overview
Talk2Me implements a structured logging system that provides:
- JSON-formatted structured logs for easy parsing
- Multiple log streams (app, errors, access, security, performance)
- Automatic log rotation to prevent disk space issues
- Request tracing with unique IDs
- Performance metrics collection
- Security event tracking
- Error deduplication and frequency tracking
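JSON log lines like the ones below can be produced with a small custom formatter (a minimal sketch assuming a `logging.Formatter` subclass; the app's real `error_logger` module may differ):

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (illustrative)."""
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge any structured context attached via extra_fields.
        entry.update(getattr(record, "extra_fields", {}))
        return json.dumps(entry)

logger = logging.getLogger("talk2me.demo")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.warning("Whisper model loaded successfully")
```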
## Log Types
### 1. Application Logs (`logs/talk2me.log`)
General application logs including info, warnings, and debug messages.
```json
{
  "timestamp": "2024-01-15T10:30:45.123Z",
  "level": "INFO",
  "logger": "talk2me",
  "message": "Whisper model loaded successfully",
  "app": "talk2me",
  "environment": "production",
  "hostname": "server-1",
  "thread": "MainThread",
  "process": 12345
}
```
### 2. Error Logs (`logs/errors.log`)
Dedicated error logging with full exception details and stack traces.
```json
{
  "timestamp": "2024-01-15T10:31:00.456Z",
  "level": "ERROR",
  "logger": "talk2me.errors",
  "message": "Error in transcribe: File too large",
  "exception": {
    "type": "ValueError",
    "message": "Audio file exceeds maximum size",
    "traceback": ["...full stack trace..."]
  },
  "request_id": "1234567890-abcdef",
  "endpoint": "transcribe",
  "method": "POST",
  "path": "/transcribe",
  "ip": "192.168.1.100"
}
```
### 3. Access Logs (`logs/access.log`)
HTTP request/response logging for traffic analysis.
```json
{
  "timestamp": "2024-01-15T10:32:00.789Z",
  "level": "INFO",
  "message": "request_complete",
  "request_id": "1234567890-abcdef",
  "method": "POST",
  "path": "/transcribe",
  "status": 200,
  "duration_ms": 1250,
  "content_length": 4096,
  "ip": "192.168.1.100",
  "user_agent": "Mozilla/5.0..."
}
```
### 4. Security Logs (`logs/security.log`)
Security-related events and suspicious activities.
```json
{
  "timestamp": "2024-01-15T10:33:00.123Z",
  "level": "WARNING",
  "message": "Security event: rate_limit_exceeded",
  "event": "rate_limit_exceeded",
  "severity": "warning",
  "ip": "192.168.1.100",
  "endpoint": "/transcribe",
  "attempts": 15,
  "blocked": true
}
```
### 5. Performance Logs (`logs/performance.log`)
Performance metrics and slow request tracking.
```json
{
  "timestamp": "2024-01-15T10:34:00.456Z",
  "level": "INFO",
  "message": "Performance metric: transcribe_audio",
  "metric": "transcribe_audio",
  "duration_ms": 2500,
  "function": "transcribe",
  "module": "app",
  "request_id": "1234567890-abcdef"
}
```
## Configuration
### Environment Variables
```bash
# Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
export LOG_LEVEL=INFO
# Log file paths
export LOG_FILE=logs/talk2me.log
export ERROR_LOG_FILE=logs/errors.log
# Log rotation settings
export LOG_MAX_BYTES=52428800 # 50MB
export LOG_BACKUP_COUNT=10 # Keep 10 backup files
# Environment
export FLASK_ENV=production
```
### Flask Configuration
```python
app.config.update({
    'LOG_LEVEL': 'INFO',
    'LOG_FILE': 'logs/talk2me.log',
    'ERROR_LOG_FILE': 'logs/errors.log',
    'LOG_MAX_BYTES': 50 * 1024 * 1024,
    'LOG_BACKUP_COUNT': 10
})
```
## Admin API Endpoints
### GET /admin/logs/errors
View recent error logs and error frequency statistics.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/errors
```
Response:
```json
{
  "error_summary": {
    "abc123def456": {
      "count_last_hour": 5,
      "last_seen": 1705320000
    }
  },
  "recent_errors": [...],
  "total_errors_logged": 150
}
```
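Keys like `abc123def456` suggest errors are deduplicated by a stable fingerprint. One way to sketch that (illustrative only; the field names `count_last_hour` and `last_seen` are taken from the response above):

```python
import hashlib
import time
from collections import defaultdict

def fingerprint(exc: BaseException) -> str:
    """Stable ID for 'the same' error: hash of type + message (illustrative)."""
    raw = f"{type(exc).__name__}:{exc}"
    return hashlib.sha256(raw.encode()).hexdigest()[:12]

error_summary = defaultdict(lambda: {"count_last_hour": 0, "last_seen": 0})

def record_error(exc: BaseException) -> None:
    entry = error_summary[fingerprint(exc)]
    entry["count_last_hour"] += 1
    entry["last_seen"] = int(time.time())

# Five occurrences of the same error collapse into one summary entry.
for _ in range(5):
    record_error(ValueError("Audio file exceeds maximum size"))
print(len(error_summary), next(iter(error_summary.values()))["count_last_hour"])  # 1 5
```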
### GET /admin/logs/performance
View performance metrics and slow requests.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/performance
```
Response:
```json
{
  "performance_metrics": {
    "transcribe_audio": {
      "avg_ms": 850.5,
      "max_ms": 3200,
      "min_ms": 125,
      "count": 1024
    }
  },
  "slow_requests": [
    {
      "metric": "transcribe_audio",
      "duration_ms": 3200,
      "timestamp": "2024-01-15T10:35:00Z"
    }
  ]
}
```
### GET /admin/logs/security
View security events and suspicious activities.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/logs/security
```
Response:
```json
{
  "security_events": [...],
  "event_summary": {
    "rate_limit_exceeded": 25,
    "suspicious_error": 3,
    "high_error_rate": 1
  },
  "total_events": 29
}
```
## Usage Patterns
### 1. Logging Errors with Context
```python
from error_logger import log_exception
try:
    # Some operation
    process_audio(file)
except Exception as e:
    log_exception(
        e,
        message="Failed to process audio",
        user_id=user.id,
        file_size=file.size,
        file_type=file.content_type
    )
```
### 2. Performance Monitoring
```python
from error_logger import log_performance
@log_performance('expensive_operation')
def process_large_file(file):
    # This will automatically log execution time
    return processed_data
```
### 3. Security Event Logging
```python
app.error_logger.log_security(
    'unauthorized_access',
    severity='warning',
    ip=request.remote_addr,
    attempted_resource='/admin',
    user_agent=request.headers.get('User-Agent')
)
```
### 4. Request Context
```python
from error_logger import log_context
with log_context(user_id=user.id, feature='translation'):
    # All logs within this context will include user_id and feature
    translate_text(text)
```
## Log Analysis
### Finding Specific Errors
```bash
# Find all authentication errors
grep '"error_type":"AuthenticationError"' logs/errors.log | jq .
# Find errors from specific IP
grep '"ip":"192.168.1.100"' logs/errors.log | jq .
# Find errors in last hour
grep "$(date -u -d '1 hour ago' +%Y-%m-%dT%H)" logs/errors.log | jq .
```
### Performance Analysis
```bash
# Find slow requests (>2000ms)
jq 'select(.extra_fields.duration_ms > 2000)' logs/performance.log
# Calculate average response time for endpoint
jq 'select(.extra_fields.metric == "transcribe_audio") | .extra_fields.duration_ms' logs/performance.log | awk '{sum+=$1; count++} END {print sum/count}'
```
### Security Monitoring
```bash
# Count security events by type
jq '.extra_fields.event' logs/security.log | sort | uniq -c
# Find all blocked IPs
jq 'select(.extra_fields.blocked == true) | .extra_fields.ip' logs/security.log | sort -u
```
## Log Rotation
Logs are automatically rotated based on size or time:
- **Application/Error logs**: Rotate at 50MB, keep 10 backups
- **Access logs**: Daily rotation, keep 30 days
- **Performance logs**: Hourly rotation, keep 7 days
- **Security logs**: Rotate at 50MB, keep 10 backups
Rotated logs are named with numeric suffixes:
- `talk2me.log` (current)
- `talk2me.log.1` (most recent backup)
- `talk2me.log.2` (older backup)
- etc.
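The size-based rotation described here maps directly onto the standard library's `RotatingFileHandler` (a sketch under that assumption; the temporary directory stands in for `logs/`):

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

log_dir = tempfile.mkdtemp()  # stand-in for the real logs/ directory
path = os.path.join(log_dir, "talk2me.log")

handler = RotatingFileHandler(
    path,
    maxBytes=50 * 1024 * 1024,  # rotate at 50MB
    backupCount=10,             # keep talk2me.log.1 .. talk2me.log.10
)
logger = logging.getLogger("talk2me.rotation-demo")
logger.addHandler(handler)
logger.warning("hello")
handler.flush()
print(os.path.exists(path))  # True
```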
## Best Practices
### 1. Structured Logging
Always include relevant context:
```python
logger.info("User action completed", extra={
    'extra_fields': {
        'user_id': user.id,
        'action': 'upload_audio',
        'file_size': file.size,
        'duration_ms': processing_time
    }
})
```
### 2. Error Handling
Log errors at appropriate levels:
```python
try:
    result = risky_operation()
except ValidationError as e:
    logger.warning(f"Validation failed: {e}")  # Expected errors
except Exception as e:
    logger.error(f"Unexpected error: {e}", exc_info=True)  # Unexpected errors
```
### 3. Performance Tracking
Track key operations:
```python
start = time.time()
result = expensive_operation()
duration = (time.time() - start) * 1000
app.error_logger.log_performance(
    'expensive_operation',
    value=duration,
    input_size=len(data),
    output_size=len(result)
)
```
### 4. Security Awareness
Log security-relevant events:
```python
if failed_attempts > 3:
    app.error_logger.log_security(
        'multiple_failed_attempts',
        severity='warning',
        ip=request.remote_addr,
        attempts=failed_attempts
    )
```
## Monitoring Integration
### Prometheus Metrics
Export log metrics for Prometheus:
```python
@app.route('/metrics')
def prometheus_metrics():
    error_summary = app.error_logger.get_error_summary()
    # Format as Prometheus metrics
    return format_prometheus_metrics(error_summary)
```
### ELK Stack
Ship logs to Elasticsearch:
```yaml
filebeat.inputs:
  - type: log
    paths:
      - /app/logs/*.log
    json.keys_under_root: true
    json.add_error_key: true
```
### CloudWatch
For AWS deployments:
```python
# Install boto3 and watchtower
import watchtower
cloudwatch_handler = watchtower.CloudWatchLogHandler()
logger.addHandler(cloudwatch_handler)
```
## Troubleshooting
### Common Issues
#### 1. Logs Not Being Written
Check permissions:
```bash
ls -la logs/
# Should show write permissions for app user
```
Create logs directory:
```bash
mkdir -p logs
chmod 755 logs
```
#### 2. Disk Space Issues
Monitor log sizes:
```bash
du -sh logs/*
```
Force rotation:
```bash
# Manually rotate logs
mv logs/talk2me.log logs/talk2me.log.backup
# App will create new log file
```
#### 3. Performance Impact
If logging impacts performance:
- Increase LOG_LEVEL to WARNING or ERROR
- Reduce backup count
- Use asynchronous logging (future enhancement)
## Security Considerations
1. **Log Sanitization**: Sensitive data is automatically masked
2. **Access Control**: Admin endpoints require authentication
3. **Log Retention**: Old logs are automatically deleted
4. **Encryption**: Consider encrypting logs at rest in production
5. **Audit Trail**: All log access is itself logged
## Future Enhancements
1. **Centralized Logging**: Ship logs to centralized service
2. **Real-time Alerts**: Trigger alerts on error patterns
3. **Log Analytics**: Built-in log analysis dashboard
4. **Correlation IDs**: Track requests across microservices
5. **Async Logging**: Reduce performance impact


@@ -1,68 +0,0 @@
# GPU Support for Talk2Me
## Current GPU Support Status
### ✅ NVIDIA GPUs (Full Support)
- **Requirements**: CUDA 11.x or 12.x
- **Optimizations**:
- TensorFloat-32 (TF32) for Ampere GPUs (RTX 30xx, A100)
- cuDNN auto-tuning
- Half-precision (FP16) inference
- CUDA kernel pre-caching
- Memory pre-allocation
### ⚠️ AMD GPUs (Limited Support)
- **Requirements**: ROCm 5.x installation
- **Status**: Falls back to CPU unless ROCm is properly configured
- **To enable AMD GPU**:
```bash
# Install PyTorch with ROCm support
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
```
- **Limitations**:
- No cuDNN optimizations
- May have compatibility issues
- Performance varies by GPU model
### ✅ Apple Silicon (M1/M2/M3)
- **Requirements**: macOS 12.3+
- **Status**: Uses Metal Performance Shaders (MPS)
- **Optimizations**:
- Native Metal acceleration
- Unified memory architecture benefits
- No FP16 (not well supported on MPS yet)
### 📊 Performance Comparison
| GPU Type | First Transcription | Subsequent | Notes |
|----------|-------------------|------------|-------|
| NVIDIA RTX 3080 | ~2s | ~0.5s | Full optimizations |
| AMD RX 6800 XT | ~3-4s | ~1-2s | With ROCm |
| Apple M2 | ~2.5s | ~1s | MPS acceleration |
| CPU (i7-12700K) | ~5-10s | ~5-10s | No acceleration |
## Checking Your GPU Status
Run the app and check the logs:
```
INFO: NVIDIA GPU detected - using CUDA acceleration
INFO: GPU memory allocated: 542.00 MB
INFO: Whisper model loaded and optimized for NVIDIA GPU
```
## Troubleshooting
### AMD GPU Not Detected
1. Install ROCm-compatible PyTorch
2. Set environment variable: `export HSA_OVERRIDE_GFX_VERSION=10.3.0`
3. Check with: `rocm-smi`
### NVIDIA GPU Not Used
1. Check CUDA installation: `nvidia-smi`
2. Verify PyTorch CUDA: `python -c "import torch; print(torch.cuda.is_available())"`
3. Install CUDA toolkit if needed
### Apple Silicon Not Accelerated
1. Update macOS to 12.3+
2. Update PyTorch: `pip install --upgrade torch`
3. Check MPS: `python -c "import torch; print(torch.backends.mps.is_available())"`


@@ -1,285 +0,0 @@
# Memory Management Documentation
This document describes the comprehensive memory management system implemented in Talk2Me to prevent memory leaks and crashes after extended use.
## Overview
Talk2Me implements a dual-layer memory management system:
1. **Backend (Python)**: Manages GPU memory, Whisper model, and temporary files
2. **Frontend (JavaScript)**: Manages audio blobs, object URLs, and Web Audio contexts
## Memory Leak Issues Addressed
### Backend Memory Leaks
1. **GPU Memory Fragmentation**
- Whisper model accumulates GPU memory over time
- Solution: Periodic GPU cache clearing and model reloading
2. **Temporary File Accumulation**
- Audio files not cleaned up quickly enough under load
- Solution: Aggressive cleanup with tracking and periodic sweeps
3. **Session Resource Leaks**
- Long-lived sessions accumulate resources
- Solution: Integration with session manager for resource limits
### Frontend Memory Leaks
1. **Audio Blob Leaks**
- MediaRecorder chunks kept in memory
- Solution: SafeMediaRecorder wrapper with automatic cleanup
2. **Object URL Leaks**
- URLs created but not revoked
- Solution: Centralized tracking and automatic revocation
3. **AudioContext Leaks**
- Contexts created but never closed
- Solution: MemoryManager tracks and closes contexts
4. **MediaStream Leaks**
- Microphone streams not properly stopped
- Solution: Automatic track stopping and stream cleanup
## Backend Memory Management
### MemoryManager Class
The `MemoryManager` monitors and manages memory usage:
```python
memory_manager = MemoryManager(app, {
    'memory_threshold_mb': 4096,      # 4GB process memory limit
    'gpu_memory_threshold_mb': 2048,  # 2GB GPU memory limit
    'cleanup_interval': 30            # Check every 30 seconds
})
```
### Features
1. **Automatic Monitoring**
- Background thread checks memory usage
- Triggers cleanup when thresholds exceeded
- Logs statistics every 5 minutes
2. **GPU Memory Management**
- Clears CUDA cache after each operation
- Reloads Whisper model if fragmentation detected
- Tracks reload count and timing
3. **Temporary File Cleanup**
- Tracks all temporary files
- Age-based cleanup (5 minutes normal, 1 minute aggressive)
- Cleanup on process exit
4. **Context Managers**
```python
with AudioProcessingContext(memory_manager) as ctx:
    # Process audio
    ctx.add_temp_file(temp_path)
# Files automatically cleaned up
```
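The age-based cleanup can be sketched as a periodic sweep over the temp directory (a hypothetical helper; the real `MemoryManager` also tracks registered files explicitly):

```python
import os
import tempfile
import time
from pathlib import Path

def sweep_temp_files(directory: str, max_age_s: float = 300.0) -> int:
    """Delete files older than max_age_s; return how many were removed."""
    removed = 0
    for path in Path(directory).iterdir():
        if path.is_file() and time.time() - path.stat().st_mtime > max_age_s:
            path.unlink()
            removed += 1
    return removed

# Demo: one stale file (backdated 10 minutes), one fresh file.
d = tempfile.mkdtemp()
stale, fresh = Path(d, "old.wav"), Path(d, "new.wav")
stale.touch()
fresh.touch()
os.utime(stale, (time.time() - 600, time.time() - 600))
print(sweep_temp_files(d))  # 1
```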
### Admin Endpoints
- `GET /admin/memory` - View current memory statistics
- `POST /admin/memory/cleanup` - Trigger manual cleanup
## Frontend Memory Management
### MemoryManager Class
Centralized tracking of all browser resources:
```typescript
const memoryManager = MemoryManager.getInstance();
// Register resources
memoryManager.registerAudioContext(context);
memoryManager.registerObjectURL(url);
memoryManager.registerMediaStream(stream);
```
### SafeMediaRecorder
Wrapper for MediaRecorder with automatic cleanup:
```typescript
const recorder = new SafeMediaRecorder();
await recorder.start(constraints);
// Recording...
const blob = await recorder.stop(); // Automatically cleans up
```
### AudioBlobHandler
Safe handling of audio blobs and object URLs:
```typescript
const handler = new AudioBlobHandler(blob);
const url = handler.getObjectURL(); // Tracked automatically
// Use URL...
handler.cleanup(); // Revokes URL and clears references
```
## Memory Thresholds
### Backend Thresholds
| Resource | Default Limit | Configurable Via |
|----------|--------------|------------------|
| Process Memory | 4096 MB | MEMORY_THRESHOLD_MB |
| GPU Memory | 2048 MB | GPU_MEMORY_THRESHOLD_MB |
| Temp File Age | 300 seconds | Built-in |
| Model Reload Interval | 300 seconds | Built-in |
### Frontend Thresholds
| Resource | Cleanup Trigger |
|----------|----------------|
| Closed AudioContexts | Every 30 seconds |
| Stopped MediaStreams | Every 30 seconds |
| Orphaned Object URLs | On navigation/unload |
## Best Practices
### Backend
1. **Use Context Managers**
```python
@with_memory_management
def process_audio():
    # Automatic cleanup
    ...
```
2. **Register Temporary Files**
```python
register_temp_file(path)
ctx.add_temp_file(path)
```
3. **Clear GPU Memory**
```python
torch.cuda.empty_cache()
torch.cuda.synchronize()
```
### Frontend
1. **Use Safe Wrappers**
```typescript
// Don't use raw MediaRecorder
const recorder = new SafeMediaRecorder();
```
2. **Clean Up Handlers**
```typescript
if (audioHandler) {
  audioHandler.cleanup();
}
```
3. **Register All Resources**
```typescript
const context = new AudioContext();
memoryManager.registerAudioContext(context);
```
## Monitoring
### Backend Monitoring
```bash
# View memory stats
curl -H "X-Admin-Token: token" http://localhost:5005/admin/memory
# Response
{
  "memory": {
    "process_mb": 850.5,
    "system_percent": 45.2,
    "gpu_mb": 1250.0,
    "gpu_percent": 61.0
  },
  "temp_files": {
    "count": 5,
    "size_mb": 12.5
  },
  "model": {
    "reload_count": 2,
    "last_reload": "2024-01-15T10:30:00"
  }
}
```
### Frontend Monitoring
```javascript
// Get memory stats
const stats = memoryManager.getStats();
console.log('Active contexts:', stats.audioContexts);
console.log('Object URLs:', stats.objectURLs);
```
## Troubleshooting
### High Memory Usage
1. **Check Current Usage**
```bash
curl -H "X-Admin-Token: token" http://localhost:5005/admin/memory
```
2. **Trigger Manual Cleanup**
```bash
curl -X POST -H "X-Admin-Token: token" \
http://localhost:5005/admin/memory/cleanup
```
3. **Check Logs**
```bash
grep "Memory" logs/talk2me.log
grep "GPU memory" logs/talk2me.log
```
### Memory Leak Symptoms
1. **Backend**
- Process memory continuously increasing
- GPU memory not returning to baseline
- Temp files accumulating in upload folder
- Slower transcription over time
2. **Frontend**
- Browser tab memory increasing
- Page becoming unresponsive
- Audio playback issues
- Console errors about contexts
### Debug Mode
Enable debug logging:
```python
# Backend
app.config['DEBUG_MEMORY'] = True
# Frontend (in console)
localStorage.setItem('DEBUG_MEMORY', 'true');
```
## Performance Impact
Memory management adds minimal overhead:
- Backend: ~30ms per cleanup cycle
- Frontend: <5ms per resource registration
- Cleanup operations are non-blocking
- Model reloading takes ~2-3 seconds (rare)
## Future Enhancements
1. **Predictive Cleanup**: Clean resources based on usage patterns
2. **Memory Pooling**: Reuse audio buffers and contexts
3. **Distributed Memory**: Share memory stats across instances
4. **Alert System**: Notify admins of memory issues
5. **Auto-scaling**: Scale resources based on memory pressure


@@ -1,435 +0,0 @@
# Production Deployment Guide
This guide covers deploying Talk2Me in a production environment using a proper WSGI server.
## Overview
The Flask development server is not suitable for production use. This guide covers:
- Gunicorn as the WSGI server
- Nginx as a reverse proxy
- Docker for containerization
- Systemd for process management
- Security best practices
## Quick Start with Docker
### 1. Using Docker Compose
```bash
# Clone the repository
git clone https://github.com/your-repo/talk2me.git
cd talk2me
# Create .env file with production settings
cat > .env <<EOF
TTS_API_KEY=your-api-key
ADMIN_TOKEN=your-secure-admin-token
SECRET_KEY=your-secure-secret-key
POSTGRES_PASSWORD=your-secure-db-password
EOF
# Build and start services
docker-compose up -d
# Check status
docker-compose ps
docker-compose logs -f talk2me
```
### 2. Using Docker (standalone)
```bash
# Build the image
docker build -t talk2me .
# Run the container
docker run -d \
  --name talk2me \
  -p 5005:5005 \
  -e TTS_API_KEY=your-api-key \
  -e ADMIN_TOKEN=your-secure-token \
  -e SECRET_KEY=your-secure-key \
  -v $(pwd)/logs:/app/logs \
  talk2me
```
## Manual Deployment
### 1. System Requirements
- Ubuntu 20.04+ or similar Linux distribution
- Python 3.8+
- Nginx
- Systemd
- 4GB+ RAM recommended
- GPU (optional, for faster transcription)
### 2. Installation
Run the deployment script as root:
```bash
sudo ./deploy.sh
```
Or manually:
```bash
# Install system dependencies
sudo apt-get update
sudo apt-get install -y python3-pip python3-venv nginx
# Create application user
sudo useradd -m -s /bin/bash talk2me
# Create directories
sudo mkdir -p /opt/talk2me /var/log/talk2me
sudo chown talk2me:talk2me /opt/talk2me /var/log/talk2me
# Copy application files
sudo cp -r . /opt/talk2me/
sudo chown -R talk2me:talk2me /opt/talk2me
# Install Python dependencies
sudo -u talk2me python3 -m venv /opt/talk2me/venv
sudo -u talk2me /opt/talk2me/venv/bin/pip install -r requirements-prod.txt
# Configure and start services
sudo cp talk2me.service /etc/systemd/system/
sudo systemctl enable talk2me
sudo systemctl start talk2me
```
## Gunicorn Configuration
The `gunicorn_config.py` file contains production-ready settings:
### Worker Configuration
```python
# Number of worker processes
workers = multiprocessing.cpu_count() * 2 + 1
# Worker timeout (increased for audio processing)
timeout = 120
# Restart workers periodically to prevent memory leaks
max_requests = 1000
max_requests_jitter = 50
```
### Performance Tuning
For different workloads:
```bash
# CPU-bound (transcription heavy)
export GUNICORN_WORKERS=8
export GUNICORN_THREADS=1
# I/O-bound (many concurrent requests)
export GUNICORN_WORKERS=4
export GUNICORN_THREADS=4
export GUNICORN_WORKER_CLASS=gthread
# Async (best concurrency)
export GUNICORN_WORKER_CLASS=gevent
export GUNICORN_WORKER_CONNECTIONS=1000
```
## Nginx Configuration
### Basic Setup
The provided `nginx.conf` includes:
- Reverse proxy to Gunicorn
- Static file serving
- WebSocket support
- Security headers
- Gzip compression
### SSL/TLS Setup
```nginx
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;
# Strong SSL configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
ssl_prefer_server_ciphers off;
# HSTS
add_header Strict-Transport-Security "max-age=63072000" always;
}
```
## Environment Variables
### Required
```bash
# Security
SECRET_KEY=your-very-secure-secret-key
ADMIN_TOKEN=your-admin-api-token
# TTS Configuration
TTS_API_KEY=your-tts-api-key
TTS_SERVER_URL=http://your-tts-server:5050/v1/audio/speech
# Flask
FLASK_ENV=production
```
### Optional
```bash
# Performance
GUNICORN_WORKERS=4
GUNICORN_THREADS=2
MEMORY_THRESHOLD_MB=4096
GPU_MEMORY_THRESHOLD_MB=2048
# Database (for session storage)
DATABASE_URL=postgresql://user:pass@localhost/talk2me
REDIS_URL=redis://localhost:6379/0
# Monitoring
SENTRY_DSN=your-sentry-dsn
```
## Monitoring
### Health Checks
```bash
# Basic health check
curl http://localhost:5005/health
# Detailed health check
curl http://localhost:5005/health/detailed
# Memory usage
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/memory
```
### Logs
```bash
# Application logs
tail -f /var/log/talk2me/talk2me.log
# Error logs
tail -f /var/log/talk2me/errors.log
# Gunicorn logs
journalctl -u talk2me -f
# Nginx logs
tail -f /var/log/nginx/access.log
tail -f /var/log/nginx/error.log
```
### Metrics
With Prometheus client installed:
```bash
# Prometheus metrics endpoint
curl http://localhost:5005/metrics
```
## Scaling
### Horizontal Scaling
For multiple servers:
1. Use Redis for session storage
2. Use PostgreSQL for persistent data
3. Load balance with Nginx:
```nginx
upstream talk2me_backends {
least_conn;
server server1:5005 weight=1;
server server2:5005 weight=1;
server server3:5005 weight=1;
}
```
### Vertical Scaling
Adjust based on load:
```bash
# High memory usage
MEMORY_THRESHOLD_MB=8192
GPU_MEMORY_THRESHOLD_MB=4096
# More workers
GUNICORN_WORKERS=16
GUNICORN_THREADS=4
# Larger file limits
client_max_body_size 100M;
```
## Security
### Firewall
```bash
# Allow only necessary ports
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 22/tcp
sudo ufw enable
```
### File Permissions
```bash
# Secure file permissions
sudo chmod 750 /opt/talk2me
sudo chmod 640 /opt/talk2me/.env
sudo chmod 755 /opt/talk2me/static
```
### AppArmor/SELinux
Create security profiles to restrict application access.
## Backup
### Database Backup
```bash
# PostgreSQL
pg_dump talk2me > backup.sql
# Redis
redis-cli BGSAVE
```
### Application Backup
```bash
# Backup application and logs
tar -czf talk2me-backup.tar.gz \
/opt/talk2me \
/var/log/talk2me \
/etc/systemd/system/talk2me.service \
/etc/nginx/sites-available/talk2me
```
## Troubleshooting
### Service Won't Start
```bash
# Check service status
systemctl status talk2me
# Check logs
journalctl -u talk2me -n 100
# Test configuration
sudo -u talk2me /opt/talk2me/venv/bin/gunicorn --check-config wsgi:application
```
### High Memory Usage
```bash
# Trigger cleanup
curl -X POST -H "X-Admin-Token: token" http://localhost:5005/admin/memory/cleanup
# Restart workers
systemctl reload talk2me
```
### Slow Response Times
1. Check worker count
2. Enable async workers
3. Check GPU availability
4. Review nginx buffering settings
## Performance Optimization
### 1. Enable GPU
Ensure CUDA/ROCm is properly installed:
```bash
# Check GPU
nvidia-smi # or rocm-smi
# Set in environment
export CUDA_VISIBLE_DEVICES=0
```
### 2. Optimize Workers
```python
# For CPU-heavy workloads
workers = cpu_count()
threads = 1
# For I/O-heavy workloads
workers = cpu_count() * 2
threads = 4
```
### 3. Enable Caching
Use Redis for caching translations:
```python
CACHE_TYPE = 'redis'
CACHE_REDIS_URL = 'redis://localhost:6379/0'
```
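A minimal sketch of such a translation cache, with the cache client injected so the same code works against a `redis.Redis` instance or an in-memory stand-in. The function and key names here are illustrative, not the app's actual API:

```python
import hashlib

def cached_translate(cache, text, source_lang, target_lang, translate_fn, ttl=86400):
    """Return a cached translation if present; otherwise translate and store it.

    `cache` needs only get(key) and setex(key, ttl, value) -- e.g. redis.Redis
    connected to redis://localhost:6379/0, or an in-memory stand-in for tests.
    """
    key = "xlate:" + hashlib.sha256(
        f"{source_lang}|{target_lang}|{text}".encode()
    ).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return hit.decode() if isinstance(hit, bytes) else hit
    result = translate_fn(text, source_lang, target_lang)
    cache.setex(key, ttl, result)  # expire after `ttl` seconds
    return result
```

Keying on a hash of the full (text, language pair) tuple keeps keys bounded in size while making repeated translations of identical input free.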
## Maintenance
### Regular Tasks
1. **Log Rotation**: Configured automatically
2. **Database Cleanup**: Run weekly
3. **Model Updates**: Check for Whisper updates
4. **Security Updates**: Keep dependencies updated
### Update Procedure
```bash
# Backup first
./backup.sh
# Update code
git pull
# Update dependencies
sudo -u talk2me /opt/talk2me/venv/bin/pip install -r requirements-prod.txt
# Restart service
sudo systemctl restart talk2me
```
## Rollback
If deployment fails:
```bash
# Stop service
sudo systemctl stop talk2me
# Restore backup
tar -xzf talk2me-backup.tar.gz -C /
# Restart service
sudo systemctl start talk2me
```


@@ -1,235 +0,0 @@
# Rate Limiting Documentation
This document describes the rate limiting implementation in Talk2Me to protect against DoS attacks and resource exhaustion.
## Overview
Talk2Me implements a comprehensive rate limiting system with:
- Token bucket algorithm with sliding window
- Per-endpoint configurable limits
- IP-based blocking (temporary and permanent)
- Global request limits
- Concurrent request throttling
- Request size validation
## Rate Limits by Endpoint
### Transcription (`/transcribe`)
- **Per Minute**: 10 requests
- **Per Hour**: 100 requests
- **Burst Size**: 3 requests
- **Max Request Size**: 10MB
- **Token Refresh**: 1 token per 6 seconds
### Translation (`/translate`)
- **Per Minute**: 20 requests
- **Per Hour**: 300 requests
- **Burst Size**: 5 requests
- **Max Request Size**: 100KB
- **Token Refresh**: 1 token per 3 seconds
### Streaming Translation (`/translate/stream`)
- **Per Minute**: 10 requests
- **Per Hour**: 150 requests
- **Burst Size**: 3 requests
- **Max Request Size**: 100KB
- **Token Refresh**: 1 token per 6 seconds
### Text-to-Speech (`/speak`)
- **Per Minute**: 15 requests
- **Per Hour**: 200 requests
- **Burst Size**: 3 requests
- **Max Request Size**: 50KB
- **Token Refresh**: 1 token per 4 seconds
### API Endpoints
- Push notifications and error logging: various per-endpoint limits (see `rate_limiter.py`)
## Global Limits
- **Total Requests Per Minute**: 1,000 (across all endpoints)
- **Total Requests Per Hour**: 10,000
- **Concurrent Requests**: 50 maximum
## Rate Limiting Headers
Successful responses include:
```
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 15
X-RateLimit-Reset: 1234567890
```
Rate limited responses (429) include:
```
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1234567890
Retry-After: 60
```
## Client Identification
Clients are identified by:
- IP address (including X-Forwarded-For support)
- User-Agent string
- Combined hash for uniqueness
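A sketch of this fingerprinting scheme (illustrative; the exact hash and truncation in `rate_limiter.py` may differ):

```python
import hashlib

def client_id(remote_addr, forwarded_for, user_agent):
    """Fingerprint a client from its effective IP and User-Agent.

    X-Forwarded-For may carry a chain "client, proxy1, proxy2";
    the first entry is the originating client.
    """
    ip = forwarded_for.split(",")[0].strip() if forwarded_for else remote_addr
    return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()[:16]
```

Combining IP and User-Agent distinguishes multiple clients behind one NAT while keeping a single client's identity stable across requests.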
## Automatic Blocking
IPs are temporarily blocked for 1 hour if:
- They exceed 100 requests per minute
- They repeatedly hit rate limits
- They exhibit suspicious patterns
## Configuration
### Environment Variables
```bash
# No direct environment variables for rate limiting
# Configured in code - can be extended to use env vars
```
### Programmatic Configuration
Rate limits can be adjusted in `rate_limiter.py`:
```python
self.endpoint_limits = {
'/transcribe': {
'requests_per_minute': 10,
'requests_per_hour': 100,
'burst_size': 3,
'token_refresh_rate': 0.167,
'max_request_size': 10 * 1024 * 1024 # 10MB
}
}
```
## Admin Endpoints
### Get Rate Limit Configuration
```bash
curl -H "X-Admin-Token: your-admin-token" \
http://localhost:5005/admin/rate-limits
```
### Get Rate Limit Statistics
```bash
# Global stats
curl -H "X-Admin-Token: your-admin-token" \
http://localhost:5005/admin/rate-limits/stats
# Client-specific stats
curl -H "X-Admin-Token: your-admin-token" \
http://localhost:5005/admin/rate-limits/stats?client_id=abc123
```
### Block IP Address
```bash
# Temporary block (1 hour)
curl -X POST -H "X-Admin-Token: your-admin-token" \
-H "Content-Type: application/json" \
-d '{"ip": "192.168.1.100", "duration": 3600}' \
http://localhost:5005/admin/block-ip
# Permanent block
curl -X POST -H "X-Admin-Token: your-admin-token" \
-H "Content-Type: application/json" \
-d '{"ip": "192.168.1.100", "permanent": true}' \
http://localhost:5005/admin/block-ip
```
## Algorithm Details
### Token Bucket
- Each client gets a bucket with configurable burst size
- Tokens regenerate at a fixed rate
- Requests consume tokens
- Empty bucket = request denied
### Sliding Window
- Tracks requests in the last minute and hour
- More accurate than fixed windows
- Prevents gaming the system at window boundaries
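The token bucket half of the algorithm can be sketched as follows, using the `/transcribe` settings from above (burst size 3, refresh rate 0.167 tokens/second):

```python
import time

class TokenBucket:
    """Minimal token bucket: capacity `burst_size`, refilled at `refresh_rate` tokens/sec."""

    def __init__(self, burst_size, refresh_rate):
        self.capacity = burst_size
        self.tokens = float(burst_size)
        self.refresh_rate = refresh_rate
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Regenerate tokens for the elapsed interval, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refresh_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # empty bucket = request denied

# /transcribe settings: burst of 3, one token every ~6 seconds
bucket = TokenBucket(burst_size=3, refresh_rate=0.167)
results = [bucket.allow() for _ in range(5)]  # burst absorbed, then denied
```

The production limiter layers the sliding-window counters on top of this bucket; the sketch shows only the burst/refill mechanics.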
## Best Practices
### For Users
1. Implement exponential backoff when receiving 429 errors
2. Check rate limit headers to avoid hitting limits
3. Cache responses when possible
4. Use bulk operations where available
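The exponential backoff recommended in step 1 can be sketched like this, where `request_fn` is a hypothetical caller-supplied function returning a dict with the response status and, on 429, the `Retry-After` value:

```python
import random
import time

def with_backoff(request_fn, max_retries=5, base_delay=1.0):
    """Call request_fn, retrying on 429 responses with exponential backoff."""
    for attempt in range(max_retries):
        response = request_fn()
        if response.get("status") != 429:
            return response
        # Prefer the server's Retry-After hint; otherwise back off
        # exponentially (1s, 2s, 4s, ...) with jitter to avoid thundering herds
        delay = response.get("retry_after") or (
            base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
        )
        time.sleep(delay)
    raise RuntimeError("still rate limited after %d retries" % max_retries)
```

Honoring `Retry-After` when the server provides it converges faster than blind doubling and keeps clients within the published limits.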
### For Administrators
1. Monitor rate limit statistics regularly
2. Adjust limits based on usage patterns
3. Use IP blocking sparingly
4. Set up alerts for suspicious activity
## Error Responses
### Rate Limited (429)
```json
{
"error": "Rate limit exceeded (per minute)",
"retry_after": 60
}
```
### Request Too Large (413)
```json
{
"error": "Request too large"
}
```
### IP Blocked (429)
```json
{
"error": "IP temporarily blocked due to excessive requests"
}
```
## Monitoring
Key metrics to monitor:
- Rate limit hits by endpoint
- Blocked IPs
- Concurrent request peaks
- Request size violations
- Global limit approaches
## Performance Impact
- Minimal overhead (~1-2ms per request)
- Memory usage scales with active clients
- Automatic cleanup of old buckets
- Thread-safe implementation
## Security Considerations
1. **DoS Protection**: Prevents resource exhaustion
2. **Burst Control**: Limits sudden traffic spikes
3. **Size Validation**: Prevents large payload attacks
4. **IP Blocking**: Stops persistent attackers
5. **Global Limits**: Protects overall system capacity
## Troubleshooting
### "Rate limit exceeded" errors
- Check client request patterns
- Verify time synchronization
- Look for retry loops
- Check IP blocking status
### Memory usage increasing
- Verify cleanup thread is running
- Check for client ID explosion
- Monitor bucket count
### Legitimate users blocked
- Review rate limit settings
- Check for shared IP issues
- Implement IP whitelisting if needed

README.md

@@ -1,9 +1,30 @@
# Voice Language Translator
# Talk2Me - Real-Time Voice Language Translator
A mobile-friendly web application that translates spoken language between multiple languages using:
- Gemma 3 open-source LLM via Ollama for translation
- OpenAI Whisper for speech-to-text
- OpenAI Edge TTS for text-to-speech
A production-ready, mobile-friendly web application that provides real-time translation of spoken language between multiple languages.
## Features
- **Real-time Speech Recognition**: Powered by OpenAI Whisper with GPU acceleration
- **Advanced Translation**: Using Gemma 3 open-source LLM via Ollama
- **Natural Text-to-Speech**: OpenAI Edge TTS for lifelike voice output
- **Progressive Web App**: Full offline support with service workers
- **Multi-Speaker Support**: Track and translate conversations with multiple participants
- **Enterprise Security**: Comprehensive rate limiting, session management, and encrypted secrets
- **Production Ready**: Docker support, load balancing, and extensive monitoring
## Table of Contents
- [Supported Languages](#supported-languages)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Security Features](#security-features)
- [Production Deployment](#production-deployment)
- [API Documentation](#api-documentation)
- [Development](#development)
- [Monitoring & Operations](#monitoring--operations)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
## Supported Languages
@@ -22,68 +43,135 @@ A mobile-friendly web application that translates spoken language between multip
- Turkish
- Uzbek
## Setup Instructions
## Quick Start
1. Install the required Python packages:
```
```bash
# Clone the repository
git clone https://github.com/yourusername/talk2me.git
cd talk2me
# Install dependencies
pip install -r requirements.txt
npm install
# Initialize secure configuration
python manage_secrets.py init
python manage_secrets.py set TTS_API_KEY your-api-key-here
# Ensure Ollama is running with Gemma
ollama pull gemma2:9b
ollama pull gemma3:27b
# Start the application
python app.py
```
Open your browser and navigate to `http://localhost:5005`
## Installation
### Prerequisites
- Python 3.8+
- Node.js 14+
- Ollama (for LLM translation)
- OpenAI Edge TTS server
- Optional: NVIDIA GPU with CUDA, AMD GPU with ROCm, or Apple Silicon
### Detailed Setup
1. **Install Python dependencies**:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
2. Configure secrets and environment:
2. **Install Node.js dependencies**:
```bash
# Initialize secure secrets management
python manage_secrets.py init
# Set required secrets
python manage_secrets.py set TTS_API_KEY
# Or use traditional .env file
cp .env.example .env
nano .env
npm install
npm run build # Build TypeScript files
```
**⚠️ Security Note**: Talk2Me includes encrypted secrets management. See [SECURITY.md](SECURITY.md) and [SECRETS_MANAGEMENT.md](SECRETS_MANAGEMENT.md) for details.
3. **Configure GPU Support** (Optional):
```bash
# For NVIDIA GPUs
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
3. Make sure you have Ollama installed and the Gemma 3 model loaded:
```
ollama pull gemma3
# For AMD GPUs (ROCm)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
# For Apple Silicon
pip install torch torchvision torchaudio
```
4. Ensure your OpenAI Edge TTS server is running on port 5050.
4. **Set up Ollama**:
```bash
# Install Ollama (https://ollama.ai)
curl -fsSL https://ollama.ai/install.sh | sh
5. Run the application:
```
python app.py
# Pull required models
ollama pull gemma2:9b # Faster, for streaming
ollama pull gemma3:27b # Better quality
```
6. Open your browser and navigate to:
```
http://localhost:8000
```
5. **Configure TTS Server**:
Ensure your OpenAI Edge TTS server is running. By default it is expected at `http://localhost:5050`
## Usage
## Configuration
1. Select your source language from the dropdown menu
2. Press the microphone button and speak
3. Press the button again to stop recording
4. Wait for the transcription to complete
5. Select your target language
6. Press the "Translate" button
7. Use the play buttons to hear the original or translated text
### Environment Variables
## Technical Details
Talk2Me uses encrypted secrets management for sensitive configuration. You can use either the secure secrets system or traditional environment variables.
- The app uses Flask for the web server
- Audio is processed client-side using the MediaRecorder API
- Whisper for speech recognition with language hints
- Ollama provides access to the Gemma 3 model for translation
- OpenAI Edge TTS delivers natural-sounding speech output
#### Using Secure Secrets Management (Recommended)
## CORS Configuration
```bash
# Initialize the secrets system
python manage_secrets.py init
The application supports Cross-Origin Resource Sharing (CORS) for secure cross-origin usage. See [CORS_CONFIG.md](CORS_CONFIG.md) for detailed configuration instructions.
# Set required secrets
python manage_secrets.py set TTS_API_KEY
python manage_secrets.py set TTS_SERVER_URL
python manage_secrets.py set ADMIN_TOKEN
# List all secrets
python manage_secrets.py list
# Rotate encryption keys
python manage_secrets.py rotate
```
#### Using Environment Variables
Create a `.env` file:
```env
# Core Configuration
TTS_API_KEY=your-api-key-here
TTS_SERVER_URL=http://localhost:5050/v1/audio/speech
ADMIN_TOKEN=your-secure-admin-token
# CORS Configuration
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
ADMIN_CORS_ORIGINS=https://admin.yourdomain.com
# Security Settings
SECRET_KEY=your-secret-key-here
MAX_CONTENT_LENGTH=52428800 # 50MB
SESSION_LIFETIME=3600 # 1 hour
RATE_LIMIT_STORAGE_URL=redis://localhost:6379/0
# Performance Tuning
WHISPER_MODEL_SIZE=base
GPU_MEMORY_THRESHOLD_MB=2048
MEMORY_CLEANUP_INTERVAL=30
```
### Advanced Configuration
#### CORS Settings
Quick setup:
```bash
# Development (allow all origins)
export CORS_ORIGINS="*"
@@ -93,88 +181,638 @@ export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com"
```
## Connection Retry & Offline Support
#### Rate Limiting
Talk2Me handles network interruptions gracefully with automatic retry logic:
- Automatic request queuing during connection loss
- Exponential backoff retry with configurable parameters
- Visual connection status indicators
- Priority-based request processing
Configure per-endpoint rate limits:
See [CONNECTION_RETRY.md](CONNECTION_RETRY.md) for detailed documentation.
```python
# In your config or via admin API
RATE_LIMITS = {
'default': {'requests_per_minute': 30, 'requests_per_hour': 500},
'transcribe': {'requests_per_minute': 10, 'requests_per_hour': 100},
'translate': {'requests_per_minute': 20, 'requests_per_hour': 300}
}
```
## Rate Limiting
#### Session Management
Comprehensive rate limiting protects against DoS attacks and resource exhaustion:
```python
SESSION_CONFIG = {
'max_file_size_mb': 100,
'max_files_per_session': 100,
'idle_timeout_minutes': 15,
'max_lifetime_minutes': 60
}
```
## Security Features
### 1. Rate Limiting
Comprehensive DoS protection with:
- Token bucket algorithm with sliding window
- Per-endpoint configurable limits
- Automatic IP blocking for abusive clients
- Global request limits and concurrent request throttling
- Request size validation
See [RATE_LIMITING.md](RATE_LIMITING.md) for detailed documentation.
```bash
# Check rate limit status
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/rate-limits
## Session Management
# Block an IP
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"ip": "192.168.1.100", "duration": 3600}' \
http://localhost:5005/admin/block-ip
```
Advanced session management prevents resource leaks from abandoned sessions:
- Automatic tracking of all session resources (audio files, temp files)
- Per-session resource limits (100 files, 100MB)
- Automatic cleanup of idle sessions (15 minutes) and expired sessions (1 hour)
- Real-time monitoring and metrics
- Manual cleanup capabilities for administrators
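The cleanup sweep described above can be sketched as a periodic task (a hypothetical structure; the real session store also tracks per-session files that must be deleted alongside the session record):

```python
import time

def sweep_sessions(sessions, idle_timeout=15 * 60, max_lifetime=60 * 60):
    """Drop sessions idle for 15+ minutes or older than 1 hour.

    `sessions` maps session id -> {"last_active": ts, "created": ts}.
    Returns the ids that were removed.
    """
    now = time.time()
    expired = [sid for sid, s in sessions.items()
               if now - s["last_active"] > idle_timeout
               or now - s["created"] > max_lifetime]
    for sid in expired:
        del sessions[sid]
    return expired
```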
### 2. Secrets Management
See [SESSION_MANAGEMENT.md](SESSION_MANAGEMENT.md) for detailed documentation.
- AES-128 encryption for sensitive data
- Automatic key rotation
- Audit logging
- Platform-specific secure storage
## Request Size Limits
```bash
# View audit log
python manage_secrets.py audit
Comprehensive request size limiting prevents memory exhaustion:
- Global limit: 50MB for any request
- Audio files: 25MB maximum
- JSON payloads: 1MB maximum
- File type detection and enforcement
- Dynamic configuration via admin API
# Backup secrets
python manage_secrets.py export --output backup.enc
See [REQUEST_SIZE_LIMITS.md](REQUEST_SIZE_LIMITS.md) for detailed documentation.
# Restore from backup
python manage_secrets.py import --input backup.enc
```
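The request size enforcement described above can be sketched as a Flask `before_request` hook (a simplified sketch, not the app's actual middleware; the real checks also inspect content type and per-endpoint configuration):

```python
from flask import Flask, abort, request

app = Flask(__name__)
MAX_JSON_BYTES = 1 * 1024 * 1024     # 1MB for JSON payloads
MAX_AUDIO_BYTES = 25 * 1024 * 1024   # 25MB for audio uploads

@app.before_request
def enforce_size_limits():
    # Reject oversized bodies up front, before any handler reads them
    limit = MAX_AUDIO_BYTES if request.path == "/transcribe" else MAX_JSON_BYTES
    if (request.content_length or 0) > limit:
        abort(413)  # Payload Too Large
```

Checking `Content-Length` in `before_request` means the body is never buffered for rejected requests, which is the point of the memory-exhaustion protection.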
## Error Logging
### 3. Session Management
Production-ready error logging system for debugging and monitoring:
- Structured JSON logs for easy parsing
- Multiple log streams (app, errors, access, security, performance)
- Automatic log rotation to prevent disk exhaustion
- Request tracing with unique IDs
- Performance metrics and slow request tracking
- Admin endpoints for log analysis
- Automatic resource tracking
- Per-session limits (100 files, 100MB)
- Idle session cleanup (15 minutes)
- Real-time monitoring
See [ERROR_LOGGING.md](ERROR_LOGGING.md) for detailed documentation.
```bash
# View active sessions
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/sessions
## Memory Management
# Clean up specific session
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
http://localhost:5005/admin/sessions/SESSION_ID/cleanup
```
Comprehensive memory leak prevention for extended use:
- GPU memory management with automatic cleanup
- Whisper model reloading to prevent fragmentation
- Frontend resource tracking (audio blobs, contexts, streams)
- Automatic cleanup of temporary files
- Memory monitoring and manual cleanup endpoints
### 4. Request Size Limits
See [MEMORY_MANAGEMENT.md](MEMORY_MANAGEMENT.md) for detailed documentation.
- Global limit: 50MB
- Audio files: 25MB
- JSON payloads: 1MB
- Dynamic configuration
```bash
# Update size limits
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"max_audio_size": "30MB"}' \
http://localhost:5005/admin/size-limits
```
## Production Deployment
For production use, deploy with a proper WSGI server:
- Gunicorn with optimized worker configuration
- Nginx reverse proxy with caching
- Docker/Docker Compose support
- Systemd service management
- Comprehensive security hardening
### Docker Deployment
Quick start:
```bash
# Build and run with Docker Compose (CPU only)
docker-compose up -d
# With NVIDIA GPU support
docker-compose -f docker-compose.yml -f docker-compose.nvidia.yml up -d
# With AMD GPU support (ROCm)
docker-compose -f docker-compose.yml -f docker-compose.amd.yml up -d
# With Apple Silicon support
docker-compose -f docker-compose.yml -f docker-compose.apple.yml up -d
# Scale web workers
docker-compose up -d --scale talk2me=4
# View logs
docker-compose logs -f talk2me
```
See [PRODUCTION_DEPLOYMENT.md](PRODUCTION_DEPLOYMENT.md) for detailed deployment instructions.
### Docker Compose Configuration
## Mobile Support
Choose the appropriate configuration based on your GPU:
The interface is fully responsive and designed to work well on mobile devices.
#### NVIDIA GPU Configuration
```yaml
version: '3.8'
services:
web:
build: .
ports:
- "5005:5005"
environment:
- GUNICORN_WORKERS=4
- GUNICORN_THREADS=2
volumes:
- ./logs:/app/logs
- whisper-cache:/root/.cache/whisper
deploy:
resources:
limits:
memory: 4G
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
```
#### AMD GPU Configuration (ROCm)
```yaml
version: '3.8'
services:
web:
build: .
ports:
- "5005:5005"
environment:
- GUNICORN_WORKERS=4
- GUNICORN_THREADS=2
- HSA_OVERRIDE_GFX_VERSION=10.3.0 # Adjust for your GPU
volumes:
- ./logs:/app/logs
- whisper-cache:/root/.cache/whisper
- /dev/kfd:/dev/kfd # ROCm KFD interface
- /dev/dri:/dev/dri # Direct Rendering Interface
devices:
- /dev/kfd
- /dev/dri
group_add:
- video
- render
deploy:
resources:
limits:
memory: 4G
```
#### Apple Silicon Configuration
```yaml
version: '3.8'
services:
web:
build: .
platform: linux/arm64/v8 # For M1/M2 Macs
ports:
- "5005:5005"
environment:
- GUNICORN_WORKERS=4
- GUNICORN_THREADS=2
- PYTORCH_ENABLE_MPS_FALLBACK=1 # Enable MPS fallback
volumes:
- ./logs:/app/logs
- whisper-cache:/root/.cache/whisper
deploy:
resources:
limits:
memory: 4G
```
#### CPU-Only Configuration
```yaml
version: '3.8'
services:
web:
build: .
ports:
- "5005:5005"
environment:
- GUNICORN_WORKERS=4
- GUNICORN_THREADS=2
- OMP_NUM_THREADS=4 # OpenMP threads for CPU
volumes:
- ./logs:/app/logs
- whisper-cache:/root/.cache/whisper
deploy:
resources:
limits:
memory: 4G
cpus: '4.0'
```
### Nginx Configuration
```nginx
upstream talk2me {
least_conn;
server web1:5005 weight=1 max_fails=3 fail_timeout=30s;
server web2:5005 weight=1 max_fails=3 fail_timeout=30s;
}
server {
listen 443 ssl http2;
server_name talk2me.yourdomain.com;
ssl_certificate /etc/ssl/certs/talk2me.crt;
ssl_certificate_key /etc/ssl/private/talk2me.key;
client_max_body_size 50M;
location / {
proxy_pass http://talk2me;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $host;
# WebSocket support
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
# Cache static assets
location /static/ {
alias /app/static/;
expires 30d;
add_header Cache-Control "public, immutable";
}
}
```
### Systemd Service
```ini
[Unit]
Description=Talk2Me Translation Service
After=network.target
[Service]
Type=notify
User=talk2me
Group=talk2me
WorkingDirectory=/opt/talk2me
Environment="PATH=/opt/talk2me/venv/bin"
ExecStart=/opt/talk2me/venv/bin/gunicorn \
--config gunicorn_config.py \
--bind 0.0.0.0:5005 \
app:app
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
## API Documentation
### Core Endpoints
#### Transcribe Audio
```http
POST /transcribe
Content-Type: multipart/form-data
audio: (binary)
source_lang: auto|language_code
```
#### Translate Text
```http
POST /translate
Content-Type: application/json
{
"text": "Hello world",
"source_lang": "English",
"target_lang": "Spanish"
}
```
#### Streaming Translation
```http
POST /translate/stream
Content-Type: application/json
{
"text": "Long text to translate",
"source_lang": "auto",
"target_lang": "French"
}
Response: Server-Sent Events stream
```
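A minimal Python consumer for this SSE stream might look like the following (the endpoint shape matches the documentation above; `requests` is assumed to be available):

```python
def parse_sse_data(lines):
    """Extract payloads from an iterable of SSE lines ("data: ..." frames)."""
    return [line[len("data: "):] for line in lines
            if line and line.startswith("data: ")]

def stream_translation(text, source_lang, target_lang,
                       base_url="http://localhost:5005"):
    """Yield translated chunks from /translate/stream as they arrive."""
    import requests  # imported here so the parser above stays dependency-free
    resp = requests.post(
        f"{base_url}/translate/stream",
        json={"text": text, "source_lang": source_lang, "target_lang": target_lang},
        stream=True,
    )
    resp.raise_for_status()
    yield from parse_sse_data(resp.iter_lines(decode_unicode=True))
```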
#### Text-to-Speech
```http
POST /speak
Content-Type: application/json
{
"text": "Hola mundo",
"language": "Spanish"
}
```
### Admin Endpoints
All admin endpoints require `X-Admin-Token` header.
#### Health & Monitoring
- `GET /health` - Basic health check
- `GET /health/detailed` - Component status
- `GET /metrics` - Prometheus metrics
- `GET /admin/memory` - Memory usage stats
#### Session Management
- `GET /admin/sessions` - List active sessions
- `GET /admin/sessions/:id` - Session details
- `POST /admin/sessions/:id/cleanup` - Manual cleanup
#### Security Controls
- `GET /admin/rate-limits` - View rate limits
- `POST /admin/block-ip` - Block IP address
- `GET /admin/logs/security` - Security events
## Development
### TypeScript Development
```bash
# Install dependencies
npm install
# Development mode with auto-compilation
npm run dev
# Build for production
npm run build
# Type checking
npm run typecheck
```
### Project Structure
```
talk2me/
├── app.py # Main Flask application
├── config.py # Configuration management
├── requirements.txt # Python dependencies
├── package.json # Node.js dependencies
├── tsconfig.json # TypeScript configuration
├── gunicorn_config.py # Production server config
├── docker-compose.yml # Container orchestration
├── static/
│ ├── js/
│ │ ├── src/ # TypeScript source files
│ │ └── dist/ # Compiled JavaScript
│ ├── css/ # Stylesheets
│ └── icons/ # PWA icons
├── templates/ # HTML templates
├── logs/ # Application logs
└── tests/ # Test suite
```
### Key Components
1. **Connection Management** (`connectionManager.ts`)
- Automatic retry with exponential backoff
- Request queuing during offline periods
- Connection status monitoring
2. **Translation Cache** (`translationCache.ts`)
- IndexedDB for offline support
- LRU eviction policy
- Automatic cache size management
3. **Speaker Management** (`speakerManager.ts`)
- Multi-speaker conversation tracking
- Speaker-specific audio handling
- Conversation export functionality
4. **Error Handling** (`errorBoundary.ts`)
- Global error catching
- Automatic error reporting
- User-friendly error messages
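The LRU eviction policy used by the translation cache can be sketched as follows (in Python for brevity; the actual implementation is TypeScript over IndexedDB):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least-recently-used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
```

Because reads refresh an entry's position, frequently replayed translations survive eviction while one-off lookups age out first.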
### Running Tests
```bash
# Python tests
pytest tests/ -v
# TypeScript tests
npm test
# Integration tests
python test_integration.py
```
## Monitoring & Operations
### Logging System
Talk2Me uses structured JSON logging with multiple streams:
```
logs/
├── talk2me.log # General application log
├── errors.log # Error-specific log
├── access.log # HTTP access log
├── security.log # Security events
└── performance.log # Performance metrics
```
View logs:
```bash
# Recent errors
tail -f logs/errors.log | jq '.'
# Security events
grep "rate_limit_exceeded" logs/security.log | jq '.'
# Slow requests
jq 'select(.extra_fields.duration_ms > 1000)' logs/performance.log
```
### Memory Management
Talk2Me includes comprehensive memory leak prevention:
1. **Backend Memory Management**
- GPU memory monitoring
- Automatic model reloading
- Temporary file cleanup
2. **Frontend Memory Management**
- Audio blob cleanup
- WebRTC resource management
- Event listener cleanup
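The backend GPU cleanup can be sketched like this (a hedged sketch of the general pattern, not the app's exact cleanup routine; it is safe on CPU-only hosts since torch is optional here):

```python
import gc

def cleanup_gpu_memory():
    """Release cached GPU memory after heavy transcription work."""
    gc.collect()  # drop Python-level references first
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()   # return cached blocks to the driver
            torch.cuda.synchronize()   # wait for pending kernels to finish
    except ImportError:
        pass  # torch not installed: nothing GPU-side to release
```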
Monitor memory:
```bash
# Check memory stats
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/memory
# Trigger manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
http://localhost:5005/admin/memory/cleanup
```
### Performance Tuning
#### GPU Optimization
```python
# config.py or environment
GPU_OPTIMIZATIONS = {
'enabled': True,
'fp16': True, # Half precision for 2x speedup
'batch_size': 1, # Adjust based on GPU memory
'num_workers': 2, # Parallel data loading
'pin_memory': True # Faster GPU transfer
}
```
#### Whisper Optimization
```python
TRANSCRIBE_OPTIONS = {
'beam_size': 1, # Faster inference
'best_of': 1, # Disable multiple attempts
'temperature': 0, # Deterministic output
'compression_ratio_threshold': 2.4,
'logprob_threshold': -1.0,
'no_speech_threshold': 0.6
}
```
### Scaling Considerations
1. **Horizontal Scaling**
- Use Redis for shared rate limiting
- Configure sticky sessions for WebSocket
- Share audio files via object storage
2. **Vertical Scaling**
- Increase worker processes
- Tune thread pool size
- Allocate more GPU memory
3. **Caching Strategy**
- Cache translations in Redis
- Use CDN for static assets
- Enable HTTP caching headers
## Troubleshooting
### Common Issues
#### GPU Not Detected
```bash
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"
# Check GPU memory
nvidia-smi
# For AMD GPUs
rocm-smi
# For Apple Silicon
python -c "import torch; print(torch.backends.mps.is_available())"
```
#### High Memory Usage
```bash
# Check for memory leaks
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/health/storage
# Manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
http://localhost:5005/admin/cleanup
```
#### CORS Issues
```bash
# Test CORS configuration
curl -X OPTIONS http://localhost:5005/api/transcribe \
-H "Origin: https://yourdomain.com" \
-H "Access-Control-Request-Method: POST"
```
#### TTS Server Connection
```bash
# Check TTS server status
curl http://localhost:5005/check_tts_server
# Update TTS configuration
curl -X POST http://localhost:5005/update_tts_config \
-H "Content-Type: application/json" \
-d '{"server_url": "http://localhost:5050/v1/audio/speech", "api_key": "new-key"}'
```
### Debug Mode
Enable debug logging:
```bash
export FLASK_ENV=development
export LOG_LEVEL=DEBUG
python app.py
```
### Performance Profiling
```bash
# Enable performance logging
export ENABLE_PROFILING=true
# View slow requests
jq 'select(.duration_ms > 1000)' logs/performance.log
```
## Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.
### Development Setup
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`pytest && npm test`)
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Code Style
- Python: Follow PEP 8
- TypeScript: Use ESLint configuration
- Commit messages: Use conventional commits
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- OpenAI Whisper team for the amazing speech recognition model
- Ollama team for making LLMs accessible
- All contributors who have helped improve Talk2Me
## Support
- **Documentation**: Full docs at [docs.talk2me.app](https://docs.talk2me.app)
- **Issues**: [GitHub Issues](https://github.com/yourusername/talk2me/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/talk2me/discussions)
- **Security**: Please report security vulnerabilities to security@talk2me.app



@@ -1,54 +0,0 @@
# TypeScript Setup for Talk2Me
This project now includes TypeScript support for better type safety and developer experience.
## Installation
1. Install Node.js dependencies:
```bash
npm install
```
2. Build TypeScript files:
```bash
npm run build
```
## Development
For development with automatic recompilation:
```bash
npm run watch
# or
npm run dev
```
## Project Structure
- `/static/js/src/` - TypeScript source files
- `app.ts` - Main application logic
- `types.ts` - Type definitions
- `/static/js/dist/` - Compiled JavaScript files (git-ignored)
- `tsconfig.json` - TypeScript configuration
- `package.json` - Node.js dependencies and scripts
## Available Scripts
- `npm run build` - Compile TypeScript to JavaScript
- `npm run watch` - Watch for changes and recompile
- `npm run dev` - Same as watch
- `npm run clean` - Remove compiled files
- `npm run type-check` - Type-check without compiling
## Type Safety Benefits
The TypeScript implementation provides:
- Compile-time type checking
- Better IDE support with autocomplete
- Explicit interface definitions for API responses
- Safer refactoring
- Self-documenting code
## Next Steps
After building, the compiled JavaScript will be in `/static/js/dist/app.js` and will be automatically loaded by the HTML template.


@@ -1,332 +0,0 @@
# Request Size Limits Documentation
This document describes the request size limiting system implemented in Talk2Me to prevent memory exhaustion from large uploads.
## Overview
Talk2Me implements comprehensive request size limiting to protect against:
- Memory exhaustion from large file uploads
- Denial of Service (DoS) attacks using oversized requests
- Buffer overflow attempts
- Resource starvation from unbounded requests
## Default Limits
### Global Limits
- **Maximum Content Length**: 50MB - Absolute maximum for any request
- **Maximum Audio File Size**: 25MB - For audio uploads (transcription)
- **Maximum JSON Payload**: 1MB - For API requests
- **Maximum Image Size**: 10MB - For future image processing features
- **Maximum Chunk Size**: 1MB - For streaming uploads
## Features
### 1. Multi-Layer Protection
The system implements multiple layers of size checking:
- Flask's built-in `MAX_CONTENT_LENGTH` configuration
- Pre-request validation before data is loaded into memory
- File-type specific limits
- Endpoint-specific limits
- Streaming request monitoring
### 2. File Type Detection
Automatic detection and enforcement based on file extensions:
- Audio files: `.wav`, `.mp3`, `.ogg`, `.webm`, `.m4a`, `.flac`, `.aac`
- Image files: `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`
- JSON payloads: Content-Type header detection
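Extension-based detection of this kind could be sketched as follows (illustrative only; the actual middleware may classify requests differently):

```python
import os

# Extension groups mirroring the lists above
AUDIO_EXTENSIONS = {'.wav', '.mp3', '.ogg', '.webm', '.m4a', '.flac', '.aac'}
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.webp', '.bmp'}

def classify_upload(filename: str, content_type: str = '') -> str:
    """Pick the size-limit category for an incoming request."""
    ext = os.path.splitext(filename or '')[1].lower()
    if ext in AUDIO_EXTENSIONS:
        return 'audio'
    if ext in IMAGE_EXTENSIONS:
        return 'image'
    if content_type.startswith('application/json'):
        return 'json'
    return 'default'
```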
### 3. Graceful Error Handling
When limits are exceeded:
- Returns 413 (Request Entity Too Large) status code
- Provides clear error messages with size information
- Includes both actual and allowed sizes
- Human-readable size formatting
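The behavior above, a 413 body carrying both the actual and allowed sizes plus human-readable formatting, can be sketched with two small helpers (hypothetical names; the real middleware may differ):

```python
def format_size(num_bytes: float) -> str:
    """Render a byte count in human-readable form, e.g. 52428800 -> '50.0MB'."""
    for unit in ('B', 'KB', 'MB', 'GB'):
        if num_bytes < 1024 or unit == 'GB':
            return f"{int(num_bytes)}B" if unit == 'B' else f"{num_bytes:.1f}{unit}"
        num_bytes /= 1024

def size_limit_error(actual: int, allowed: int, label: str = 'Request'):
    """Build the 413 response body: clear message plus actual and allowed sizes."""
    return {
        'error': f'{label} too large',
        'max_size': allowed,
        'your_size': actual,
        'max_size_mb': round(allowed / 1024 / 1024, 1),
    }, 413
```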
## Configuration
### Environment Variables
```bash
# Set limits via environment variables (in bytes)
export MAX_CONTENT_LENGTH=52428800 # 50MB
export MAX_AUDIO_SIZE=26214400 # 25MB
export MAX_JSON_SIZE=1048576 # 1MB
export MAX_IMAGE_SIZE=10485760 # 10MB
```
### Flask Configuration
```python
# In config.py or app.py
app.config.update({
    'MAX_CONTENT_LENGTH': 50 * 1024 * 1024,  # 50MB
    'MAX_AUDIO_SIZE': 25 * 1024 * 1024,      # 25MB
    'MAX_JSON_SIZE': 1 * 1024 * 1024,        # 1MB
    'MAX_IMAGE_SIZE': 10 * 1024 * 1024       # 10MB
})
```
### Dynamic Configuration
Size limits can be updated at runtime via admin API.
## API Endpoints
### GET /admin/size-limits
Get current size limits.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/size-limits
```
Response:
```json
{
  "limits": {
    "max_content_length": 52428800,
    "max_audio_size": 26214400,
    "max_json_size": 1048576,
    "max_image_size": 10485760
  },
  "limits_human": {
    "max_content_length": "50.0MB",
    "max_audio_size": "25.0MB",
    "max_json_size": "1.0MB",
    "max_image_size": "10.0MB"
  }
}
```
### POST /admin/size-limits
Update size limits dynamically.
```bash
curl -X POST -H "X-Admin-Token: your-token" \
-H "Content-Type: application/json" \
-d '{"max_audio_size": "30MB", "max_json_size": 2097152}' \
http://localhost:5005/admin/size-limits
```
Response:
```json
{
  "success": true,
  "old_limits": {...},
  "new_limits": {...},
  "new_limits_human": {
    "max_audio_size": "30.0MB",
    "max_json_size": "2.0MB"
  }
}
```
## Usage Examples
### 1. Endpoint-Specific Limits
```python
@app.route('/upload')
@limit_request_size(max_size=10*1024*1024)  # 10MB limit
def upload():
    # Handle upload
    pass

@app.route('/upload-audio')
@limit_request_size(max_audio_size=30*1024*1024)  # 30MB for audio
def upload_audio():
    # Handle audio upload
    pass
```
### 2. Client-Side Validation
```javascript
// Check file size before upload
const MAX_AUDIO_SIZE = 25 * 1024 * 1024; // 25MB
function validateAudioFile(file) {
    if (file.size > MAX_AUDIO_SIZE) {
        alert(`Audio file too large. Maximum size is ${MAX_AUDIO_SIZE / 1024 / 1024}MB`);
        return false;
    }
    return true;
}
```
### 3. Chunked Uploads (Future Enhancement)
```javascript
// For files larger than limits, use chunked upload
async function uploadLargeFile(file, chunkSize = 1024 * 1024) {
    const chunks = Math.ceil(file.size / chunkSize);
    for (let i = 0; i < chunks; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, file.size);
        const chunk = file.slice(start, end);
        await uploadChunk(chunk, i, chunks);
    }
}
```
## Error Responses
### 413 Request Entity Too Large
When a request exceeds size limits:
```json
{
  "error": "Request too large",
  "max_size": 52428800,
  "your_size": 75000000,
  "max_size_mb": 50.0
}
```
### File-Specific Errors
For audio files:
```json
{
  "error": "Audio file too large",
  "max_size": 26214400,
  "your_size": 35000000,
  "max_size_mb": 25.0
}
```
For JSON payloads:
```json
{
  "error": "JSON payload too large",
  "max_size": 1048576,
  "your_size": 2000000,
  "max_size_kb": 1024.0
}
```
## Best Practices
### 1. Client-Side Validation
Always validate file sizes on the client side:
```javascript
// Add to static/js/app.js
const SIZE_LIMITS = {
    audio: 25 * 1024 * 1024,  // 25MB
    json: 1 * 1024 * 1024,    // 1MB
};

function checkFileSize(file, type) {
    const limit = SIZE_LIMITS[type];
    if (file.size > limit) {
        showError(`File too large. Maximum size: ${formatSize(limit)}`);
        return false;
    }
    return true;
}
```
### 2. Progressive Enhancement
For better UX with large files:
- Show upload progress
- Implement resumable uploads
- Compress audio client-side when possible
- Use appropriate audio formats (WebM/Opus for smaller sizes)
### 3. Server Configuration
Configure your web server (Nginx/Apache) to also enforce limits:
**Nginx:**
```nginx
client_max_body_size 50M;
client_body_buffer_size 1M;
```
**Apache:**
```apache
LimitRequestBody 52428800
```
### 4. Monitoring
Monitor size limit violations:
- Track 413 errors in logs
- Alert on repeated violations from same IP
- Adjust limits based on usage patterns
## Security Considerations
1. **Memory Protection**: Pre-flight size checks prevent loading large files into memory
2. **DoS Prevention**: Limits prevent attackers from exhausting server resources
3. **Bandwidth Protection**: Prevents bandwidth exhaustion from large uploads
4. **Storage Protection**: Works with session management to limit total storage per user
## Integration with Other Systems
### Rate Limiting
Size limits work in conjunction with rate limiting:
- Large requests count more against rate limits
- Repeated size violations can trigger IP blocking
### Session Management
Size limits are enforced per session:
- Total storage per session is limited
- Large files count against session resource limits
### Monitoring
Size limit violations are tracked in:
- Application logs
- Health check endpoints
- Admin monitoring dashboards
## Troubleshooting
### Common Issues
#### 1. Legitimate Large Files Rejected
If users need to upload larger files:
```bash
# Increase limit for audio files to 50MB
curl -X POST -H "X-Admin-Token: token" \
-d '{"max_audio_size": "50MB"}' \
http://localhost:5005/admin/size-limits
```
#### 2. Chunked Transfer Encoding
For requests without Content-Length header:
- The system monitors the stream
- Terminates connection if size exceeded
- May require special handling for some clients
#### 3. Load Balancer Limits
Ensure your load balancer also enforces appropriate limits:
- AWS ALB: Configure request size limits
- Cloudflare: Set upload size limits
- Nginx: Configure client_max_body_size
## Performance Impact
The size limiting system has minimal performance impact:
- Pre-flight checks are O(1) operations
- No buffering of large requests
- Early termination of oversized requests
- Efficient memory usage
## Future Enhancements
1. **Chunked Upload Support**: Native support for resumable uploads
2. **Compression Detection**: Automatic handling of compressed uploads
3. **Dynamic Limits**: Per-user or per-tier size limits
4. **Bandwidth Throttling**: Rate limit large uploads
5. **Storage Quotas**: Long-term storage limits per user


@@ -1,411 +0,0 @@
# Secrets Management Documentation
This document describes the secure secrets management system implemented in Talk2Me.
## Overview
Talk2Me uses a comprehensive secrets management system that provides:
- Encrypted storage of sensitive configuration
- Secret rotation capabilities
- Audit logging
- Integrity verification
- CLI management tools
- Environment variable integration
## Architecture
### Components
1. **SecretsManager** (`secrets_manager.py`)
   - Handles encryption/decryption using Fernet (AES-128)
   - Manages secret lifecycle (create, read, update, delete)
   - Provides audit logging
   - Supports secret rotation
2. **Configuration System** (`config.py`)
   - Integrates secrets with Flask configuration
   - Environment-specific configurations
   - Validation and sanitization
3. **CLI Tool** (`manage_secrets.py`)
   - Command-line interface for secret management
   - Interactive and scriptable
### Security Features
- **Encryption**: AES-128 encryption using cryptography.fernet
- **Key Derivation**: PBKDF2 with SHA256 (100,000 iterations)
- **Master Key**: Stored separately with restricted permissions
- **Audit Trail**: All access and modifications logged
- **Integrity Checks**: Verify secrets haven't been tampered with
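As an illustration of the key-derivation step, PBKDF2-HMAC-SHA256 at 100,000 iterations can be reproduced with the standard library alone; Fernet expects the resulting 32-byte key urlsafe-base64-encoded. (The project uses the `cryptography` package for this; the stdlib sketch below shows only the derivation, with hypothetical inputs.)

```python
import base64
import hashlib

def derive_fernet_key(master_password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key via PBKDF2-HMAC-SHA256 (100,000 iterations),
    urlsafe-base64-encoded as Fernet requires."""
    raw = hashlib.pbkdf2_hmac('sha256', master_password, salt, 100_000, dklen=32)
    return base64.urlsafe_b64encode(raw)
```

The same master password with the same salt always yields the same key, which is what lets the master key file decrypt the stored secrets across restarts.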
## Quick Start
### 1. Initialize Secrets
```bash
python manage_secrets.py init
```
This will:
- Generate a master encryption key
- Create initial secrets (Flask secret key, admin token)
- Prompt for required secrets (TTS API key)
### 2. Set a Secret
```bash
# Interactive (hidden input)
python manage_secrets.py set TTS_API_KEY
# Direct (be careful with shell history)
python manage_secrets.py set TTS_API_KEY --value "your-api-key"
# With metadata
python manage_secrets.py set API_KEY --value "key" --metadata '{"service": "external-api"}'
```
### 3. List Secrets
```bash
python manage_secrets.py list
```
Output:
```
Key                  Created      Last Rotated   Has Value
----------------------------------------------------------
FLASK_SECRET_KEY     2024-01-15   2024-01-20     ✓
TTS_API_KEY          2024-01-15   Never          ✓
ADMIN_TOKEN          2024-01-15   2024-01-18     ✓
```
### 4. Rotate Secrets
```bash
# Rotate a specific secret
python manage_secrets.py rotate ADMIN_TOKEN
# Check which secrets need rotation
python manage_secrets.py check-rotation
# Schedule automatic rotation
python manage_secrets.py schedule-rotation API_KEY 30 # Every 30 days
```
## Configuration
### Environment Variables
The secrets manager checks these locations in order:
1. Encrypted secrets storage (`.secrets.json`)
2. `SECRET_<KEY>` environment variable
3. `<KEY>` environment variable
4. Default value
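That lookup order might be implemented along these lines (a sketch only, with a plain dict standing in for the encrypted store):

```python
import os

def resolve_secret(key, encrypted_store, default=None):
    """Resolve a secret using the documented precedence:
    encrypted store, then SECRET_<KEY>, then <KEY>, then the default."""
    if key in encrypted_store:
        return encrypted_store[key]
    for env_name in (f'SECRET_{key}', key):
        value = os.environ.get(env_name)
        if value is not None:
            return value
    return default
```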
### Master Key
The master encryption key is loaded from:
1. `MASTER_KEY` environment variable
2. `.master_key` file (default)
3. Auto-generated if neither exists
**Important**: Protect the master key!
- Set file permissions: `chmod 600 .master_key`
- Back it up securely
- Never commit to version control
### Flask Integration
Secrets are automatically loaded into Flask configuration:
```python
# In app.py
from config import init_app as init_config
from secrets_manager import init_app as init_secrets
app = Flask(__name__)
init_config(app)
init_secrets(app)
# Access secrets
api_key = app.config['TTS_API_KEY']
```
## CLI Commands
### Basic Operations
```bash
# List all secrets
python manage_secrets.py list
# Get a secret value (requires confirmation)
python manage_secrets.py get TTS_API_KEY
# Set a secret
python manage_secrets.py set DATABASE_URL
# Delete a secret
python manage_secrets.py delete OLD_API_KEY
# Rotate a secret
python manage_secrets.py rotate ADMIN_TOKEN
```
### Advanced Operations
```bash
# Verify integrity of all secrets
python manage_secrets.py verify
# Migrate from environment variables
python manage_secrets.py migrate
# View audit log
python manage_secrets.py audit
python manage_secrets.py audit TTS_API_KEY --limit 50
# Schedule rotation
python manage_secrets.py schedule-rotation API_KEY 90
```
## Security Best Practices
### 1. File Permissions
```bash
# Secure the secrets files
chmod 600 .secrets.json
chmod 600 .master_key
```
### 2. Backup Strategy
- Back up `.master_key` separately from `.secrets.json`
- Store backups in different secure locations
- Test restore procedures regularly
### 3. Rotation Policy
Recommended rotation intervals:
- API Keys: 90 days
- Admin Tokens: 30 days
- Database Passwords: 180 days
- Encryption Keys: 365 days
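A due-for-rotation check against these intervals could look like this (illustrative; the CLI's `check-rotation` command may be implemented differently):

```python
from datetime import datetime, timedelta

# Recommended rotation intervals from the policy above, in days
ROTATION_INTERVALS = {
    'api_key': 90,
    'admin_token': 30,
    'database_password': 180,
    'encryption_key': 365,
}

def secrets_due_for_rotation(secrets, now=None):
    """Return names of secrets whose last rotation is older than policy allows.

    `secrets` maps name -> (kind, last_rotated: datetime).
    """
    now = now or datetime.utcnow()
    due = []
    for name, (kind, last_rotated) in secrets.items():
        max_age = timedelta(days=ROTATION_INTERVALS.get(kind, 90))
        if now - last_rotated > max_age:
            due.append(name)
    return due
```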
### 4. Access Control
- Use environment-specific secrets
- Implement least privilege access
- Audit secret access regularly
### 5. Git Security
Ensure these files are in `.gitignore`:
```
.secrets.json
.master_key
secrets.db
*.key
```
## Deployment
### Development
```bash
# Use .env file for convenience
cp .env.example .env
# Edit .env with development values
# Initialize secrets
python manage_secrets.py init
```
### Production
```bash
# Set master key via environment
export MASTER_KEY="your-production-master-key"
# Or use a key management service
export MASTER_KEY_FILE="/secure/path/to/master.key"
# Load secrets from secure storage
python manage_secrets.py set TTS_API_KEY --value "$TTS_API_KEY"
python manage_secrets.py set ADMIN_TOKEN --value "$ADMIN_TOKEN"
```
### Docker
```dockerfile
# Dockerfile
FROM python:3.9
# Copy encrypted secrets (not the master key!)
COPY .secrets.json /app/.secrets.json
# Master key provided at runtime
ENV MASTER_KEY=""
# Run with:
# docker run -e MASTER_KEY="$MASTER_KEY" myapp
```
### Kubernetes
```yaml
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: talk2me-master-key
type: Opaque
stringData:
  master-key: "your-master-key"
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: talk2me
        env:
        - name: MASTER_KEY
          valueFrom:
            secretKeyRef:
              name: talk2me-master-key
              key: master-key
```
## Troubleshooting
### Lost Master Key
If you lose the master key:
1. You'll need to recreate all secrets
2. Generate new master key: `python manage_secrets.py init`
3. Re-enter all secret values
### Corrupted Secrets File
```bash
# Check integrity
python manage_secrets.py verify
# If corrupted, restore from backup or reinitialize
```
### Permission Errors
```bash
# Fix file permissions
chmod 600 .secrets.json .master_key
chown $USER:$USER .secrets.json .master_key
```
## Monitoring
### Audit Logs
Review secret access patterns:
```bash
# View all audit entries
python manage_secrets.py audit
# Check specific secret
python manage_secrets.py audit TTS_API_KEY
# Export for analysis
python manage_secrets.py audit > audit.log
```
### Rotation Monitoring
```bash
# Check rotation status
python manage_secrets.py check-rotation
# Set up cron job for automatic checks
0 0 * * * /path/to/python /path/to/manage_secrets.py check-rotation
```
## Migration Guide
### From Environment Variables
```bash
# Automatic migration
python manage_secrets.py migrate
# Manual migration
export OLD_API_KEY="your-key"
python manage_secrets.py set API_KEY --value "$OLD_API_KEY"
unset OLD_API_KEY
```
### From .env Files
```python
# migrate_env.py
from dotenv import dotenv_values
from secrets_manager import get_secrets_manager
env_values = dotenv_values('.env')
manager = get_secrets_manager()
for key, value in env_values.items():
    if key.endswith('_KEY') or key.endswith('_TOKEN'):
        manager.set(key, value, {'migrated_from': '.env'})
```
## API Reference
### Python API
```python
from secrets_manager import get_secret, set_secret
# Get a secret
api_key = get_secret('TTS_API_KEY', default='')
# Set a secret
set_secret('NEW_API_KEY', 'value', metadata={'service': 'external'})
# Advanced usage
from secrets_manager import get_secrets_manager
manager = get_secrets_manager()
manager.rotate('API_KEY')
manager.schedule_rotation('TOKEN', days=30)
```
### Flask CLI
```bash
# Via Flask CLI
flask secrets-list
flask secrets-set
flask secrets-rotate
flask secrets-check-rotation
```
## Security Considerations
1. **Never log secret values**
2. **Use secure random generation for new secrets**
3. **Implement proper access controls**
4. **Regular security audits**
5. **Incident response plan for compromised secrets**
## Future Enhancements
- Integration with cloud KMS (AWS, Azure, GCP)
- Hardware security module (HSM) support
- Secret sharing (Shamir's Secret Sharing)
- Time-based access controls
- Automated compliance reporting


@@ -1,173 +0,0 @@
# Security Configuration Guide
This document outlines security best practices for deploying Talk2Me.
## Secrets Management
Talk2Me includes a comprehensive secrets management system with encryption, rotation, and audit logging.
### Quick Start
```bash
# Initialize secrets management
python manage_secrets.py init
# Set a secret
python manage_secrets.py set TTS_API_KEY
# List secrets
python manage_secrets.py list
# Rotate secrets
python manage_secrets.py rotate ADMIN_TOKEN
```
See [SECRETS_MANAGEMENT.md](SECRETS_MANAGEMENT.md) for detailed documentation.
## Environment Variables
**NEVER commit sensitive information like API keys, passwords, or secrets to version control.**
### Required Security Configuration
1. **TTS_API_KEY**
   - Required for TTS server authentication
   - Set via environment variable: `export TTS_API_KEY="your-api-key"`
   - Or use a `.env` file (see `.env.example`)
2. **SECRET_KEY**
   - Required for Flask session security
   - Generate a secure key: `python -c "import secrets; print(secrets.token_hex(32))"`
   - Set via: `export SECRET_KEY="your-generated-key"`
3. **ADMIN_TOKEN**
   - Required for admin endpoints
   - Generate a secure token: `python -c "import secrets; print(secrets.token_urlsafe(32))"`
   - Set via: `export ADMIN_TOKEN="your-admin-token"`
### Using a .env File (Recommended)
1. Copy the example file:
```bash
cp .env.example .env
```
2. Edit `.env` with your actual values:
```bash
nano .env # or your preferred editor
```
3. Load environment variables:
```bash
# Using python-dotenv (add to requirements.txt)
pip install python-dotenv
# Or source manually
source .env
```
### Python-dotenv Integration
To automatically load `.env` files, add this to the top of `app.py`:
```python
from dotenv import load_dotenv
load_dotenv() # Load .env file if it exists
```
### Production Deployment
For production deployments:
1. **Use a secrets management service**:
   - AWS Secrets Manager
   - HashiCorp Vault
   - Azure Key Vault
   - Google Secret Manager
2. **Set environment variables securely**:
   - Use your platform's environment configuration
   - Never expose secrets in logs or error messages
   - Rotate keys regularly
3. **Additional security measures**:
   - Use HTTPS only
   - Enable CORS restrictions
   - Implement rate limiting
   - Monitor for suspicious activity
### Docker Deployment
When using Docker:
```dockerfile
# Use build arguments for non-sensitive config
ARG TTS_SERVER_URL=http://localhost:5050/v1/audio/speech
# Use runtime environment for secrets
ENV TTS_API_KEY=""
```
Run with:
```bash
docker run -e TTS_API_KEY="your-key" -e SECRET_KEY="your-secret" talk2me
```
### Kubernetes Deployment
Use Kubernetes secrets:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: talk2me-secrets
type: Opaque
stringData:
  tts-api-key: "your-api-key"
  flask-secret-key: "your-secret-key"
  admin-token: "your-admin-token"
```
### Rate Limiting
Talk2Me implements comprehensive rate limiting to prevent abuse:
1. **Per-Endpoint Limits**:
   - Transcription: 10/min, 100/hour
   - Translation: 20/min, 300/hour
   - TTS: 15/min, 200/hour
2. **Global Limits**:
   - 1,000 requests/minute total
   - 50 concurrent requests maximum
3. **Automatic Protection**:
   - IP blocking for excessive requests
   - Request size validation
   - Burst control
See [RATE_LIMITING.md](RATE_LIMITING.md) for configuration details.
### Security Checklist
- [ ] All API keys removed from source code
- [ ] Environment variables configured
- [ ] `.env` file added to `.gitignore`
- [ ] Secrets rotated after any potential exposure
- [ ] HTTPS enabled in production
- [ ] CORS properly configured
- [ ] Rate limiting enabled and configured
- [ ] Admin endpoints protected with authentication
- [ ] Error messages don't expose sensitive info
- [ ] Logs sanitized of sensitive data
- [ ] Request size limits enforced
- [ ] IP blocking configured for abuse prevention
### Reporting Security Issues
If you discover a security vulnerability, please report it to:
- Create a private security advisory on GitHub
- Or email: security@yourdomain.com
Do not create public issues for security vulnerabilities.


@@ -1,366 +0,0 @@
# Session Management Documentation
This document describes the session management system implemented in Talk2Me to prevent resource leaks from abandoned sessions.
## Overview
Talk2Me implements a comprehensive session management system that tracks user sessions and associated resources (audio files, temporary files, streams) to ensure proper cleanup and prevent resource exhaustion.
## Features
### 1. Automatic Resource Tracking
All resources created during a user session are automatically tracked:
- Audio files (uploads and generated)
- Temporary files
- Active streams
- Resource metadata (size, creation time, purpose)
### 2. Resource Limits
Per-session limits prevent resource exhaustion:
- Maximum resources per session: 100
- Maximum storage per session: 100MB
- Automatic cleanup of oldest resources when limits are reached
### 3. Session Lifecycle Management
Sessions are automatically managed:
- Created on first request
- Updated on each request
- Cleaned up when idle (15 minutes)
- Removed when expired (1 hour)
### 4. Automatic Cleanup
Background cleanup processes run automatically:
- Idle session cleanup (every minute)
- Expired session cleanup (every minute)
- Orphaned file cleanup (every minute)
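The decision logic behind the idle and expiry sweeps can be sketched separately from the background thread that runs it every `SESSION_CLEANUP_INTERVAL` seconds (class and method names here are hypothetical):

```python
import time

class SessionSweeper:
    """Pure sweep logic behind the background cleanup loop: decide which
    sessions are expired or idle at a given instant."""

    def __init__(self, max_duration=3600, max_idle=900):
        self.max_duration = max_duration  # 1 hour default
        self.max_idle = max_idle          # 15 minutes default

    def to_clean(self, sessions, now=None):
        """`sessions` maps id -> (created_at, last_activity) as epoch seconds.
        Returns (session_id, reason) pairs for sessions needing cleanup."""
        now = now if now is not None else time.time()
        doomed = []
        for sid, (created, last_active) in sessions.items():
            if now - created > self.max_duration:
                doomed.append((sid, 'expired'))
            elif now - last_active > self.max_idle:
                doomed.append((sid, 'idle'))
        return doomed
```

In the application this decision step would run inside a daemon thread, with the actual resource deletion (files, streams) performed for each returned session id.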
## Configuration
Session management can be configured via environment variables or Flask config:
```python
# app.py or config.py
app.config.update({
    'MAX_SESSION_DURATION': 3600,        # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,        # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600,  # 100MB
    'SESSION_CLEANUP_INTERVAL': 60,      # 1 minute
    'SESSION_STORAGE_PATH': '/path/to/sessions'
})
```
## API Endpoints
### Admin Endpoints
All admin endpoints require authentication via `X-Admin-Token` header.
#### GET /admin/sessions
Get information about all active sessions.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions
```
Response:
```json
{
  "sessions": [
    {
      "session_id": "uuid",
      "user_id": null,
      "ip_address": "192.168.1.1",
      "created_at": "2024-01-15T10:00:00",
      "last_activity": "2024-01-15T10:05:00",
      "duration_seconds": 300,
      "idle_seconds": 0,
      "request_count": 5,
      "resource_count": 3,
      "total_bytes_used": 1048576,
      "resources": [...]
    }
  ],
  "stats": {
    "total_sessions_created": 100,
    "total_sessions_cleaned": 50,
    "active_sessions": 5,
    "avg_session_duration": 600,
    "avg_resources_per_session": 4.2
  }
}
```
#### GET /admin/sessions/{session_id}
Get detailed information about a specific session.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/abc123
```
#### POST /admin/sessions/{session_id}/cleanup
Manually cleanup a specific session.
```bash
curl -X POST -H "X-Admin-Token: your-token" \
http://localhost:5005/admin/sessions/abc123/cleanup
```
#### GET /admin/sessions/metrics
Get session management metrics for monitoring.
```bash
curl -H "X-Admin-Token: your-token" http://localhost:5005/admin/sessions/metrics
```
Response:
```json
{
  "sessions": {
    "active": 5,
    "total_created": 100,
    "total_cleaned": 95
  },
  "resources": {
    "active": 20,
    "total_cleaned": 380,
    "active_bytes": 10485760,
    "total_bytes_cleaned": 1073741824
  },
  "limits": {
    "max_session_duration": 3600,
    "max_idle_time": 900,
    "max_resources_per_session": 100,
    "max_bytes_per_session": 104857600
  }
}
```
## CLI Commands
Session management can be controlled via Flask CLI commands:
```bash
# List all active sessions
flask sessions-list
# Manual cleanup
flask sessions-cleanup
# Show statistics
flask sessions-stats
```
## Usage Examples
### 1. Monitor Active Sessions
```python
import requests
headers = {'X-Admin-Token': 'your-admin-token'}
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()
for session in sessions['sessions']:
    print(f"Session {session['session_id']}:")
    print(f"  IP: {session['ip_address']}")
    print(f"  Resources: {session['resource_count']}")
    print(f"  Storage: {session['total_bytes_used'] / 1024 / 1024:.2f} MB")
```
### 2. Cleanup Idle Sessions
```python
# Get all sessions
response = requests.get('http://localhost:5005/admin/sessions', headers=headers)
sessions = response.json()['sessions']
# Find idle sessions
idle_threshold = 300 # 5 minutes
for session in sessions:
    if session['idle_seconds'] > idle_threshold:
        # Cleanup idle session
        cleanup_url = f'http://localhost:5005/admin/sessions/{session["session_id"]}/cleanup'
        requests.post(cleanup_url, headers=headers)
        print(f"Cleaned up idle session {session['session_id']}")
```
### 3. Monitor Resource Usage
```python
# Get metrics
response = requests.get('http://localhost:5005/admin/sessions/metrics', headers=headers)
metrics = response.json()
print(f"Active sessions: {metrics['sessions']['active']}")
print(f"Active resources: {metrics['resources']['active']}")
print(f"Storage used: {metrics['resources']['active_bytes'] / 1024 / 1024:.2f} MB")
print(f"Total cleaned: {metrics['resources']['total_bytes_cleaned'] / 1024 / 1024 / 1024:.2f} GB")
```
## Resource Types
The session manager tracks different types of resources:
### 1. Audio Files
- Uploaded audio files for transcription
- Generated audio files from TTS
- Automatically cleaned up after session ends
### 2. Temporary Files
- Processing intermediates
- Cache files
- Automatically cleaned up after use
### 3. Streams
- WebSocket connections
- Server-sent event streams
- Closed when session ends
## Best Practices
### 1. Session Configuration
```python
# Development
app.config.update({
    'MAX_SESSION_DURATION': 7200,    # 2 hours
    'MAX_SESSION_IDLE_TIME': 1800,   # 30 minutes
    'MAX_RESOURCES_PER_SESSION': 200,
    'MAX_BYTES_PER_SESSION': 209715200  # 200MB
})

# Production
app.config.update({
    'MAX_SESSION_DURATION': 3600,    # 1 hour
    'MAX_SESSION_IDLE_TIME': 900,    # 15 minutes
    'MAX_RESOURCES_PER_SESSION': 100,
    'MAX_BYTES_PER_SESSION': 104857600  # 100MB
})
```
### 2. Monitoring
Set up monitoring for:
- Number of active sessions
- Resource usage per session
- Cleanup frequency
- Failed cleanup attempts
### 3. Alerting
Configure alerts for:
- High number of active sessions (>1000)
- High resource usage (>80% of limits)
- Failed cleanup operations
- Orphaned files detected
## Troubleshooting
### Common Issues
#### 1. Sessions Not Being Cleaned Up
Check cleanup thread status:
```bash
flask sessions-stats
```
Manual cleanup:
```bash
flask sessions-cleanup
```
#### 2. Resource Limits Reached
Check session details:
```bash
curl -H "X-Admin-Token: token" http://localhost:5005/admin/sessions/SESSION_ID
```
Increase limits if needed:
```python
app.config['MAX_RESOURCES_PER_SESSION'] = 200
app.config['MAX_BYTES_PER_SESSION'] = 209715200 # 200MB
```
#### 3. Orphaned Files
Check for orphaned files:
```bash
ls -la /path/to/session/storage/
```
Clean orphaned files:
```bash
flask sessions-cleanup
```
### Debug Logging
Enable debug logging for session management:
```python
import logging
# Enable session manager debug logs
logging.getLogger('session_manager').setLevel(logging.DEBUG)
```
## Security Considerations
1. **Session Hijacking**: Sessions are tied to IP addresses and user agents
2. **Resource Exhaustion**: Strict per-session limits prevent DoS attacks
3. **File System Access**: Session storage uses secure paths and permissions
4. **Admin Access**: All admin endpoints require authentication
## Performance Impact
The session management system has minimal performance impact:
- Memory: ~1KB per session + resource metadata
- CPU: Background cleanup runs every minute
- Disk I/O: Cleanup operations are batched
- Network: No external dependencies
## Integration with Other Systems
### Rate Limiting
Session management integrates with rate limiting:
```python
# Sessions are automatically tracked per IP
# Rate limits apply per session
```
### Secrets Management
Session tokens can be encrypted:
```python
from secrets_manager import encrypt_value
encrypted_session = encrypt_value(session_id)
```
### Monitoring
Export metrics to monitoring systems:
```python
# Prometheus format
@app.route('/metrics')
def prometheus_metrics():
    metrics = app.session_manager.export_metrics()
    # Format as Prometheus metrics
    return format_prometheus(metrics)
```
## Future Enhancements
1. **Session Persistence**: Store sessions in Redis/database
2. **Distributed Sessions**: Support for multi-server deployments
3. **Session Analytics**: Track usage patterns and trends
4. **Resource Quotas**: Per-user resource quotas
5. **Session Replay**: Debug issues by replaying sessions

docker-compose.amd.yml Normal file

@@ -0,0 +1,19 @@
version: '3.8'
# Docker Compose override for AMD GPU support (ROCm)
# Usage: docker-compose -f docker-compose.yml -f docker-compose.amd.yml up
services:
  talk2me:
    environment:
      - HSA_OVERRIDE_GFX_VERSION=10.3.0  # Adjust based on your GPU model
      - ROCR_VISIBLE_DEVICES=0           # Use first GPU
    volumes:
      - /dev/kfd:/dev/kfd  # ROCm KFD interface
      - /dev/dri:/dev/dri  # Direct Rendering Interface
    devices:
      - /dev/kfd
      - /dev/dri
    group_add:
      - video   # Required for GPU access
      - render  # Required for GPU access

docker-compose.apple.yml Normal file

@@ -0,0 +1,11 @@
version: '3.8'
# Docker Compose override for Apple Silicon
# Usage: docker-compose -f docker-compose.yml -f docker-compose.apple.yml up
services:
  talk2me:
    platform: linux/arm64/v8  # For M1/M2/M3 Macs
    environment:
      - PYTORCH_ENABLE_MPS_FALLBACK=1         # Enable Metal Performance Shaders fallback
      - PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.7  # Memory management for MPS

docker-compose.nvidia.yml Normal file

@@ -0,0 +1,16 @@
version: '3.8'
# Docker Compose override for NVIDIA GPU support
# Usage: docker-compose -f docker-compose.yml -f docker-compose.nvidia.yml up
services:
  talk2me:
    environment:
      - CUDA_VISIBLE_DEVICES=0  # Use first GPU
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

setup-script.sh

@@ -1,776 +0,0 @@
#!/bin/bash
# Create necessary directories
mkdir -p templates static/{css,js}
# Move HTML template to templates directory
cat > templates/index.html << 'EOL'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Voice Language Translator</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css" rel="stylesheet">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
<style>
body {
padding-top: 20px;
padding-bottom: 20px;
background-color: #f8f9fa;
}
.record-btn {
width: 80px;
height: 80px;
border-radius: 50%;
display: flex;
align-items: center;
justify-content: center;
font-size: 32px;
margin: 20px auto;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
transition: all 0.3s;
}
.record-btn:active {
transform: scale(0.95);
box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}
.recording {
background-color: #dc3545 !important;
animation: pulse 1.5s infinite;
}
@keyframes pulse {
0% {
transform: scale(1);
}
50% {
transform: scale(1.05);
}
100% {
transform: scale(1);
}
}
.card {
border-radius: 15px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
margin-bottom: 20px;
}
.card-header {
border-radius: 15px 15px 0 0 !important;
}
.language-select {
border-radius: 10px;
padding: 10px;
}
.text-display {
min-height: 100px;
padding: 15px;
background-color: #f8f9fa;
border-radius: 10px;
margin-bottom: 15px;
}
.btn-action {
border-radius: 10px;
padding: 8px 15px;
margin: 5px;
}
.spinner-border {
width: 1rem;
height: 1rem;
margin-right: 5px;
}
.status-indicator {
font-size: 0.9rem;
font-style: italic;
color: #6c757d;
}
</style>
</head>
<body>
<div class="container">
<h1 class="text-center mb-4">Voice Language Translator</h1>
<p class="text-center text-muted">Powered by Gemma 3, Whisper & Edge TTS</p>
<div class="row">
<div class="col-md-6 mb-3">
<div class="card">
<div class="card-header bg-primary text-white">
<h5 class="mb-0">Source</h5>
</div>
<div class="card-body">
<select id="sourceLanguage" class="form-select language-select mb-3">
{% for language in languages %}
<option value="{{ language }}">{{ language }}</option>
{% endfor %}
</select>
<div class="text-display" id="sourceText">
<p class="text-muted">Your transcribed text will appear here...</p>
</div>
<div class="d-flex justify-content-between">
<button id="playSource" class="btn btn-outline-primary btn-action" disabled>
<i class="fas fa-play"></i> Play
</button>
<button id="clearSource" class="btn btn-outline-secondary btn-action">
<i class="fas fa-trash"></i> Clear
</button>
</div>
</div>
</div>
</div>
<div class="col-md-6 mb-3">
<div class="card">
<div class="card-header bg-success text-white">
<h5 class="mb-0">Translation</h5>
</div>
<div class="card-body">
<select id="targetLanguage" class="form-select language-select mb-3">
{% for language in languages %}
<option value="{{ language }}">{{ language }}</option>
{% endfor %}
</select>
<div class="text-display" id="translatedText">
<p class="text-muted">Translation will appear here...</p>
</div>
<div class="d-flex justify-content-between">
<button id="playTranslation" class="btn btn-outline-success btn-action" disabled>
<i class="fas fa-play"></i> Play
</button>
<button id="clearTranslation" class="btn btn-outline-secondary btn-action">
<i class="fas fa-trash"></i> Clear
</button>
</div>
</div>
</div>
</div>
</div>
<div class="text-center">
<button id="recordBtn" class="btn btn-primary record-btn">
<i class="fas fa-microphone"></i>
</button>
<p class="status-indicator" id="statusIndicator">Click to start recording</p>
</div>
<div class="text-center mt-3">
<button id="translateBtn" class="btn btn-success" disabled>
<i class="fas fa-language"></i> Translate
</button>
</div>
<div class="mt-3">
<div class="progress d-none" id="progressContainer">
<div id="progressBar" class="progress-bar progress-bar-striped progress-bar-animated" role="progressbar" style="width: 0%"></div>
</div>
</div>
<audio id="audioPlayer" style="display: none;"></audio>
</div>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/js/bootstrap.bundle.min.js"></script>
<script>
document.addEventListener('DOMContentLoaded', function() {
// DOM elements
const recordBtn = document.getElementById('recordBtn');
const translateBtn = document.getElementById('translateBtn');
const sourceText = document.getElementById('sourceText');
const translatedText = document.getElementById('translatedText');
const sourceLanguage = document.getElementById('sourceLanguage');
const targetLanguage = document.getElementById('targetLanguage');
const playSource = document.getElementById('playSource');
const playTranslation = document.getElementById('playTranslation');
const clearSource = document.getElementById('clearSource');
const clearTranslation = document.getElementById('clearTranslation');
const statusIndicator = document.getElementById('statusIndicator');
const progressContainer = document.getElementById('progressContainer');
const progressBar = document.getElementById('progressBar');
const audioPlayer = document.getElementById('audioPlayer');
// Set initial values
let isRecording = false;
let mediaRecorder = null;
let audioChunks = [];
let currentSourceText = '';
let currentTranslationText = '';
// Make sure target language is different from source
if (targetLanguage.options[0].value === sourceLanguage.value) {
targetLanguage.selectedIndex = 1;
}
// Event listeners for language selection
sourceLanguage.addEventListener('change', function() {
if (targetLanguage.value === sourceLanguage.value) {
for (let i = 0; i < targetLanguage.options.length; i++) {
if (targetLanguage.options[i].value !== sourceLanguage.value) {
targetLanguage.selectedIndex = i;
break;
}
}
}
});
targetLanguage.addEventListener('change', function() {
if (targetLanguage.value === sourceLanguage.value) {
for (let i = 0; i < sourceLanguage.options.length; i++) {
if (sourceLanguage.options[i].value !== targetLanguage.value) {
sourceLanguage.selectedIndex = i;
break;
}
}
}
});
// Record button click event
recordBtn.addEventListener('click', function() {
if (isRecording) {
stopRecording();
} else {
startRecording();
}
});
// Function to start recording
function startRecording() {
navigator.mediaDevices.getUserMedia({ audio: true })
.then(stream => {
mediaRecorder = new MediaRecorder(stream);
audioChunks = [];
mediaRecorder.addEventListener('dataavailable', event => {
audioChunks.push(event.data);
});
mediaRecorder.addEventListener('stop', () => {
const audioBlob = new Blob(audioChunks, { type: 'audio/wav' });
transcribeAudio(audioBlob);
});
mediaRecorder.start();
isRecording = true;
recordBtn.classList.add('recording');
recordBtn.classList.replace('btn-primary', 'btn-danger');
recordBtn.innerHTML = '<i class="fas fa-stop"></i>';
statusIndicator.textContent = 'Recording... Click to stop';
})
.catch(error => {
console.error('Error accessing microphone:', error);
alert('Error accessing microphone. Please make sure you have given permission for microphone access.');
});
}
// Function to stop recording
function stopRecording() {
mediaRecorder.stop();
isRecording = false;
recordBtn.classList.remove('recording');
recordBtn.classList.replace('btn-danger', 'btn-primary');
recordBtn.innerHTML = '<i class="fas fa-microphone"></i>';
statusIndicator.textContent = 'Processing audio...';
// Stop all audio tracks
mediaRecorder.stream.getTracks().forEach(track => track.stop());
}
// Function to transcribe audio
function transcribeAudio(audioBlob) {
const formData = new FormData();
formData.append('audio', audioBlob);
formData.append('source_lang', sourceLanguage.value);
showProgress();
fetch('/transcribe', {
method: 'POST',
body: formData
})
.then(response => response.json())
.then(data => {
hideProgress();
if (data.success) {
currentSourceText = data.text;
sourceText.innerHTML = `<p>${data.text}</p>`;
playSource.disabled = false;
translateBtn.disabled = false;
statusIndicator.textContent = 'Transcription complete';
} else {
sourceText.innerHTML = `<p class="text-danger">Error: ${data.error}</p>`;
statusIndicator.textContent = 'Transcription failed';
}
})
.catch(error => {
hideProgress();
console.error('Transcription error:', error);
sourceText.innerHTML = `<p class="text-danger">Failed to transcribe audio. Please try again.</p>`;
statusIndicator.textContent = 'Transcription failed';
});
}
// Translate button click event
translateBtn.addEventListener('click', function() {
if (!currentSourceText) {
return;
}
statusIndicator.textContent = 'Translating...';
showProgress();
fetch('/translate', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
text: currentSourceText,
source_lang: sourceLanguage.value,
target_lang: targetLanguage.value
})
})
.then(response => response.json())
.then(data => {
hideProgress();
if (data.success) {
currentTranslationText = data.translation;
translatedText.innerHTML = `<p>${data.translation}</p>`;
playTranslation.disabled = false;
statusIndicator.textContent = 'Translation complete';
} else {
translatedText.innerHTML = `<p class="text-danger">Error: ${data.error}</p>`;
statusIndicator.textContent = 'Translation failed';
}
})
.catch(error => {
hideProgress();
console.error('Translation error:', error);
translatedText.innerHTML = `<p class="text-danger">Failed to translate. Please try again.</p>`;
statusIndicator.textContent = 'Translation failed';
});
});
// Play source text
playSource.addEventListener('click', function() {
if (!currentSourceText) return;
playAudio(currentSourceText, sourceLanguage.value);
statusIndicator.textContent = 'Playing source audio...';
});
// Play translation
playTranslation.addEventListener('click', function() {
if (!currentTranslationText) return;
playAudio(currentTranslationText, targetLanguage.value);
statusIndicator.textContent = 'Playing translation audio...';
});
// Function to play audio via TTS
function playAudio(text, language) {
showProgress();
fetch('/speak', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
text: text,
language: language
})
})
.then(response => response.json())
.then(data => {
hideProgress();
if (data.success) {
audioPlayer.src = data.audio_url;
audioPlayer.onended = function() {
statusIndicator.textContent = 'Ready';
};
audioPlayer.play();
} else {
statusIndicator.textContent = 'TTS failed';
alert('Failed to play audio: ' + data.error);
}
})
.catch(error => {
hideProgress();
console.error('TTS error:', error);
statusIndicator.textContent = 'TTS failed';
});
}
// Clear buttons
clearSource.addEventListener('click', function() {
sourceText.innerHTML = '<p class="text-muted">Your transcribed text will appear here...</p>';
currentSourceText = '';
playSource.disabled = true;
translateBtn.disabled = true;
});
clearTranslation.addEventListener('click', function() {
translatedText.innerHTML = '<p class="text-muted">Translation will appear here...</p>';
currentTranslationText = '';
playTranslation.disabled = true;
});
// Progress indicator functions
function showProgress() {
progressContainer.classList.remove('d-none');
let progress = 0;
const interval = setInterval(() => {
progress += 5;
if (progress > 90) {
clearInterval(interval);
}
progressBar.style.width = `${progress}%`;
}, 100);
progressBar.dataset.interval = interval;
}
function hideProgress() {
const interval = progressBar.dataset.interval;
if (interval) {
clearInterval(Number(interval));
}
progressBar.style.width = '100%';
setTimeout(() => {
progressContainer.classList.add('d-none');
progressBar.style.width = '0%';
}, 500);
}
});
</script>
</body>
</html>
EOL
# Create app.py
cat > app.py << 'EOL'
import os
import time
import tempfile
import requests
import json
from flask import Flask, render_template, request, jsonify, Response, send_file
import whisper
import torch
import ollama
import logging

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = tempfile.mkdtemp()
app.config['TTS_SERVER'] = os.environ.get('TTS_SERVER_URL', 'http://localhost:5050/v1/audio/speech')
app.config['TTS_API_KEY'] = os.environ.get('TTS_API_KEY', 'your_api_key_here')

# Add a route to check TTS server status
@app.route('/check_tts_server', methods=['GET'])
def check_tts_server():
    try:
        # Try a simple HTTP request to the TTS server's status endpoint,
        # derived from the configured speech endpoint URL
        base_url = app.config['TTS_SERVER'].rsplit('/v1/audio/speech', 1)[0]
        response = requests.get(base_url + '/status', timeout=5)
        if response.status_code == 200:
            return jsonify({
                'status': 'online',
                'url': app.config['TTS_SERVER']
            })
        else:
            return jsonify({
                'status': 'error',
                'message': f'TTS server returned status code {response.status_code}',
                'url': app.config['TTS_SERVER']
            })
    except requests.exceptions.RequestException as e:
        return jsonify({
            'status': 'error',
            'message': f'Cannot connect to TTS server: {str(e)}',
            'url': app.config['TTS_SERVER']
        })

# Initialize logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Load Whisper model
logger.info("Loading Whisper model...")
whisper_model = whisper.load_model("base")
logger.info("Whisper model loaded successfully")

# Supported languages
SUPPORTED_LANGUAGES = {
    "ar": "Arabic",
    "hy": "Armenian",
    "az": "Azerbaijani",
    "en": "English",
    "fr": "French",
    "ka": "Georgian",
    "kk": "Kazakh",
    "zh": "Mandarin",
    "fa": "Farsi",
    "pt": "Portuguese",
    "ru": "Russian",
    "es": "Spanish",
    "tr": "Turkish",
    "uz": "Uzbek"
}

# Map language names to language codes
LANGUAGE_TO_CODE = {v: k for k, v in SUPPORTED_LANGUAGES.items()}

# Map language names to OpenAI TTS voice options
LANGUAGE_TO_VOICE = {
    "Arabic": "alloy",      # Using OpenAI general voices
    "Armenian": "echo",     # as OpenAI doesn't have specific voices
    "Azerbaijani": "nova",  # for all these languages
    "English": "echo",      # We'll use the available voices
    "French": "alloy",      # and rely on the translation being
    "Georgian": "fable",    # in the correct language text
    "Kazakh": "onyx",
    "Mandarin": "shimmer",
    "Farsi": "nova",
    "Portuguese": "alloy",
    "Russian": "echo",
    "Spanish": "nova",
    "Turkish": "fable",
    "Uzbek": "onyx"
}

@app.route('/')
def index():
    return render_template('index.html', languages=sorted(SUPPORTED_LANGUAGES.values()))

@app.route('/transcribe', methods=['POST'])
def transcribe():
    if 'audio' not in request.files:
        return jsonify({'error': 'No audio file provided'}), 400
    audio_file = request.files['audio']
    source_lang = request.form.get('source_lang', '')
    # Save the audio file temporarily
    temp_path = os.path.join(app.config['UPLOAD_FOLDER'], 'input_audio.wav')
    audio_file.save(temp_path)
    try:
        # Use Whisper for transcription
        result = whisper_model.transcribe(
            temp_path,
            language=LANGUAGE_TO_CODE.get(source_lang, None)
        )
        transcribed_text = result["text"]
        return jsonify({
            'success': True,
            'text': transcribed_text
        })
    except Exception as e:
        logger.error(f"Transcription error: {str(e)}")
        return jsonify({'error': f'Transcription failed: {str(e)}'}), 500
    finally:
        # Clean up the temporary file
        if os.path.exists(temp_path):
            os.remove(temp_path)

@app.route('/translate', methods=['POST'])
def translate():
    try:
        data = request.json
        text = data.get('text', '')
        source_lang = data.get('source_lang', '')
        target_lang = data.get('target_lang', '')
        if not text or not source_lang or not target_lang:
            return jsonify({'error': 'Missing required parameters'}), 400
        # Create a prompt for Gemma 3 translation
        prompt = f"""
Translate the following text from {source_lang} to {target_lang}:

"{text}"

Provide only the translation without any additional text.
"""
        # Use Ollama to interact with Gemma 3
        response = ollama.chat(
            model="gemma3",
            messages=[
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        )
        translated_text = response['message']['content'].strip()
        return jsonify({
            'success': True,
            'translation': translated_text
        })
    except Exception as e:
        logger.error(f"Translation error: {str(e)}")
        return jsonify({'error': f'Translation failed: {str(e)}'}), 500

@app.route('/speak', methods=['POST'])
def speak():
    try:
        data = request.json
        text = data.get('text', '')
        language = data.get('language', '')
        if not text or not language:
            return jsonify({'error': 'Missing required parameters'}), 400
        voice = LANGUAGE_TO_VOICE.get(language)
        if not voice:
            return jsonify({'error': 'Unsupported language for TTS'}), 400
        # Get TTS server URL from environment or config
        tts_server_url = app.config['TTS_SERVER']
        try:
            # Request TTS from the Edge TTS server
            logger.info(f"Sending TTS request to {tts_server_url}")
            tts_response = requests.post(
                tts_server_url,
                json={
                    'text': text,
                    'voice': voice,
                    'output_format': 'mp3'
                },
                timeout=10  # Add timeout
            )
            logger.info(f"TTS response status: {tts_response.status_code}")
            if tts_response.status_code != 200:
                error_msg = f'TTS request failed with status {tts_response.status_code}'
                logger.error(error_msg)
                # Try to get error details from response if possible
                try:
                    error_details = tts_response.json()
                    logger.error(f"Error details: {error_details}")
                except ValueError:
                    pass
                return jsonify({'error': error_msg}), 500
            # The response contains the audio data directly
            temp_audio_path = os.path.join(app.config['UPLOAD_FOLDER'], f'output_{int(time.time())}.mp3')
            with open(temp_audio_path, 'wb') as f:
                f.write(tts_response.content)
            return jsonify({
                'success': True,
                'audio_url': f'/get_audio/{os.path.basename(temp_audio_path)}'
            })
        except requests.exceptions.RequestException as e:
            error_msg = f'Failed to connect to TTS server: {str(e)}'
            logger.error(error_msg)
            return jsonify({'error': error_msg}), 500
    except Exception as e:
        logger.error(f"TTS error: {str(e)}")
        return jsonify({'error': f'TTS failed: {str(e)}'}), 500

@app.route('/get_audio/<filename>')
def get_audio(filename):
    try:
        file_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        return send_file(file_path, mimetype='audio/mpeg')
    except Exception as e:
        logger.error(f"Audio retrieval error: {str(e)}")
        return jsonify({'error': f'Audio retrieval failed: {str(e)}'}), 500

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000, debug=True)
EOL
# Create requirements.txt
cat > requirements.txt << 'EOL'
flask==2.3.2
requests==2.31.0
openai-whisper==20231117
torch==2.1.0
ollama==0.1.5
EOL
# Create README.md
cat > README.md << 'EOL'
# Voice Language Translator
A mobile-friendly web application that translates spoken language between multiple languages using:
- Gemma 3 open-source LLM via Ollama for translation
- OpenAI Whisper for speech-to-text
- OpenAI Edge TTS for text-to-speech
## Supported Languages
- Arabic
- Armenian
- Azerbaijani
- English
- French
- Georgian
- Kazakh
- Mandarin
- Farsi
- Portuguese
- Russian
- Spanish
- Turkish
- Uzbek
## Setup Instructions
1. Install the required Python packages:
```
pip install -r requirements.txt
```
2. Make sure you have Ollama installed and the Gemma 3 model loaded:
```
ollama pull gemma3
```
3. Ensure your OpenAI Edge TTS server is running on port 5050.
4. Run the application:
```
python app.py
```
5. Open your browser and navigate to:
```
http://localhost:8000
```
## Usage
1. Select your source language from the dropdown menu
2. Press the microphone button and speak
3. Press the button again to stop recording
4. Wait for the transcription to complete
5. Select your target language
6. Press the "Translate" button
7. Use the play buttons to hear the original or translated text
## Technical Details
- The app uses Flask for the web server
- Audio is processed client-side using the MediaRecorder API
- Whisper for speech recognition with language hints
- Ollama provides access to the Gemma 3 model for translation
- OpenAI Edge TTS delivers natural-sounding speech output
## Mobile Support
The interface is fully responsive and designed to work well on mobile devices.
EOL
# Make the script executable
chmod +x app.py
echo "Setup complete! Run the app with: python app.py"

File diff suppressed because it is too large.

test-cors.html

@@ -1,228 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>CORS Test for Talk2Me</title>
<style>
body {
font-family: Arial, sans-serif;
max-width: 800px;
margin: 50px auto;
padding: 20px;
}
.test-result {
margin: 10px 0;
padding: 10px;
border-radius: 5px;
}
.success {
background-color: #d4edda;
color: #155724;
border: 1px solid #c3e6cb;
}
.error {
background-color: #f8d7da;
color: #721c24;
border: 1px solid #f5c6cb;
}
button {
background-color: #007bff;
color: white;
padding: 10px 20px;
border: none;
border-radius: 5px;
cursor: pointer;
margin: 5px;
}
button:hover {
background-color: #0056b3;
}
input {
width: 100%;
padding: 10px;
margin: 10px 0;
border: 1px solid #ddd;
border-radius: 5px;
}
#results {
margin-top: 20px;
}
pre {
background-color: #f8f9fa;
padding: 10px;
border-radius: 5px;
overflow-x: auto;
}
</style>
</head>
<body>
<h1>CORS Test for Talk2Me API</h1>
<p>This page tests CORS configuration for the Talk2Me API. Open this file from a different origin (e.g., file:// or a different port) to test cross-origin requests.</p>
<div>
<label for="apiUrl">API Base URL:</label>
<input type="text" id="apiUrl" placeholder="http://localhost:5005" value="http://localhost:5005">
</div>
<h2>Tests:</h2>
<button onclick="testHealthEndpoint()">Test Health Endpoint</button>
<button onclick="testPreflightRequest()">Test Preflight Request</button>
<button onclick="testTranscribeEndpoint()">Test Transcribe Endpoint (OPTIONS)</button>
<button onclick="testWithCredentials()">Test With Credentials</button>
<div id="results"></div>
<script>
function addResult(test, success, message, details = null) {
const resultsDiv = document.getElementById('results');
const resultDiv = document.createElement('div');
resultDiv.className = `test-result ${success ? 'success' : 'error'}`;
let html = `<strong>${test}:</strong> ${message}`;
if (details) {
html += `<pre>${JSON.stringify(details, null, 2)}</pre>`;
}
resultDiv.innerHTML = html;
resultsDiv.appendChild(resultDiv);
}
function getApiUrl() {
return document.getElementById('apiUrl').value.trim();
}
async function testHealthEndpoint() {
const apiUrl = getApiUrl();
try {
const response = await fetch(`${apiUrl}/health`, {
method: 'GET',
mode: 'cors',
headers: {
'Origin': window.location.origin
}
});
const data = await response.json();
// Check CORS headers
const corsHeaders = {
'Access-Control-Allow-Origin': response.headers.get('Access-Control-Allow-Origin'),
'Access-Control-Allow-Credentials': response.headers.get('Access-Control-Allow-Credentials')
};
addResult('Health Endpoint GET', true, 'Request successful', {
status: response.status,
data: data,
corsHeaders: corsHeaders
});
} catch (error) {
addResult('Health Endpoint GET', false, error.message);
}
}
async function testPreflightRequest() {
const apiUrl = getApiUrl();
try {
const response = await fetch(`${apiUrl}/api/push-public-key`, {
method: 'OPTIONS',
mode: 'cors',
headers: {
'Origin': window.location.origin,
'Access-Control-Request-Method': 'GET',
'Access-Control-Request-Headers': 'content-type'
}
});
const corsHeaders = {
'Access-Control-Allow-Origin': response.headers.get('Access-Control-Allow-Origin'),
'Access-Control-Allow-Methods': response.headers.get('Access-Control-Allow-Methods'),
'Access-Control-Allow-Headers': response.headers.get('Access-Control-Allow-Headers'),
'Access-Control-Max-Age': response.headers.get('Access-Control-Max-Age')
};
addResult('Preflight Request', response.ok, `Status: ${response.status}`, corsHeaders);
} catch (error) {
addResult('Preflight Request', false, error.message);
}
}
async function testTranscribeEndpoint() {
const apiUrl = getApiUrl();
try {
const response = await fetch(`${apiUrl}/transcribe`, {
method: 'OPTIONS',
mode: 'cors',
headers: {
'Origin': window.location.origin,
'Access-Control-Request-Method': 'POST',
'Access-Control-Request-Headers': 'content-type'
}
});
const corsHeaders = {
'Access-Control-Allow-Origin': response.headers.get('Access-Control-Allow-Origin'),
'Access-Control-Allow-Methods': response.headers.get('Access-Control-Allow-Methods'),
'Access-Control-Allow-Headers': response.headers.get('Access-Control-Allow-Headers'),
'Access-Control-Allow-Credentials': response.headers.get('Access-Control-Allow-Credentials')
};
addResult('Transcribe Endpoint OPTIONS', response.ok, `Status: ${response.status}`, corsHeaders);
} catch (error) {
addResult('Transcribe Endpoint OPTIONS', false, error.message);
}
}
async function testWithCredentials() {
const apiUrl = getApiUrl();
try {
const response = await fetch(`${apiUrl}/health`, {
method: 'GET',
mode: 'cors',
credentials: 'include',
headers: {
'Origin': window.location.origin
}
});
const data = await response.json();
addResult('Request with Credentials', true, 'Request successful', {
status: response.status,
credentialsIncluded: true,
data: data
});
} catch (error) {
addResult('Request with Credentials', false, error.message);
}
}
// Clear results before running new tests
function clearResults() {
document.getElementById('results').innerHTML = '';
}
// Add event listeners
document.querySelectorAll('button').forEach(button => {
button.addEventListener('click', (e) => {
if (!e.target.textContent.includes('Test')) return;
clearResults();
});
});
// Show current origin
window.addEventListener('load', () => {
const info = document.createElement('div');
info.style.marginBottom = '20px';
info.style.padding = '10px';
info.style.backgroundColor = '#e9ecef';
info.style.borderRadius = '5px';
info.innerHTML = `<strong>Current Origin:</strong> ${window.location.origin}<br>
<strong>Protocol:</strong> ${window.location.protocol}<br>
<strong>Note:</strong> For effective CORS testing, open this file from a different origin than your API server.`;
document.body.insertBefore(info, document.querySelector('h2'));
});
</script>
</body>
</html>


@@ -1,168 +0,0 @@
#!/usr/bin/env python3
"""
Test script for error logging system
"""
import logging
import json
import os
import time
from error_logger import ErrorLogger, log_errors, log_performance, get_logger
def test_basic_logging():
    """Test basic logging functionality"""
    print("\n=== Testing Basic Logging ===")
    # Get logger
    logger = get_logger('test')
    # Test different log levels
    logger.debug("This is a debug message")
    logger.info("This is an info message")
    logger.warning("This is a warning message")
    logger.error("This is an error message")
    print("✓ Basic logging test completed")

def test_error_logging():
    """Test error logging with exceptions"""
    print("\n=== Testing Error Logging ===")
    @log_errors('test.functions')
    def failing_function():
        raise ValueError("This is a test error")
    try:
        failing_function()
    except ValueError:
        print("✓ Error was logged")
    # Check if error log exists
    if os.path.exists('logs/errors.log'):
        print("✓ Error log file created")
        # Read last line
        with open('logs/errors.log', 'r') as f:
            lines = f.readlines()
            if lines:
                try:
                    error_entry = json.loads(lines[-1])
                    print(f"✓ Error logged with level: {error_entry.get('level')}")
                    print(f"✓ Error type: {error_entry.get('exception', {}).get('type')}")
                except json.JSONDecodeError:
                    print("✗ Error log entry is not valid JSON")
    else:
        print("✗ Error log file not created")

def test_performance_logging():
    """Test performance logging"""
    print("\n=== Testing Performance Logging ===")
    @log_performance('test_operation')
    def slow_function():
        time.sleep(0.1)  # Simulate slow operation
        return "result"
    result = slow_function()
    print(f"✓ Function returned: {result}")
    # Check performance log
    if os.path.exists('logs/performance.log'):
        print("✓ Performance log file created")
        # Read last line
        with open('logs/performance.log', 'r') as f:
            lines = f.readlines()
            if lines:
                try:
                    perf_entry = json.loads(lines[-1])
                    duration = perf_entry.get('extra_fields', {}).get('duration_ms', 0)
                    print(f"✓ Performance logged with duration: {duration}ms")
                except json.JSONDecodeError:
                    print("✗ Performance log entry is not valid JSON")
    else:
        print("✗ Performance log file not created")

def test_structured_logging():
    """Test structured logging format"""
    print("\n=== Testing Structured Logging ===")
    logger = get_logger('test.structured')
    # Log with extra fields
    logger.info("Structured log test", extra={
        'extra_fields': {
            'user_id': 123,
            'action': 'test_action',
            'metadata': {'key': 'value'}
        }
    })
    # Check main log
    if os.path.exists('logs/talk2me.log'):
        with open('logs/talk2me.log', 'r') as f:
            lines = f.readlines()
            if lines:
                try:
                    # Find our test entry
                    for line in reversed(lines):
                        entry = json.loads(line)
                        if entry.get('message') == 'Structured log test':
                            print("✓ Structured log entry found")
                            print(f"✓ Contains timestamp: {'timestamp' in entry}")
                            print(f"✓ Contains hostname: {'hostname' in entry}")
                            print(f"✓ Contains extra fields: {'user_id' in entry}")
                            break
                except json.JSONDecodeError:
                    print("✗ Log entry is not valid JSON")

def test_log_rotation():
    """Test log rotation settings"""
    print("\n=== Testing Log Rotation ===")
    # Check if log files exist and their sizes
    log_files = {
        'talk2me.log': 'logs/talk2me.log',
        'errors.log': 'logs/errors.log',
        'access.log': 'logs/access.log',
        'security.log': 'logs/security.log',
        'performance.log': 'logs/performance.log'
    }
    for name, path in log_files.items():
        if os.path.exists(path):
            size = os.path.getsize(path)
            print(f"{name}: {size} bytes")
        else:
            print(f"- {name}: not created yet")

def main():
    """Run all tests"""
    print("Error Logging System Tests")
    print("==========================")
    # Create a test Flask app
    from flask import Flask
    app = Flask(__name__)
    app.config['LOG_LEVEL'] = 'DEBUG'
    app.config['FLASK_ENV'] = 'testing'
    # Initialize error logger
    error_logger = ErrorLogger(app)
    # Run tests
    test_basic_logging()
    test_error_logging()
    test_performance_logging()
    test_structured_logging()
    test_log_rotation()
    print("\n✅ All tests completed!")
    print("\nCheck the logs directory for generated log files:")
    print("- logs/talk2me.log - Main application log")
    print("- logs/errors.log - Error log with stack traces")
    print("- logs/performance.log - Performance metrics")
    print("- logs/access.log - HTTP access log")
    print("- logs/security.log - Security events")

if __name__ == "__main__":
    main()


@@ -1,264 +0,0 @@
#!/usr/bin/env python3
"""
Unit tests for session management system
"""
import unittest
import tempfile
import shutil
import time
import os
from session_manager import SessionManager, UserSession, SessionResource
from flask import Flask, g, session


class TestSessionManager(unittest.TestCase):
    def setUp(self):
        """Set up test fixtures"""
        self.temp_dir = tempfile.mkdtemp()
        self.config = {
            'max_session_duration': 3600,
            'max_idle_time': 900,
            'max_resources_per_session': 5,  # Small limit for testing
            'max_bytes_per_session': 1024 * 1024,  # 1MB for testing
            'cleanup_interval': 1,  # 1 second for faster testing
            'session_storage_path': self.temp_dir
        }
        self.manager = SessionManager(self.config)

    def tearDown(self):
        """Clean up test fixtures"""
        shutil.rmtree(self.temp_dir, ignore_errors=True)

    def test_create_session(self):
        """Test session creation"""
        session = self.manager.create_session(
            session_id='test-123',
            user_id='user-1',
            ip_address='127.0.0.1',
            user_agent='Test Agent'
        )
        self.assertEqual(session.session_id, 'test-123')
        self.assertEqual(session.user_id, 'user-1')
        self.assertEqual(session.ip_address, '127.0.0.1')
        self.assertEqual(session.user_agent, 'Test Agent')
        self.assertEqual(len(session.resources), 0)

    def test_get_session(self):
        """Test session retrieval"""
        self.manager.create_session(session_id='test-456')
        session = self.manager.get_session('test-456')
        self.assertIsNotNone(session)
        self.assertEqual(session.session_id, 'test-456')

        # Non-existent session
        session = self.manager.get_session('non-existent')
        self.assertIsNone(session)

    def test_add_resource(self):
        """Test adding resources to a session"""
        self.manager.create_session(session_id='test-789')

        # Add a resource
        resource = self.manager.add_resource(
            session_id='test-789',
            resource_type='audio_file',
            resource_id='audio-1',
            path='/tmp/test.wav',
            size_bytes=1024,
            metadata={'format': 'wav'}
        )
        self.assertIsNotNone(resource)
        self.assertEqual(resource.resource_id, 'audio-1')
        self.assertEqual(resource.resource_type, 'audio_file')
        self.assertEqual(resource.size_bytes, 1024)

        # Check the session was updated
        session = self.manager.get_session('test-789')
        self.assertEqual(len(session.resources), 1)
        self.assertEqual(session.total_bytes_used, 1024)

    def test_resource_limits(self):
        """Test resource limit enforcement"""
        self.manager.create_session(session_id='test-limits')

        # Add resources up to the limit
        for i in range(5):
            self.manager.add_resource(
                session_id='test-limits',
                resource_type='temp_file',
                resource_id=f'file-{i}',
                size_bytes=100
            )
        session = self.manager.get_session('test-limits')
        self.assertEqual(len(session.resources), 5)

        # Add one more - should evict the oldest
        self.manager.add_resource(
            session_id='test-limits',
            resource_type='temp_file',
            resource_id='file-new',
            size_bytes=100
        )
        session = self.manager.get_session('test-limits')
        self.assertEqual(len(session.resources), 5)  # Still 5
        self.assertNotIn('file-0', session.resources)  # Oldest removed
        self.assertIn('file-new', session.resources)  # New one added

    def test_size_limits(self):
        """Test size limit enforcement"""
        self.manager.create_session(session_id='test-size')

        # Add a large resource
        self.manager.add_resource(
            session_id='test-size',
            resource_type='audio_file',
            resource_id='large-1',
            size_bytes=500 * 1024  # 500KB
        )

        # Add another large resource
        self.manager.add_resource(
            session_id='test-size',
            resource_type='audio_file',
            resource_id='large-2',
            size_bytes=600 * 1024  # 600KB - would exceed the 1MB limit
        )
        session = self.manager.get_session('test-size')

        # The first resource should be removed to make space
        self.assertNotIn('large-1', session.resources)
        self.assertIn('large-2', session.resources)
        self.assertLessEqual(session.total_bytes_used, 1024 * 1024)

    def test_remove_resource(self):
        """Test resource removal"""
        self.manager.create_session(session_id='test-remove')
        self.manager.add_resource(
            session_id='test-remove',
            resource_type='temp_file',
            resource_id='to-remove',
            size_bytes=1000
        )

        # Remove the resource
        success = self.manager.remove_resource('test-remove', 'to-remove')
        self.assertTrue(success)

        # Check it's gone
        session = self.manager.get_session('test-remove')
        self.assertEqual(len(session.resources), 0)
        self.assertEqual(session.total_bytes_used, 0)

    def test_cleanup_session(self):
        """Test session cleanup"""
        # Create a session with resources
        self.manager.create_session(session_id='test-cleanup')

        # Create an actual temp file
        temp_file = os.path.join(self.temp_dir, 'test-file.txt')
        with open(temp_file, 'w') as f:
            f.write('test content')
        self.manager.add_resource(
            session_id='test-cleanup',
            resource_type='temp_file',
            path=temp_file,
            size_bytes=12
        )

        # Clean up the session
        success = self.manager.cleanup_session('test-cleanup')
        self.assertTrue(success)

        # Check the session is gone
        session = self.manager.get_session('test-cleanup')
        self.assertIsNone(session)

        # Check the file was deleted
        self.assertFalse(os.path.exists(temp_file))

    def test_session_info(self):
        """Test session info retrieval"""
        self.manager.create_session(
            session_id='test-info',
            ip_address='192.168.1.1'
        )
        self.manager.add_resource(
            session_id='test-info',
            resource_type='audio_file',
            size_bytes=2048
        )
        info = self.manager.get_session_info('test-info')
        self.assertIsNotNone(info)
        self.assertEqual(info['session_id'], 'test-info')
        self.assertEqual(info['ip_address'], '192.168.1.1')
        self.assertEqual(info['resource_count'], 1)
        self.assertEqual(info['total_bytes_used'], 2048)

    def test_stats(self):
        """Test statistics calculation"""
        # Create multiple sessions
        for i in range(3):
            self.manager.create_session(session_id=f'test-stats-{i}')
            self.manager.add_resource(
                session_id=f'test-stats-{i}',
                resource_type='temp_file',
                size_bytes=1000
            )
        stats = self.manager.get_stats()
        self.assertEqual(stats['active_sessions'], 3)
        self.assertEqual(stats['active_resources'], 3)
        self.assertEqual(stats['active_bytes'], 3000)
        self.assertEqual(stats['total_sessions_created'], 3)

    def test_metrics_export(self):
        """Test metrics export"""
        self.manager.create_session(session_id='test-metrics')
        metrics = self.manager.export_metrics()
        self.assertIn('sessions', metrics)
        self.assertIn('resources', metrics)
        self.assertIn('limits', metrics)
        self.assertEqual(metrics['sessions']['active'], 1)


class TestFlaskIntegration(unittest.TestCase):
    def setUp(self):
        """Set up a Flask app for testing"""
        self.app = Flask(__name__)
        self.app.config['TESTING'] = True
        self.app.config['SECRET_KEY'] = 'test-secret'
        self.temp_dir = tempfile.mkdtemp()
        self.app.config['UPLOAD_FOLDER'] = self.temp_dir

        # Initialize the session manager
        from session_manager import init_app
        init_app(self.app)

        self.client = self.app.test_client()
        self.ctx = self.app.test_request_context()
        self.ctx.push()

    def tearDown(self):
        """Clean up"""
        self.ctx.pop()
        shutil.rmtree(self.temp_dir, ignore_errors=True)

    def test_before_request_handler(self):
        """Test Flask before_request integration"""
        with self.client:
            # Make a request
            response = self.client.get('/')

            # A session should have been created
            with self.client.session_transaction() as sess:
                self.assertIn('session_id', sess)


if __name__ == '__main__':
    unittest.main()


@@ -1,146 +0,0 @@
#!/usr/bin/env python3
"""
Test script for request size limits
"""
import requests
import json
import io
import os

BASE_URL = "http://localhost:5005"


def test_json_size_limit():
    """Test the JSON payload size limit"""
    print("\n=== Testing JSON Size Limit ===")

    # Create a large JSON payload (over 1MB)
    large_data = {
        "text": "x" * (2 * 1024 * 1024),  # 2MB of text
        "source_lang": "English",
        "target_lang": "Spanish"
    }
    try:
        response = requests.post(f"{BASE_URL}/translate", json=large_data)
        print(f"Status: {response.status_code}")
        if response.status_code == 413:
            print(f"✓ Correctly rejected large JSON: {response.json()}")
        else:
            print("✗ Should have rejected large JSON")
    except Exception as e:
        print(f"Error: {e}")


def test_audio_size_limit():
    """Test the audio file size limit"""
    print("\n=== Testing Audio Size Limit ===")

    # Create a fake large audio file (over 25MB)
    large_audio = io.BytesIO(b"x" * (30 * 1024 * 1024))  # 30MB
    files = {
        'audio': ('large_audio.wav', large_audio, 'audio/wav')
    }
    data = {
        'source_lang': 'English'
    }
    try:
        response = requests.post(f"{BASE_URL}/transcribe", files=files, data=data)
        print(f"Status: {response.status_code}")
        if response.status_code == 413:
            print(f"✓ Correctly rejected large audio: {response.json()}")
        else:
            print("✗ Should have rejected large audio")
    except Exception as e:
        print(f"Error: {e}")


def test_valid_requests():
    """Test that valid-sized requests are accepted"""
    print("\n=== Testing Valid Size Requests ===")

    # Small JSON payload
    small_data = {
        "text": "Hello world",
        "source_lang": "English",
        "target_lang": "Spanish"
    }
    try:
        response = requests.post(f"{BASE_URL}/translate", json=small_data)
        print(f"Small JSON - Status: {response.status_code}")
        if response.status_code != 413:
            print("✓ Small JSON accepted")
        else:
            print("✗ Small JSON should be accepted")
    except Exception as e:
        print(f"Error: {e}")

    # Small audio file
    small_audio = io.BytesIO(b"RIFF" + b"x" * 1000)  # 1KB fake WAV
    files = {
        'audio': ('small_audio.wav', small_audio, 'audio/wav')
    }
    data = {
        'source_lang': 'English'
    }
    try:
        response = requests.post(f"{BASE_URL}/transcribe", files=files, data=data)
        print(f"Small audio - Status: {response.status_code}")
        if response.status_code != 413:
            print("✓ Small audio accepted")
        else:
            print("✗ Small audio should be accepted")
    except Exception as e:
        print(f"Error: {e}")


def test_admin_endpoints():
    """Test admin endpoints for size limits"""
    print("\n=== Testing Admin Endpoints ===")
    headers = {'X-Admin-Token': os.environ.get('ADMIN_TOKEN', 'default-admin-token')}

    # Get the current limits
    try:
        response = requests.get(f"{BASE_URL}/admin/size-limits", headers=headers)
        print(f"Get limits - Status: {response.status_code}")
        if response.status_code == 200:
            limits = response.json()
            print(f"✓ Current limits: {limits['limits_human']}")
        else:
            print(f"✗ Failed to get limits: {response.text}")
    except Exception as e:
        print(f"Error: {e}")

    # Update the limits
    new_limits = {
        "max_audio_size": "30MB",
        "max_json_size": 2097152  # 2MB in bytes
    }
    try:
        response = requests.post(f"{BASE_URL}/admin/size-limits",
                                 json=new_limits, headers=headers)
        print(f"\nUpdate limits - Status: {response.status_code}")
        if response.status_code == 200:
            result = response.json()
            print(f"✓ Updated limits: {result['new_limits_human']}")
        else:
            print(f"✗ Failed to update limits: {response.text}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    print("Request Size Limit Tests")
    print("========================")
    print(f"Testing against: {BASE_URL}")
    print("\nMake sure the Flask app is running on port 5005")
    input("\nPress Enter to start tests...")

    test_valid_requests()
    test_json_size_limit()
    test_audio_size_limit()
    test_admin_endpoints()

    print("\n✅ All tests completed!")


@@ -1,78 +0,0 @@
#!/usr/bin/env python
"""
TTS Debug Script - Tests the connection to the OpenAI TTS server
"""
import os
import sys
import json
import requests
from argparse import ArgumentParser


def test_tts_connection(server_url, api_key, text="Hello, this is a test message"):
    """Test the connection to the TTS server"""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    payload = {
        "input": text,
        "voice": "echo",
        "response_format": "mp3",
        "speed": 1.0
    }
    print(f"Sending request to: {server_url}")
    print(f"Headers: {headers}")
    print(f"Payload: {json.dumps(payload, indent=2)}")

    try:
        response = requests.post(
            server_url,
            headers=headers,
            json=payload,
            timeout=15
        )
        print(f"Response status code: {response.status_code}")
        if response.status_code == 200:
            print("Success! Received audio data")

            # Save the audio to a file
            output_file = "tts_test_output.mp3"
            with open(output_file, "wb") as f:
                f.write(response.content)
            print(f"Saved audio to {output_file}")
            return True
        else:
            print("Error in response")
            try:
                error_data = response.json()
                print(f"Error details: {json.dumps(error_data, indent=2)}")
            except ValueError:
                # Body was not valid JSON - show the raw text instead
                print(f"Raw response: {response.text[:500]}")
            return False
    except Exception as e:
        print(f"Error during request: {e}")
        return False


def main():
    parser = ArgumentParser(description="Test connection to an OpenAI TTS server")
    parser.add_argument("--url", default="http://localhost:5050/v1/audio/speech", help="TTS server URL")
    parser.add_argument("--key", default=os.environ.get("TTS_API_KEY", ""), help="API key")
    parser.add_argument("--text", default="Hello, this is a test message", help="Text to synthesize")
    args = parser.parse_args()

    if not args.key:
        print("Error: API key is required. Use the --key argument or set the TTS_API_KEY environment variable.")
        return 1

    success = test_tts_connection(args.url, args.key, args.text)
    return 0 if success else 1


if __name__ == "__main__":
    sys.exit(main())

Binary file not shown.