# Talk2Me - Real-Time Voice Language Translator

A production-ready, mobile-friendly web application that provides real-time translation of spoken language between multiple languages.

## Features

- **Real-time Speech Recognition**: Powered by OpenAI Whisper with GPU acceleration
- **Advanced Translation**: Open-source Gemma LLMs served via Ollama (Gemma 2 for low-latency streaming, Gemma 3 for quality)
- **Natural Text-to-Speech**: OpenAI Edge TTS for lifelike voice output
- **Progressive Web App**: Full offline support with service workers
- **Multi-Speaker Support**: Track and translate conversations with multiple participants
- **Enterprise Security**: Comprehensive rate limiting, session management, and encrypted secrets
- **Production Ready**: Docker support, load balancing, and extensive monitoring
- **Admin Dashboard**: Real-time analytics, performance monitoring, and system health tracking

## Table of Contents

- [Supported Languages](#supported-languages)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Security Features](#security-features)
- [Production Deployment](#production-deployment)
- [API Documentation](#api-documentation)
- [Admin Dashboard](#admin-dashboard)
- [Development](#development)
- [Monitoring & Operations](#monitoring--operations)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)

## Supported Languages

- Arabic
- Armenian
- Azerbaijani
- English
- Farsi
- French
- Georgian
- Kazakh
- Mandarin
- Portuguese
- Russian
- Spanish
- Turkish
- Uzbek

## Quick Start

```bash
# Clone the repository
git clone https://github.com/yourusername/talk2me.git
cd talk2me

# Install dependencies
pip install -r requirements.txt
npm install

# Initialize secure configuration
python manage_secrets.py init
python manage_secrets.py set TTS_API_KEY your-api-key-here

# Ensure Ollama is running with Gemma
ollama pull gemma2:9b
ollama pull gemma3:27b

# Start the application
python app.py
```

Open your browser and navigate to `http://localhost:5005`.

## Installation

### Prerequisites

- Python 3.8+
- Node.js 14+
- Ollama (for LLM translation)
- OpenAI Edge TTS server
- Optional: NVIDIA GPU with CUDA, AMD GPU with ROCm, or Apple Silicon

### Detailed Setup

1. **Install Python dependencies**:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

2. **Install Node.js dependencies**:

   ```bash
   npm install
   npm run build  # Build TypeScript files
   ```

3. **Configure GPU support** (optional):

   ```bash
   # For NVIDIA GPUs
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

   # For AMD GPUs (ROCm)
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

   # For Apple Silicon
   pip install torch torchvision torchaudio
   ```

4. **Set up Ollama**:

   ```bash
   # Install Ollama (https://ollama.ai)
   curl -fsSL https://ollama.ai/install.sh | sh

   # Pull required models
   ollama pull gemma2:9b   # Faster, for streaming
   ollama pull gemma3:27b  # Better quality
   ```

5. **Configure the TTS server**: ensure your OpenAI Edge TTS server is running. By default it is expected at `http://localhost:5050`.

## Configuration

### Environment Variables

Talk2Me uses encrypted secrets management for sensitive configuration. You can use either the secure secrets system or traditional environment variables.
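Whichever mechanism you use, each setting resolves to a single value at startup. The sketch below illustrates one plausible precedence order (encrypted secret, then environment variable, then built-in default); `resolve_setting` and the plain `secrets` dict are illustrative only, not the actual API of `manage_secrets.py` or `config.py`:

```python
import os
from typing import Optional

def resolve_setting(name: str, secrets: dict, default: Optional[str] = None) -> Optional[str]:
    """Resolve a setting: encrypted secrets win over env vars, which win over defaults."""
    if name in secrets:  # value stored via `manage_secrets.py set NAME`
        return secrets[name]
    return os.environ.get(name, default)  # falls back to .env / shell environment

# Hypothetical usage; real secrets are stored encrypted on disk, not in a dict:
secrets = {"TTS_API_KEY": "your-api-key-here"}
tts_url = resolve_setting("TTS_SERVER_URL", secrets,
                          default="http://localhost:5050/v1/audio/speech")
```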
#### Using Secure Secrets Management (Recommended)

```bash
# Initialize the secrets system
python manage_secrets.py init

# Set required secrets
python manage_secrets.py set TTS_API_KEY
python manage_secrets.py set TTS_SERVER_URL
python manage_secrets.py set ADMIN_TOKEN

# List all secrets
python manage_secrets.py list

# Rotate encryption keys
python manage_secrets.py rotate
```

#### Using Environment Variables

Create a `.env` file:

```env
# Core Configuration
TTS_API_KEY=your-api-key-here
TTS_SERVER_URL=http://localhost:5050/v1/audio/speech
ADMIN_TOKEN=your-secure-admin-token

# CORS Configuration
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
ADMIN_CORS_ORIGINS=https://admin.yourdomain.com

# Security Settings
SECRET_KEY=your-secret-key-here
MAX_CONTENT_LENGTH=52428800  # 50MB
SESSION_LIFETIME=3600        # 1 hour
RATE_LIMIT_STORAGE_URL=redis://localhost:6379/0

# Performance Tuning
WHISPER_MODEL_SIZE=base
GPU_MEMORY_THRESHOLD_MB=2048
MEMORY_CLEANUP_INTERVAL=30
```

### Advanced Configuration

#### CORS Settings

```bash
# Development (allow all origins)
export CORS_ORIGINS="*"

# Production (restrict to specific domains)
export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com"
```

#### Rate Limiting

Configure per-endpoint rate limits:

```python
# In your config or via the admin API
RATE_LIMITS = {
    'default':    {'requests_per_minute': 30, 'requests_per_hour': 500},
    'transcribe': {'requests_per_minute': 10, 'requests_per_hour': 100},
    'translate':  {'requests_per_minute': 20, 'requests_per_hour': 300}
}
```

#### Session Management

```python
SESSION_CONFIG = {
    'max_file_size_mb': 100,
    'max_files_per_session': 100,
    'idle_timeout_minutes': 15,
    'max_lifetime_minutes': 60
}
```

## Security Features

### 1. Rate Limiting

Comprehensive DoS protection with:

- Token bucket algorithm with sliding window (see the sketch after this section)
- Per-endpoint configurable limits
- Automatic IP blocking for abusive clients
- Request size validation

```bash
# Check rate limit status
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/rate-limits

# Block an IP
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"ip": "192.168.1.100", "duration": 3600}' \
  http://localhost:5005/admin/block-ip
```
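To make the token-bucket bullet concrete, here is a minimal standalone sketch; Talk2Me's actual limiter additionally layers a sliding window, per-endpoint configuration, and IP blocking on top of this idea:

```python
import time

class TokenBucket:
    """Allow `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond with HTTP 429

# 30 requests/minute with a burst of 5, matching the 'default' limit above
bucket = TokenBucket(rate=30 / 60, capacity=5)
```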
### 2. Secrets Management

- AES-128 encryption for sensitive data
- Automatic key rotation
- Audit logging
- Platform-specific secure storage

```bash
# View audit log
python manage_secrets.py audit

# Backup secrets
python manage_secrets.py export --output backup.enc

# Restore from backup
python manage_secrets.py import --input backup.enc
```

### 3. Session Management

- Automatic resource tracking
- Per-session limits (100 files, 100MB)
- Idle session cleanup (15 minutes)
- Real-time monitoring

```bash
# View active sessions
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/sessions

# Clean up specific session
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
  http://localhost:5005/admin/sessions/SESSION_ID/cleanup
```

### 4. Request Size Limits

- Global limit: 50MB
- Audio files: 25MB
- JSON payloads: 1MB
- Dynamic configuration

```bash
# Update size limits
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"max_audio_size": "30MB"}' \
  http://localhost:5005/admin/size-limits
```

## Production Deployment

### Docker Deployment

```bash
# Build and run with Docker Compose (CPU only)
docker-compose up -d

# With NVIDIA GPU support
docker-compose -f docker-compose.yml -f docker-compose.nvidia.yml up -d

# With AMD GPU support (ROCm)
docker-compose -f docker-compose.yml -f docker-compose.amd.yml up -d

# With Apple Silicon support
docker-compose -f docker-compose.yml -f docker-compose.apple.yml up -d

# Scale web workers
docker-compose up -d --scale talk2me=4

# View logs
docker-compose logs -f talk2me
```

### Docker Compose Configuration

Choose the appropriate configuration for your hardware. The service name (`talk2me`) matches the scaling and logging commands above.

#### NVIDIA GPU Configuration

```yaml
version: '3.8'
services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

#### AMD GPU Configuration (ROCm)

```yaml
version: '3.8'
services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - HSA_OVERRIDE_GFX_VERSION=10.3.0  # Adjust for your GPU
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    devices:
      - /dev/kfd  # ROCm KFD interface
      - /dev/dri  # Direct Rendering Interface
    group_add:
      - video
      - render
    deploy:
      resources:
        limits:
          memory: 4G
```

#### Apple Silicon Configuration

```yaml
version: '3.8'
services:
  talk2me:
    build: .
    platform: linux/arm64/v8  # For M1/M2 Macs
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - PYTORCH_ENABLE_MPS_FALLBACK=1  # Enable MPS fallback
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G
```

#### CPU-Only Configuration

```yaml
version: '3.8'
services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - OMP_NUM_THREADS=4  # OpenMP threads for CPU inference
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '4.0'
```

### Nginx Configuration

```nginx
upstream talk2me {
    least_conn;
    server web1:5005 weight=1 max_fails=3 fail_timeout=30s;
    server web2:5005 weight=1 max_fails=3 fail_timeout=30s;
}

server {
    listen 443 ssl http2;
    server_name talk2me.yourdomain.com;

    ssl_certificate /etc/ssl/certs/talk2me.crt;
    ssl_certificate_key /etc/ssl/private/talk2me.key;

    client_max_body_size 50M;

    location / {
        proxy_pass http://talk2me;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # Cache static assets
    location /static/ {
        alias /app/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }
}
```

### Systemd Service

```ini
[Unit]
Description=Talk2Me Translation Service
After=network.target

[Service]
Type=notify
User=talk2me
Group=talk2me
WorkingDirectory=/opt/talk2me
Environment="PATH=/opt/talk2me/venv/bin"
ExecStart=/opt/talk2me/venv/bin/gunicorn \
    --config gunicorn_config.py \
    --bind 0.0.0.0:5005 \
    app:app
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

## API Documentation

### Core Endpoints

#### Transcribe Audio

```http
POST /transcribe
Content-Type: multipart/form-data

audio: (binary)
source_lang: auto|language_code
```

#### Translate Text

```http
POST /translate
Content-Type: application/json

{
  "text": "Hello world",
  "source_lang": "English",
  "target_lang": "Spanish"
}
```

#### Streaming Translation

```http
POST /translate/stream
Content-Type: application/json

{
  "text": "Long text to translate",
  "source_lang": "auto",
  "target_lang": "French"
}

Response: Server-Sent Events stream
```
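A minimal way to consume the stream from Python is sketched below. The `data:` line handling follows standard SSE framing; the payload shape inside each event is server-defined, so treat it as an assumption to verify:

```python
import requests

resp = requests.post(
    "http://localhost:5005/translate/stream",
    json={"text": "Long text to translate",
          "source_lang": "auto",
          "target_lang": "French"},
    stream=True,  # keep the connection open and read events as they arrive
)
resp.raise_for_status()

for line in resp.iter_lines(decode_unicode=True):
    # SSE frames arrive as "data: <payload>" lines separated by blank lines
    if line and line.startswith("data:"):
        print(line[len("data:"):].strip())  # payload shape depends on the server
```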
#### Text-to-Speech

```http
POST /speak
Content-Type: application/json

{
  "text": "Hola mundo",
  "language": "Spanish"
}
```
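For a quick end-to-end check of the core endpoints above, a small `requests`-based client is sketched below; the audio file name is a placeholder, and the response schemas are printed rather than assumed:

```python
import requests

BASE_URL = "http://localhost:5005"

# Transcribe a local recording (multipart upload, as documented above)
with open("recording.webm", "rb") as f:  # placeholder file name
    resp = requests.post(f"{BASE_URL}/transcribe",
                         files={"audio": f},
                         data={"source_lang": "auto"})
resp.raise_for_status()
print(resp.json())  # response schema is server-defined; inspect rather than assume keys

# Translate text (JSON body, as documented above)
resp = requests.post(f"{BASE_URL}/translate", json={
    "text": "Hello world",
    "source_lang": "English",
    "target_lang": "Spanish",
})
resp.raise_for_status()
print(resp.json())
```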
### Admin Endpoints

All admin endpoints require the `X-Admin-Token` header.

#### Health & Monitoring

- `GET /health` - Basic health check
- `GET /health/detailed` - Component status
- `GET /metrics` - Prometheus metrics
- `GET /admin/memory` - Memory usage stats

#### Session Management

- `GET /admin/sessions` - List active sessions
- `GET /admin/sessions/:id` - Session details
- `POST /admin/sessions/:id/cleanup` - Manual cleanup

#### Security Controls

- `GET /admin/rate-limits` - View rate limits
- `POST /admin/block-ip` - Block IP address
- `GET /admin/logs/security` - Security events

## Admin Dashboard

Talk2Me includes a comprehensive admin analytics dashboard for monitoring and managing the application.

### Features

- **Real-time Analytics**: Monitor requests, active sessions, and error rates
- **Performance Metrics**: Track response times, throughput, and resource usage
- **System Health**: Monitor Redis, PostgreSQL, and ML services status
- **Language Analytics**: View popular language pairs and usage patterns
- **Error Analysis**: Detailed error tracking with types and trends
- **Data Export**: Download analytics data in JSON format

### Setup

1. **Initialize the database**:

   ```bash
   python init_analytics_db.py
   ```

2. **Configure the admin token**:

   ```bash
   export ADMIN_TOKEN="your-secure-admin-token"
   ```

3. **Access the dashboard**:
   - Navigate to `https://yourdomain.com/admin`
   - Enter your admin token
   - View real-time analytics

### Dashboard Sections

- **Overview Cards**: Key metrics at a glance
- **Request Volume**: Visualize traffic patterns
- **Operations**: Translation and transcription statistics
- **Performance**: Response time percentiles (P95, P99)
- **Error Tracking**: Error types and recent issues
- **System Health**: Component status monitoring

For detailed admin documentation, see [ADMIN_DASHBOARD.md](ADMIN_DASHBOARD.md).

## Development

### TypeScript Development

```bash
# Install dependencies
npm install

# Development mode with auto-compilation
npm run dev

# Build for production
npm run build

# Type checking
npm run typecheck
```

### Project Structure

```
talk2me/
├── app.py                 # Main Flask application
├── config.py              # Configuration management
├── requirements.txt       # Python dependencies
├── package.json           # Node.js dependencies
├── tsconfig.json          # TypeScript configuration
├── gunicorn_config.py     # Production server config
├── docker-compose.yml     # Container orchestration
├── static/
│   ├── js/
│   │   ├── src/           # TypeScript source files
│   │   └── dist/          # Compiled JavaScript
│   ├── css/               # Stylesheets
│   └── icons/             # PWA icons
├── templates/             # HTML templates
├── logs/                  # Application logs
└── tests/                 # Test suite
```

### Key Components

1. **Connection Management** (`connectionManager.ts`)
   - Automatic retry with exponential backoff
   - Request queuing during offline periods
   - Connection status monitoring

2. **Translation Cache** (`translationCache.ts`)
   - IndexedDB for offline support
   - LRU eviction policy
   - Automatic cache size management

3. **Speaker Management** (`speakerManager.ts`)
   - Multi-speaker conversation tracking
   - Speaker-specific audio handling
   - Conversation export functionality

4. **Error Handling** (`errorBoundary.ts`)
   - Global error catching
   - Automatic error reporting
   - User-friendly error messages

### Running Tests

```bash
# Python tests
pytest tests/ -v

# TypeScript tests
npm test

# Integration tests
python test_integration.py
```

## Monitoring & Operations

### Logging System

Talk2Me uses structured JSON logging with multiple streams:

```
logs/
├── talk2me.log      # General application log
├── errors.log       # Error-specific log
├── access.log       # HTTP access log
├── security.log     # Security events
└── performance.log  # Performance metrics
```

View logs:

```bash
# Recent errors
tail -f logs/errors.log | jq '.'

# Security events
grep "rate_limit_exceeded" logs/security.log | jq '.'

# Slow requests
jq 'select(.extra_fields.duration_ms > 1000)' logs/performance.log
```

### Memory Management

Talk2Me includes comprehensive memory leak prevention:

1. **Backend Memory Management**
   - GPU memory monitoring (see the sketch after this list)
   - Automatic model reloading
   - Temporary file cleanup

2. **Frontend Memory Management**
   - Audio blob cleanup
   - WebRTC resource management
   - Event listener cleanup
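As an illustration of the GPU-monitoring idea, a minimal cleanup routine might look like the sketch below; the function name and trigger logic are assumptions, with the threshold mirroring `GPU_MEMORY_THRESHOLD_MB` from the configuration section:

```python
import gc
import torch

GPU_MEMORY_THRESHOLD_MB = 2048  # mirrors the env var in the configuration section

def maybe_cleanup_gpu() -> None:
    """Free cached GPU memory when usage crosses the configured threshold."""
    if not torch.cuda.is_available():
        return
    used_mb = torch.cuda.memory_allocated() / (1024 ** 2)
    if used_mb > GPU_MEMORY_THRESHOLD_MB:
        gc.collect()
        torch.cuda.empty_cache()  # return cached blocks to the driver
```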
Monitor memory:

```bash
# Check memory stats
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/memory

# Trigger manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
  http://localhost:5005/admin/memory/cleanup
```

### Performance Tuning

#### GPU Optimization

```python
# config.py or environment
GPU_OPTIMIZATIONS = {
    'enabled': True,
    'fp16': True,        # Half precision for 2x speedup
    'batch_size': 1,     # Adjust based on GPU memory
    'num_workers': 2,    # Parallel data loading
    'pin_memory': True   # Faster GPU transfer
}
```

#### Whisper Optimization

```python
TRANSCRIBE_OPTIONS = {
    'beam_size': 1,   # Faster inference
    'best_of': 1,     # Disable multiple attempts
    'temperature': 0, # Deterministic output
    'compression_ratio_threshold': 2.4,
    'logprob_threshold': -1.0,
    'no_speech_threshold': 0.6
}
```
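These settings map onto keyword arguments of the `openai-whisper` package's `transcribe()` method; the sketch below shows how they would be consumed (the audio path and model loading are illustrative, not necessarily how `app.py` wires it up):

```python
import whisper

# Load once at startup; WHISPER_MODEL_SIZE=base in the configuration above
model = whisper.load_model("base")

TRANSCRIBE_OPTIONS = {
    'beam_size': 1,
    'best_of': 1,
    'temperature': 0,
    'compression_ratio_threshold': 2.4,
    'logprob_threshold': -1.0,
    'no_speech_threshold': 0.6,
}

# transcribe() accepts these as keyword arguments
result = model.transcribe("audio.webm", **TRANSCRIBE_OPTIONS)  # placeholder path
print(result["text"])
```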
### Scaling Considerations

1. **Horizontal Scaling**
   - Use Redis for shared rate limiting
   - Configure sticky sessions for WebSocket
   - Share audio files via object storage

2. **Vertical Scaling**
   - Increase worker processes
   - Tune thread pool size
   - Allocate more GPU memory

3. **Caching Strategy**
   - Cache translations in Redis
   - Use a CDN for static assets
   - Enable HTTP caching headers

## Troubleshooting

### Common Issues

#### GPU Not Detected

```bash
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Check GPU memory
nvidia-smi

# For AMD GPUs
rocm-smi

# For Apple Silicon
python -c "import torch; print(torch.backends.mps.is_available())"
```

#### High Memory Usage

```bash
# Check for memory leaks
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/health/storage

# Manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
  http://localhost:5005/admin/cleanup
```

#### CORS Issues

```bash
# Test CORS configuration
curl -X OPTIONS http://localhost:5005/transcribe \
  -H "Origin: https://yourdomain.com" \
  -H "Access-Control-Request-Method: POST"
```

#### TTS Server Connection

```bash
# Check TTS server status
curl http://localhost:5005/check_tts_server

# Update TTS configuration
curl -X POST http://localhost:5005/update_tts_config \
  -H "Content-Type: application/json" \
  -d '{"server_url": "http://localhost:5050/v1/audio/speech", "api_key": "new-key"}'
```

### Debug Mode

Enable debug logging:

```bash
export FLASK_ENV=development
export LOG_LEVEL=DEBUG
python app.py
```

### Performance Profiling

```bash
# Enable performance logging
export ENABLE_PROFILING=true

# View slow requests
jq 'select(.duration_ms > 1000)' logs/performance.log
```

## Contributing

We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.

### Development Setup

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`pytest && npm test`)
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

### Code Style

- Python: Follow PEP 8
- TypeScript: Use the ESLint configuration
- Commit messages: Use conventional commits

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- OpenAI Whisper team for the amazing speech recognition model
- Ollama team for making LLMs accessible
- All contributors who have helped improve Talk2Me

## Support

- **Documentation**: Full docs at [docs.talk2me.app](https://docs.talk2me.app)
- **Issues**: [GitHub Issues](https://github.com/yourusername/talk2me/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/talk2me/discussions)
- **Security**: Please report security vulnerabilities to security@talk2me.app