# Voice Language Translator

A mobile-friendly web application that translates spoken language between multiple languages using:

- Gemma 3 open-source LLM via Ollama for translation
- OpenAI Whisper for speech-to-text
- OpenAI Edge TTS for text-to-speech

## Supported Languages

- Arabic
- Armenian
- Azerbaijani
- English
- Farsi
- French
- Georgian
- Kazakh
- Mandarin
- Portuguese
- Russian
- Spanish
- Turkish
- Uzbek

## Setup Instructions

1. Install the required Python packages:

   ```bash
   pip install -r requirements.txt
   ```

2. Configure secrets and environment:

   ```bash
   # Initialize secure secrets management
   python manage_secrets.py init

   # Set required secrets
   python manage_secrets.py set TTS_API_KEY

   # Or use traditional .env file
   cp .env.example .env
   nano .env
   ```

   **⚠️ Security Note**: Talk2Me includes encrypted secrets management. See [SECURITY.md](SECURITY.md) and [SECRETS_MANAGEMENT.md](SECRETS_MANAGEMENT.md) for details.

3. Make sure you have Ollama installed and the Gemma 3 model loaded:

   ```bash
   ollama pull gemma3
   ```

4. Ensure your OpenAI Edge TTS server is running on port 5050.

5. Run the application:

   ```bash
   python app.py
   ```

6. Open your browser and navigate to:

   ```
   http://localhost:8000
   ```

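Before launching the app, you can confirm that Ollama is reachable and the Gemma 3 model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint (which lists locally available models) on its default port 11434; it is an illustration, not part of the Talk2Me codebase.

```python
import json
import urllib.request
from urllib.error import URLError

def has_model(tags_payload, name):
    """Check an /api/tags payload for a model whose name starts with `name`."""
    return any(m.get("name", "").startswith(name)
               for m in tags_payload.get("models", []))

def gemma3_available(base_url="http://localhost:11434"):
    """Return True if the Ollama server responds and lists a gemma3 model."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            return has_model(json.load(resp), "gemma3")
    except (URLError, OSError):
        return False  # server not running or unreachable

if __name__ == "__main__":
    print("gemma3 ready" if gemma3_available() else "run: ollama pull gemma3")
```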
## Usage

1. Select your source language from the dropdown menu
2. Press the microphone button and speak
3. Press the button again to stop recording
4. Wait for the transcription to complete
5. Select your target language
6. Press the "Translate" button
7. Use the play buttons to hear the original or translated text

## Technical Details

- The app uses Flask for the web server
- Audio is processed client-side using the MediaRecorder API
- OpenAI Whisper performs speech recognition, using the selected source language as a hint
- Ollama provides access to the Gemma 3 model for translation
- OpenAI Edge TTS delivers natural-sounding speech output

## CORS Configuration

The application supports Cross-Origin Resource Sharing (CORS) for secure cross-origin usage. See [CORS_CONFIG.md](CORS_CONFIG.md) for detailed configuration instructions.

Quick setup:

```bash
# Development (allow all origins)
export CORS_ORIGINS="*"

# Production (restrict to specific domains)
export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com"
```

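The app's actual CORS wiring is described in CORS_CONFIG.md; as an illustration only, here is how a comma-separated `CORS_ORIGINS` value could be parsed into the shape a CORS layer expects (the `flask_cors` wiring shown in the comment is an assumption, not the app's real code):

```python
import os

def parse_origins(raw):
    """Turn a CORS_ORIGINS value into an origins setting:
    "*" stays a wildcard string; otherwise split on commas and trim."""
    raw = raw.strip()
    if raw == "*":
        return "*"
    return [o.strip() for o in raw.split(",") if o.strip()]

# Hypothetical wiring (the app's real setup may differ):
#   from flask_cors import CORS
#   CORS(app, origins=parse_origins(os.environ.get("CORS_ORIGINS", "*")))

print(parse_origins("https://yourdomain.com, https://app.yourdomain.com"))
```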
## Connection Retry & Offline Support

Talk2Me handles network interruptions gracefully with automatic retry logic:

- Automatic request queuing during connection loss
- Exponential backoff retry with configurable parameters
- Visual connection status indicators
- Priority-based request processing

See [CONNECTION_RETRY.md](CONNECTION_RETRY.md) for detailed documentation.

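The retry logic itself lives in the client-side code; as a sketch of the exponential-backoff pattern it uses, here is a minimal Python version (the delays and attempt count are example values, not the app's actual configuration):

```python
import random
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=5.0):
    """Call fn(), retrying on exception with exponentially growing delays.
    Jitter is added so many clients don't retry in lockstep."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))

# Demo: a flaky call that succeeds on its third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.01))  # prints "ok" after two retries
```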
## Rate Limiting

Comprehensive rate limiting protects against DoS attacks and resource exhaustion:

- Token bucket algorithm with sliding window
- Per-endpoint configurable limits
- Automatic IP blocking for abusive clients
- Global request limits and concurrent request throttling
- Request size validation

See [RATE_LIMITING.md](RATE_LIMITING.md) for detailed documentation.

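To make the token-bucket algorithm named above concrete, here is a minimal sketch (the capacity and refill rate are illustrative values, not Talk2Me's configured limits):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: `capacity` tokens, refilled at `rate`
    tokens per second; each request spends one token."""
    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill proportionally to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Demo with a fake clock: bursts of 3 requests, refilling 1 token/second.
t = [0.0]
bucket = TokenBucket(capacity=3, rate=1.0, clock=lambda: t[0])
print([bucket.allow() for _ in range(4)])  # [True, True, True, False]
t[0] = 1.0  # one second later, one token has refilled
print(bucket.allow())  # True
```

Injecting the clock keeps the bucket deterministic and testable; a production limiter would also need per-client buckets and locking.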
## Session Management

Advanced session management prevents resource leaks from abandoned sessions:

- Automatic tracking of all session resources (audio files, temp files)
- Per-session resource limits (100 files, 100MB)
- Automatic cleanup of idle sessions (15 minutes) and expired sessions (1 hour)
- Real-time monitoring and metrics
- Manual cleanup capabilities for administrators

See [SESSION_MANAGEMENT.md](SESSION_MANAGEMENT.md) for detailed documentation.

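The limits above can be sketched as a small in-memory tracker. This is an illustration of the idea (per-session accounting plus idle reaping), not the app's actual implementation, which also has to delete the files on disk:

```python
import time

IDLE_LIMIT = 15 * 60        # seconds before an idle session is reaped
MAX_FILES = 100             # per-session file count limit
MAX_BYTES = 100 * 1024**2   # per-session storage limit (100MB)

class SessionTracker:
    """Minimal in-memory sketch of per-session resource accounting."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.sessions = {}  # session_id -> {"files": [(path, size)], "last_seen": t}

    def touch(self, sid):
        s = self.sessions.setdefault(sid, {"files": [], "last_seen": 0})
        s["last_seen"] = self.clock()

    def add_file(self, sid, path, size):
        self.touch(sid)
        s = self.sessions[sid]
        total = sum(sz for _, sz in s["files"])
        if len(s["files"]) >= MAX_FILES or total + size > MAX_BYTES:
            raise RuntimeError("session resource limit exceeded")
        s["files"].append((path, size))

    def cleanup_idle(self):
        """Drop sessions idle longer than IDLE_LIMIT; return reaped ids."""
        now = self.clock()
        idle = [sid for sid, s in self.sessions.items()
                if now - s["last_seen"] > IDLE_LIMIT]
        for sid in idle:
            del self.sessions[sid]  # real code would also delete the files
        return idle
```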
## Request Size Limits

Comprehensive request size limiting prevents memory exhaustion:

- Global limit: 50MB for any request
- Audio files: 25MB maximum
- JSON payloads: 1MB maximum
- File type detection and enforcement
- Dynamic configuration via admin API

See [REQUEST_SIZE_LIMITS.md](REQUEST_SIZE_LIMITS.md) for detailed documentation.

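A sketch of how the caps listed above could be enforced (the real enforcement lives in the app's middleware; this standalone function only illustrates the per-type check):

```python
GLOBAL_MAX = 50 * 1024**2   # 50MB for any request
TYPE_MAX = {
    "audio": 25 * 1024**2,  # 25MB for audio uploads
    "json": 1 * 1024**2,    # 1MB for JSON payloads
}

def check_request_size(content_length, kind=None):
    """Return (ok, reason). `kind` is 'audio', 'json', or None."""
    if content_length > GLOBAL_MAX:
        return False, "request exceeds 50MB global limit"
    cap = TYPE_MAX.get(kind)
    if cap is not None and content_length > cap:
        return False, f"{kind} payload exceeds {cap // 1024**2}MB limit"
    return True, "ok"

print(check_request_size(30 * 1024**2, "audio"))  # (False, 'audio payload exceeds 25MB limit')
```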
## Error Logging

Production-ready error logging system for debugging and monitoring:

- Structured JSON logs for easy parsing
- Multiple log streams (app, errors, access, security, performance)
- Automatic log rotation to prevent disk exhaustion
- Request tracing with unique IDs
- Performance metrics and slow request tracking
- Admin endpoints for log analysis

See [ERROR_LOGGING.md](ERROR_LOGGING.md) for detailed documentation.

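Structured JSON logging with request IDs can be sketched with the standard library alone; rotation would use `logging.handlers.RotatingFileHandler` in place of the stream handler shown here. The field names are illustrative, not the app's actual schema:

```python
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line for easy parsing."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": getattr(record, "request_id", None),
        })

logger = logging.getLogger("talk2me.sketch")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Tag a log line with a unique request id, as the request-tracing feature does.
logger.info("transcription complete", extra={"request_id": str(uuid.uuid4())})
```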
## Memory Management

Comprehensive memory leak prevention for extended use:

- GPU memory management with automatic cleanup
- Whisper model reloading to prevent fragmentation
- Frontend resource tracking (audio blobs, contexts, streams)
- Automatic cleanup of temporary files
- Memory monitoring and manual cleanup endpoints

See [MEMORY_MANAGEMENT.md](MEMORY_MANAGEMENT.md) for detailed documentation.

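A best-effort cleanup pass of the kind described above can be sketched as follows. This is an illustration, not the app's `memory_manager.py`; the PyTorch calls are guarded so the sketch also runs on CPU-only machines without `torch` installed:

```python
import gc

def cleanup_memory():
    """Best-effort cleanup pass: run the garbage collector and, when a GPU
    stack is present, release cached GPU memory. Returns objects collected."""
    collected = gc.collect()
    try:
        import torch  # only present if the Whisper/PyTorch stack is installed
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # release cached, unused GPU blocks
    except ImportError:
        pass  # CPU-only deployment: nothing further to do
    return collected

print(f"collected {cleanup_memory()} objects")
```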
## Production Deployment

For production use, deploy with a proper WSGI server:

- Gunicorn with optimized worker configuration
- Nginx reverse proxy with caching
- Docker/Docker Compose support
- Systemd service management
- Comprehensive security hardening

Quick start:

```bash
docker-compose up -d
```

See [PRODUCTION_DEPLOYMENT.md](PRODUCTION_DEPLOYMENT.md) for detailed deployment instructions.

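For a non-Docker deployment, the shape of a Gunicorn configuration looks roughly like this. The setting names are standard Gunicorn options; the values are illustrative, not the repository's actual `gunicorn_config.py`:

```python
# gunicorn_config.py (sketch): standard Gunicorn settings, illustrative values.
import multiprocessing

bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1  # common worker-sizing heuristic
worker_class = "gevent"          # or "sync"/"gthread" depending on workload
timeout = 120                    # generous timeout for audio processing
max_requests = 1000              # recycle workers to contain memory leaks
max_requests_jitter = 50         # stagger recycling across workers
accesslog = "-"                  # access log to stdout
errorlog = "-"                   # error log to stderr
```

You would then start the server with something like `gunicorn -c gunicorn_config.py wsgi:app`, assuming the WSGI entry point is exposed as `wsgi:app`.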
## Mobile Support

The interface is fully responsive and designed to work well on mobile devices.