# Talk2Me - Real-Time Voice Language Translator
A production-ready, mobile-friendly web application that provides real-time translation of speech between multiple languages.
## Features
- **Real-time Speech Recognition**: Powered by OpenAI Whisper with GPU acceleration
- **Advanced Translation**: Using Gemma 3 open-source LLM via Ollama
- **Natural Text-to-Speech**: OpenAI Edge TTS for lifelike voice output
- **Progressive Web App**: Full offline support with service workers
- **Multi-Speaker Support**: Track and translate conversations with multiple participants
- **Enterprise Security**: Comprehensive rate limiting, session management, and encrypted secrets
- **Production Ready**: Docker support, load balancing, and extensive monitoring
- **Admin Dashboard**: Real-time analytics, performance monitoring, and system health tracking
## Table of Contents
- [Supported Languages](#supported-languages)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Security Features](#security-features)
- [Production Deployment](#production-deployment)
- [API Documentation](#api-documentation)
- [Admin Dashboard](#admin-dashboard)
- [Development](#development)
- [Monitoring & Operations](#monitoring--operations)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
## Supported Languages
- Arabic
- Armenian
- Azerbaijani
- English
- Farsi
- French
- Georgian
- Kazakh
- Mandarin
- Portuguese
- Russian
- Spanish
- Turkish
- Uzbek
## Quick Start
```bash
# Clone the repository
git clone https://github.com/yourusername/talk2me.git
cd talk2me
# Install dependencies
pip install -r requirements.txt
npm install
# Initialize secure configuration
python manage_secrets.py init
python manage_secrets.py set TTS_API_KEY your-api-key-here
# Ensure Ollama is running with Gemma
ollama pull gemma2:9b
ollama pull gemma3:27b
# Start the application
python app.py
```
Open your browser and navigate to `http://localhost:5005`
## Installation
### Prerequisites
- Python 3.8+
- Node.js 14+
- Ollama (for LLM translation)
- OpenAI Edge TTS server
- Optional: NVIDIA GPU with CUDA, AMD GPU with ROCm, or Apple Silicon
### Detailed Setup
1. **Install Python dependencies**:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
2. **Install Node.js dependencies**:
```bash
npm install
npm run build # Build TypeScript files
```
3. **Configure GPU Support** (Optional):
```bash
# For NVIDIA GPUs
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# For AMD GPUs (ROCm)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
# For Apple Silicon
pip install torch torchvision torchaudio
```
4. **Set up Ollama**:
```bash
# Install Ollama (https://ollama.ai)
curl -fsSL https://ollama.ai/install.sh | sh
# Pull required models
ollama pull gemma2:9b # Faster, for streaming
ollama pull gemma3:27b # Better quality
```
5. **Configure TTS Server**:
Ensure your OpenAI Edge TTS server is running. By default it is expected at `http://localhost:5050`.
## Configuration
### Environment Variables
Talk2Me uses encrypted secrets management for sensitive configuration. You can use either the secure secrets system or traditional environment variables.
#### Using Secure Secrets Management (Recommended)
```bash
# Initialize the secrets system
python manage_secrets.py init
# Set required secrets
python manage_secrets.py set TTS_API_KEY
python manage_secrets.py set TTS_SERVER_URL
python manage_secrets.py set ADMIN_TOKEN
# List all secrets
python manage_secrets.py list
# Rotate encryption keys
python manage_secrets.py rotate
```
#### Using Environment Variables
Create a `.env` file:
```env
# Core Configuration
TTS_API_KEY=your-api-key-here
TTS_SERVER_URL=http://localhost:5050/v1/audio/speech
ADMIN_TOKEN=your-secure-admin-token
# CORS Configuration
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
ADMIN_CORS_ORIGINS=https://admin.yourdomain.com
# Security Settings
SECRET_KEY=your-secret-key-here
MAX_CONTENT_LENGTH=52428800 # 50MB
SESSION_LIFETIME=3600 # 1 hour
RATE_LIMIT_STORAGE_URL=redis://localhost:6379/0
# Performance Tuning
WHISPER_MODEL_SIZE=base
GPU_MEMORY_THRESHOLD_MB=2048
MEMORY_CLEANUP_INTERVAL=30
```
### Advanced Configuration
#### CORS Settings
```bash
# Development (allow all origins)
export CORS_ORIGINS="*"
# Production (restrict to specific domains)
export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com"
```
#### Rate Limiting
Configure per-endpoint rate limits:
```python
# In your config or via admin API
RATE_LIMITS = {
    'default':    {'requests_per_minute': 30, 'requests_per_hour': 500},
    'transcribe': {'requests_per_minute': 10, 'requests_per_hour': 100},
    'translate':  {'requests_per_minute': 20, 'requests_per_hour': 300},
}
```
#### Session Management
```python
SESSION_CONFIG = {
    'max_file_size_mb': 100,
    'max_files_per_session': 100,
    'idle_timeout_minutes': 15,
    'max_lifetime_minutes': 60,
}
```
## Security Features
### 1. Rate Limiting
Comprehensive DoS protection with:
- Token bucket algorithm with sliding window
- Per-endpoint configurable limits
- Automatic IP blocking for abusive clients
- Request size validation
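The limiter's internals aren't shown here, but the core idea behind a token bucket is simple; the following is a minimal illustrative sketch (class and method names are hypothetical, not Talk2Me's actual code):
```python
import time

class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate                    # tokens added per second
        self.capacity = capacity            # maximum burst size
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, consuming `cost` tokens."""
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should answer 429 Too Many Requests

# Roughly 30 requests/minute with bursts of up to 10
bucket = TokenBucket(rate=30 / 60, capacity=10)
```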
```bash
# Check rate limit status
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/rate-limits
# Block an IP
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"ip": "192.168.1.100", "duration": 3600}' \
     http://localhost:5005/admin/block-ip
```
### 2. Secrets Management
- AES-128 encryption for sensitive data
- Automatic key rotation
- Audit logging
- Platform-specific secure storage
```bash
# View audit log
python manage_secrets.py audit
# Backup secrets
python manage_secrets.py export --output backup.enc
# Restore from backup
python manage_secrets.py import --input backup.enc
```
### 3. Session Management
- Automatic resource tracking
- Per-session limits (100 files, 100MB)
- Idle session cleanup (15 minutes)
- Real-time monitoring
```bash
# View active sessions
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/sessions
# Clean up specific session
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
     http://localhost:5005/admin/sessions/SESSION_ID/cleanup
```
### 4. Request Size Limits
- Global limit: 50MB
- Audio files: 25MB
- JSON payloads: 1MB
- Dynamic configuration
```bash
# Update size limits
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"max_audio_size": "30MB"}' \
     http://localhost:5005/admin/size-limits
```
## Production Deployment
### Docker Deployment
```bash
# Build and run with Docker Compose (CPU only)
docker-compose up -d
# With NVIDIA GPU support
docker-compose -f docker-compose.yml -f docker-compose.nvidia.yml up -d
# With AMD GPU support (ROCm)
docker-compose -f docker-compose.yml -f docker-compose.amd.yml up -d
# With Apple Silicon support
docker-compose -f docker-compose.yml -f docker-compose.apple.yml up -d
# Scale web workers
docker-compose up -d --scale talk2me=4
# View logs
docker-compose logs -f talk2me
```
### Docker Compose Configuration
Choose the appropriate configuration based on your GPU:
#### NVIDIA GPU Configuration
```yaml
version: '3.8'

services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

volumes:
  whisper-cache:
```
#### AMD GPU Configuration (ROCm)
```yaml
version: '3.8'

services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - HSA_OVERRIDE_GFX_VERSION=10.3.0  # Adjust for your GPU
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    devices:
      - /dev/kfd  # ROCm KFD interface
      - /dev/dri  # Direct Rendering Interface
    group_add:
      - video
      - render
    deploy:
      resources:
        limits:
          memory: 4G

volumes:
  whisper-cache:
```
#### Apple Silicon Configuration
```yaml
version: '3.8'

services:
  talk2me:
    build: .
    platform: linux/arm64/v8  # For M1/M2 Macs
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - PYTORCH_ENABLE_MPS_FALLBACK=1  # Enable MPS fallback
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G

volumes:
  whisper-cache:
```
#### CPU-Only Configuration
```yaml
version: '3.8'

services:
  talk2me:
    build: .
    ports:
      - "5005:5005"
    environment:
      - GUNICORN_WORKERS=4
      - GUNICORN_THREADS=2
      - OMP_NUM_THREADS=4  # OpenMP threads for CPU
    volumes:
      - ./logs:/app/logs
      - whisper-cache:/root/.cache/whisper
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '4.0'

volumes:
  whisper-cache:
```
### Nginx Configuration
```nginx
upstream talk2me {
    least_conn;
    server web1:5005 weight=1 max_fails=3 fail_timeout=30s;
    server web2:5005 weight=1 max_fails=3 fail_timeout=30s;
}

server {
    listen 443 ssl http2;
    server_name talk2me.yourdomain.com;

    ssl_certificate     /etc/ssl/certs/talk2me.crt;
    ssl_certificate_key /etc/ssl/private/talk2me.key;

    client_max_body_size 50M;

    location / {
        proxy_pass http://talk2me;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;

        # WebSocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }

    # Cache static assets
    location /static/ {
        alias /app/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }
}
```
### Systemd Service
```ini
[Unit]
Description=Talk2Me Translation Service
After=network.target

[Service]
Type=notify
User=talk2me
Group=talk2me
WorkingDirectory=/opt/talk2me
Environment="PATH=/opt/talk2me/venv/bin"
ExecStart=/opt/talk2me/venv/bin/gunicorn \
    --config gunicorn_config.py \
    --bind 0.0.0.0:5005 \
    app:app
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```
## API Documentation
### Core Endpoints
#### Transcribe Audio
```http
POST /transcribe
Content-Type: multipart/form-data

audio: (binary)
source_lang: auto|language_code
```
#### Translate Text
```http
POST /translate
Content-Type: application/json

{
  "text": "Hello world",
  "source_lang": "English",
  "target_lang": "Spanish"
}
```
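As a usage sketch, the same request from Python with `requests` (the response schema isn't documented here, so inspect `resp.json()` for your deployment):
```python
import requests

resp = requests.post(
    "http://localhost:5005/translate",
    json={"text": "Hello world", "source_lang": "English", "target_lang": "Spanish"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```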
#### Streaming Translation
```http
POST /translate/stream
Content-Type: application/json

{
  "text": "Long text to translate",
  "source_lang": "auto",
  "target_lang": "French"
}

Response: Server-Sent Events stream
```
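One way to consume the stream from Python (a sketch that assumes standard `data:`-prefixed SSE frames; the exact event payload format is not specified above):
```python
import requests

with requests.post(
    "http://localhost:5005/translate/stream",
    json={"text": "Long text to translate",
          "source_lang": "auto", "target_lang": "French"},
    stream=True,
    timeout=60,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # SSE frames arrive as "data: ..." lines separated by blank lines
        if line and line.startswith("data: "):
            print(line[len("data: "):], flush=True)
```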
#### Text-to-Speech
```http
POST /speak
Content-Type: application/json

{
  "text": "Hola mundo",
  "language": "Spanish"
}
```
### Admin Endpoints
All admin endpoints require the `X-Admin-Token` header.
#### Health & Monitoring
- `GET /health` - Basic health check
- `GET /health/detailed` - Component status
- `GET /metrics` - Prometheus metrics
- `GET /admin/memory` - Memory usage stats
#### Session Management
- `GET /admin/sessions` - List active sessions
- `GET /admin/sessions/:id` - Session details
- `POST /admin/sessions/:id/cleanup` - Manual cleanup
#### Security Controls
- `GET /admin/rate-limits` - View rate limits
- `POST /admin/block-ip` - Block IP address
- `GET /admin/logs/security` - Security events
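For example, checking component status from Python (assumes `ADMIN_TOKEN` is exported in your environment):
```python
import os
import requests

resp = requests.get(
    "http://localhost:5005/health/detailed",
    headers={"X-Admin-Token": os.environ["ADMIN_TOKEN"]},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```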
## Admin Dashboard
Talk2Me includes a comprehensive admin analytics dashboard for monitoring and managing the application.
### Features
- **Real-time Analytics**: Monitor requests, active sessions, and error rates
- **Performance Metrics**: Track response times, throughput, and resource usage
- **System Health**: Monitor Redis, PostgreSQL, and ML services status
- **Language Analytics**: View popular language pairs and usage patterns
- **Error Analysis**: Detailed error tracking with types and trends
- **Data Export**: Download analytics data in JSON format
### Setup
1. **Initialize Database**:
```bash
python init_analytics_db.py
```
2. **Configure Admin Token**:
```bash
export ADMIN_TOKEN="your-secure-admin-token"
```
3. **Access Dashboard**:
- Navigate to `https://yourdomain.com/admin`
- Enter your admin token
- View real-time analytics
### Dashboard Sections
- **Overview Cards**: Key metrics at a glance
- **Request Volume**: Visualize traffic patterns
- **Operations**: Translation and transcription statistics
- **Performance**: Response time percentiles (P95, P99)
- **Error Tracking**: Error types and recent issues
- **System Health**: Component status monitoring
For detailed admin documentation, see [ADMIN_DASHBOARD.md](ADMIN_DASHBOARD.md).
## Development
### TypeScript Development
```bash
# Install dependencies
npm install
# Development mode with auto-compilation
npm run dev
# Build for production
npm run build
# Type checking
npm run typecheck
```
### Project Structure
```
talk2me/
├── app.py                # Main Flask application
├── config.py             # Configuration management
├── requirements.txt      # Python dependencies
├── package.json          # Node.js dependencies
├── tsconfig.json         # TypeScript configuration
├── gunicorn_config.py    # Production server config
├── docker-compose.yml    # Container orchestration
├── static/
│   ├── js/
│   │   ├── src/          # TypeScript source files
│   │   └── dist/         # Compiled JavaScript
│   ├── css/              # Stylesheets
│   └── icons/            # PWA icons
├── templates/            # HTML templates
├── logs/                 # Application logs
└── tests/                # Test suite
```
### Key Components
1. **Connection Management** (`connectionManager.ts`)
- Automatic retry with exponential backoff
- Request queuing during offline periods
- Connection status monitoring
2. **Translation Cache** (`translationCache.ts`)
- IndexedDB for offline support
- LRU eviction policy
- Automatic cache size management
3. **Speaker Management** (`speakerManager.ts`)
- Multi-speaker conversation tracking
- Speaker-specific audio handling
- Conversation export functionality
4. **Error Handling** (`errorBoundary.ts`)
- Global error catching
- Automatic error reporting
- User-friendly error messages
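The retry behavior described under Connection Management above follows the classic exponential-backoff pattern; here is a minimal Python sketch of the idea (illustrative only, not the actual `connectionManager.ts` code):
```python
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Call fn(), doubling the wait after each failure, up to max_delay seconds."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(min(max_delay, base_delay * 2 ** attempt))
```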
### Running Tests
```bash
# Python tests
pytest tests/ -v
# TypeScript tests
npm test
# Integration tests
python test_integration.py
```
## Monitoring & Operations
### Logging System
Talk2Me uses structured JSON logging with multiple streams:
```
logs/
├── talk2me.log       # General application log
├── errors.log        # Error-specific log
├── access.log        # HTTP access log
├── security.log      # Security events
└── performance.log   # Performance metrics
```
View logs:
```bash
# Recent errors
tail -f logs/errors.log | jq '.'
# Security events
grep "rate_limit_exceeded" logs/security.log | jq '.'
# Slow requests
jq 'select(.extra_fields.duration_ms > 1000)' logs/performance.log
```
### Memory Management
Talk2Me includes comprehensive memory leak prevention:
1. **Backend Memory Management**
- GPU memory monitoring
- Automatic model reloading
- Temporary file cleanup
2. **Frontend Memory Management**
- Audio blob cleanup
- WebRTC resource management
- Event listener cleanup
Monitor memory:
```bash
# Check memory stats
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/admin/memory
# Trigger manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
     http://localhost:5005/admin/memory/cleanup
```
### Performance Tuning
#### GPU Optimization
```python
# config.py or environment
GPU_OPTIMIZATIONS = {
    'enabled': True,
    'fp16': True,        # Half precision for 2x speedup
    'batch_size': 1,     # Adjust based on GPU memory
    'num_workers': 2,    # Parallel data loading
    'pin_memory': True,  # Faster GPU transfer
}
```
#### Whisper Optimization
```python
TRANSCRIBE_OPTIONS = {
    'beam_size': 1,    # Faster inference
    'best_of': 1,      # Disable multiple attempts
    'temperature': 0,  # Deterministic output
    'compression_ratio_threshold': 2.4,
    'logprob_threshold': -1.0,
    'no_speech_threshold': 0.6,
}
```
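If you are using the `openai-whisper` package, these keys correspond to `transcribe()` options and can be passed straight through (a sketch with `TRANSCRIBE_OPTIONS` defined as above; the model size and file path are placeholders):
```python
import whisper

model = whisper.load_model("base")  # matches WHISPER_MODEL_SIZE above
result = model.transcribe("recording.wav", **TRANSCRIBE_OPTIONS)
print(result["text"])
```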
### Scaling Considerations
1. **Horizontal Scaling**
- Use Redis for shared rate limiting
- Configure sticky sessions for WebSocket
- Share audio files via object storage
2. **Vertical Scaling**
- Increase worker processes
- Tune thread pool size
- Allocate more GPU memory
3. **Caching Strategy**
- Cache translations in Redis
- Use CDN for static assets
- Enable HTTP caching headers
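For the caching strategy, a cache-aside sketch with `redis-py` (the key scheme, TTL, and `translate_fn` are illustrative assumptions, not Talk2Me's actual implementation):
```python
import hashlib
import redis

r = redis.Redis.from_url("redis://localhost:6379/0")

def cached_translate(text, source_lang, target_lang, translate_fn, ttl=86400):
    """Cache-aside: return a cached translation or compute and store it."""
    key = "xlate:" + hashlib.sha256(
        f"{source_lang}|{target_lang}|{text}".encode()
    ).hexdigest()
    cached = r.get(key)
    if cached is not None:
        return cached.decode()
    result = translate_fn(text, source_lang, target_lang)
    r.set(key, result, ex=ttl)  # expire after `ttl` seconds
    return result
```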
## Troubleshooting
### Common Issues
#### GPU Not Detected
```bash
# Check CUDA availability
python -c "import torch; print(torch.cuda.is_available())"
# Check GPU memory
nvidia-smi
# For AMD GPUs
rocm-smi
# For Apple Silicon
python -c "import torch; print(torch.backends.mps.is_available())"
```
#### High Memory Usage
```bash
# Check for memory leaks
curl -H "X-Admin-Token: $ADMIN_TOKEN" http://localhost:5005/health/storage
# Manual cleanup
curl -X POST -H "X-Admin-Token: $ADMIN_TOKEN" \
     http://localhost:5005/admin/cleanup
```
#### CORS Issues
```bash
# Test CORS configuration
curl -X OPTIONS http://localhost:5005/api/transcribe \
     -H "Origin: https://yourdomain.com" \
     -H "Access-Control-Request-Method: POST"
```
#### TTS Server Connection
```bash
# Check TTS server status
curl http://localhost:5005/check_tts_server
# Update TTS configuration
curl -X POST http://localhost:5005/update_tts_config \
     -H "Content-Type: application/json" \
     -d '{"server_url": "http://localhost:5050/v1/audio/speech", "api_key": "new-key"}'
```
### Debug Mode
Enable debug logging:
```bash
export FLASK_ENV=development
export LOG_LEVEL=DEBUG
python app.py
```
### Performance Profiling
```bash
# Enable performance logging
export ENABLE_PROFILING=true
# View slow requests
jq 'select(.duration_ms > 1000)' logs/performance.log
```
## Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.
### Development Setup
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`pytest && npm test`)
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Code Style
- Python: Follow PEP 8
- TypeScript: Use ESLint configuration
- Commit messages: Use conventional commits
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- OpenAI Whisper team for the amazing speech recognition model
- Ollama team for making LLMs accessible
- All contributors who have helped improve Talk2Me
## Support
- **Documentation**: Full docs at [docs.talk2me.app](https://docs.talk2me.app)
- **Issues**: [GitHub Issues](https://github.com/yourusername/talk2me/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/talk2me/discussions)
- **Security**: Please report security vulnerabilities to security@talk2me.app