talk2me/DATABASE_INTEGRATION.md
Adolfo Delorenzo fa951c3141 Add comprehensive database integration, authentication, and admin dashboard
This commit introduces major enhancements to Talk2Me:

## Database Integration
- PostgreSQL support with SQLAlchemy ORM
- Redis integration for caching and real-time analytics
- Automated database initialization scripts
- Migration support infrastructure

## User Authentication System
- JWT-based API authentication
- Session-based web authentication
- API key authentication for programmatic access
- User roles and permissions (admin/user)
- Login history and session tracking
- Rate limiting per user with customizable limits

## Admin Dashboard
- Real-time analytics and monitoring
- User management interface (create, edit, delete users)
- System health monitoring
- Request/error tracking
- Language pair usage statistics
- Performance metrics visualization

## Key Features
- Dual authentication support (token + user accounts)
- Graceful fallback for missing services
- Non-blocking analytics middleware
- Comprehensive error handling
- Session management with security features

## Bug Fixes
- Fixed rate limiting bypass for admin routes
- Added missing email validation method
- Improved error handling for missing database tables
- Fixed session-based authentication for API endpoints

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 18:21:56 -06:00

302 lines
6.2 KiB
Markdown

# Database Integration Guide
This guide explains the Redis and PostgreSQL integration for the Talk2Me application.
## Overview
The Talk2Me application now uses:
- **PostgreSQL**: For persistent storage of translations, transcriptions, user preferences, and analytics
- **Redis**: For caching, session management, and rate limiting
## Architecture
### PostgreSQL Database Schema
1. **translations** - Stores translation history
- Source and target text
- Languages
- Translation time and model used
- Session and user tracking
2. **transcriptions** - Stores transcription history
- Transcribed text
- Detected language
- Audio metadata
- Performance metrics
3. **user_preferences** - Stores user settings
- Preferred languages
- Voice preferences
- Usage statistics
4. **usage_analytics** - Aggregated analytics
- Hourly and daily metrics
- Service performance
- Language pair statistics
5. **api_keys** - API key management
- Rate limits
- Permissions
- Usage tracking
### Redis Usage
1. **Translation Cache**
- Key: `translation:{source_lang}:{target_lang}:{text_hash}`
- Expires: 24 hours
- Reduces API calls to Ollama
2. **Session Management**
- Key: `session:{session_id}`
- Stores session data and resources
- Expires: 1 hour (configurable)
3. **Rate Limiting**
- Token bucket implementation
- Per-client and global limits
- Sliding window tracking
4. **Push Subscriptions**
- Set: `push_subscriptions`
- Individual subscriptions: `push_subscription:{id}`
## Setup Instructions
### Prerequisites
1. Install PostgreSQL:
```bash
# Ubuntu/Debian
sudo apt-get install postgresql postgresql-contrib
# MacOS
brew install postgresql
```
2. Install Redis:
```bash
# Ubuntu/Debian
sudo apt-get install redis-server
# MacOS
brew install redis
```
3. Install Python dependencies:
```bash
pip install -r requirements.txt
```
### Quick Setup
Run the setup script:
```bash
./setup_databases.sh
```
### Manual Setup
1. Create PostgreSQL database:
```bash
createdb talk2me
```
2. Start Redis:
```bash
redis-server
```
3. Create .env file with database URLs:
```env
DATABASE_URL=postgresql://username@localhost/talk2me
REDIS_URL=redis://localhost:6379/0
```
4. Initialize database:
```bash
python database_init.py
```
5. Run migrations:
```bash
python migrations.py init
python migrations.py create "Initial migration"
python migrations.py run
```
## Configuration
### Environment Variables
```env
# PostgreSQL
DATABASE_URL=postgresql://username:password@host:port/database
SQLALCHEMY_DATABASE_URI=${DATABASE_URL}
SQLALCHEMY_ENGINE_OPTIONS_POOL_SIZE=10
SQLALCHEMY_ENGINE_OPTIONS_POOL_RECYCLE=3600
# Redis
REDIS_URL=redis://localhost:6379/0
REDIS_DECODE_RESPONSES=false
REDIS_MAX_CONNECTIONS=50
REDIS_SOCKET_TIMEOUT=5
# Session Management
MAX_SESSION_DURATION=3600
MAX_SESSION_IDLE_TIME=900
MAX_RESOURCES_PER_SESSION=100
MAX_BYTES_PER_SESSION=104857600
```
## Migration from In-Memory to Database
### What Changed
1. **Rate Limiting**
- Before: In-memory dictionaries
- After: Redis sorted sets and hashes
2. **Session Management**
- Before: In-memory session storage
- After: Redis with automatic expiration
3. **Translation Cache**
- Before: Client-side IndexedDB only
- After: Server-side Redis cache + client cache
4. **Analytics**
- Before: No persistent analytics
- After: PostgreSQL aggregated metrics
### Migration Steps
1. Backup current app.py:
```bash
cp app.py app_backup.py
```
2. Use the new app with database support:
```bash
cp app_with_db.py app.py
```
3. Update any custom configurations in the new app.py
## API Changes
### New Endpoints
- `/api/history/translations` - Get translation history
- `/api/history/transcriptions` - Get transcription history
- `/api/preferences` - Get/update user preferences
- `/api/analytics` - Get usage analytics
### Enhanced Features
1. **Translation Caching**
- Automatic server-side caching
- Reduced response time for repeated translations
2. **Session Persistence**
- Sessions survive server restarts
- Better resource tracking
3. **Improved Rate Limiting**
- Distributed rate limiting across multiple servers
- More accurate tracking
## Performance Considerations
1. **Database Indexes**
- Indexes on session_id, user_id, languages
- Composite indexes for common queries
2. **Redis Memory Usage**
- Monitor with: `redis-cli info memory`
- Configure maxmemory policy
3. **Connection Pooling**
- PostgreSQL: 10 connections default
- Redis: 50 connections default
## Monitoring
### PostgreSQL
```sql
-- Check database size
SELECT pg_database_size('talk2me');
-- Active connections
SELECT count(*) FROM pg_stat_activity;
-- Slow queries
SELECT * FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;
```
### Redis
```bash
# Memory usage
redis-cli info memory
# Connected clients
redis-cli info clients
# Monitor commands
redis-cli monitor
```
## Troubleshooting
### Common Issues
1. **PostgreSQL Connection Failed**
- Check if PostgreSQL is running: `sudo systemctl status postgresql`
- Verify DATABASE_URL in .env
- Check pg_hba.conf for authentication
2. **Redis Connection Failed**
- Check if Redis is running: `redis-cli ping`
- Verify REDIS_URL in .env
- Check Redis logs: `sudo journalctl -u redis`
3. **Migration Errors**
- Drop and recreate database if needed
- Check migration files in `migrations/`
- Run `python migrations.py init` to reinitialize
## Backup and Restore
### PostgreSQL Backup
```bash
# Backup
pg_dump talk2me > talk2me_backup.sql
# Restore
psql talk2me < talk2me_backup.sql
```
### Redis Backup
```bash
# Backup (if persistence enabled)
redis-cli BGSAVE
# Copy dump.rdb file
cp /var/lib/redis/dump.rdb redis_backup.rdb
```
## Security Notes
1. **Database Credentials**
- Never commit .env file
- Use strong passwords
- Limit database user permissions
2. **Redis Security**
- Consider enabling Redis AUTH
- Bind to localhost only
- Use SSL for remote connections
3. **Data Privacy**
- Translations/transcriptions contain user data
- Implement data retention policies
- Consider encryption at rest