talk2me/DATABASE_INTEGRATION.md
Adolfo Delorenzo fa951c3141 Add comprehensive database integration, authentication, and admin dashboard
This commit introduces major enhancements to Talk2Me:

## Database Integration
- PostgreSQL support with SQLAlchemy ORM
- Redis integration for caching and real-time analytics
- Automated database initialization scripts
- Migration support infrastructure

## User Authentication System
- JWT-based API authentication
- Session-based web authentication
- API key authentication for programmatic access
- User roles and permissions (admin/user)
- Login history and session tracking
- Rate limiting per user with customizable limits

## Admin Dashboard
- Real-time analytics and monitoring
- User management interface (create, edit, delete users)
- System health monitoring
- Request/error tracking
- Language pair usage statistics
- Performance metrics visualization

## Key Features
- Dual authentication support (token + user accounts)
- Graceful fallback for missing services
- Non-blocking analytics middleware
- Comprehensive error handling
- Session management with security features

## Bug Fixes
- Fixed rate limiting bypass for admin routes
- Added missing email validation method
- Improved error handling for missing database tables
- Fixed session-based authentication for API endpoints

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 18:21:56 -06:00

6.2 KiB

Database Integration Guide

This guide explains the Redis and PostgreSQL integration for the Talk2Me application.

Overview

The Talk2Me application now uses:

  • PostgreSQL: For persistent storage of translations, transcriptions, user preferences, and analytics
  • Redis: For caching, session management, and rate limiting

Architecture

PostgreSQL Database Schema

  1. translations - Stores translation history

    • Source and target text
    • Languages
    • Translation time and model used
    • Session and user tracking
  2. transcriptions - Stores transcription history

    • Transcribed text
    • Detected language
    • Audio metadata
    • Performance metrics
  3. user_preferences - Stores user settings

    • Preferred languages
    • Voice preferences
    • Usage statistics
  4. usage_analytics - Aggregated analytics

    • Hourly and daily metrics
    • Service performance
    • Language pair statistics
  5. api_keys - API key management

    • Rate limits
    • Permissions
    • Usage tracking

Redis Usage

  1. Translation Cache

    • Key: translation:{source_lang}:{target_lang}:{text_hash}
    • Expires: 24 hours
    • Reduces API calls to Ollama
  2. Session Management

    • Key: session:{session_id}
    • Stores session data and resources
    • Expires: 1 hour (configurable)
  3. Rate Limiting

    • Token bucket implementation
    • Per-client and global limits
    • Sliding window tracking
  4. Push Subscriptions

    • Set: push_subscriptions
    • Individual subscriptions: push_subscription:{id}

Setup Instructions

Prerequisites

  1. Install PostgreSQL:

    # Ubuntu/Debian
    sudo apt-get install postgresql postgresql-contrib
    
    # MacOS
    brew install postgresql
    
  2. Install Redis:

    # Ubuntu/Debian
    sudo apt-get install redis-server
    
    # MacOS
    brew install redis
    
  3. Install Python dependencies:

    pip install -r requirements.txt
    

Quick Setup

Run the setup script:

./setup_databases.sh

Manual Setup

  1. Create PostgreSQL database:

    createdb talk2me
    
  2. Start Redis:

    redis-server
    
  3. Create .env file with database URLs:

    DATABASE_URL=postgresql://username@localhost/talk2me
    REDIS_URL=redis://localhost:6379/0
    
  4. Initialize database:

    python database_init.py
    
  5. Run migrations:

    python migrations.py init
    python migrations.py create "Initial migration"
    python migrations.py run
    

Configuration

Environment Variables

# PostgreSQL
DATABASE_URL=postgresql://username:password@host:port/database
SQLALCHEMY_DATABASE_URI=${DATABASE_URL}
SQLALCHEMY_ENGINE_OPTIONS_POOL_SIZE=10
SQLALCHEMY_ENGINE_OPTIONS_POOL_RECYCLE=3600

# Redis
REDIS_URL=redis://localhost:6379/0
REDIS_DECODE_RESPONSES=false
REDIS_MAX_CONNECTIONS=50
REDIS_SOCKET_TIMEOUT=5

# Session Management
MAX_SESSION_DURATION=3600
MAX_SESSION_IDLE_TIME=900
MAX_RESOURCES_PER_SESSION=100
MAX_BYTES_PER_SESSION=104857600

Migration from In-Memory to Database

What Changed

  1. Rate Limiting

    • Before: In-memory dictionaries
    • After: Redis sorted sets and hashes
  2. Session Management

    • Before: In-memory session storage
    • After: Redis with automatic expiration
  3. Translation Cache

    • Before: Client-side IndexedDB only
    • After: Server-side Redis cache + client cache
  4. Analytics

    • Before: No persistent analytics
    • After: PostgreSQL aggregated metrics

Migration Steps

  1. Backup current app.py:

    cp app.py app_backup.py
    
  2. Use the new app with database support:

    cp app_with_db.py app.py
    
  3. Update any custom configurations in the new app.py

API Changes

New Endpoints

  • /api/history/translations - Get translation history
  • /api/history/transcriptions - Get transcription history
  • /api/preferences - Get/update user preferences
  • /api/analytics - Get usage analytics

Enhanced Features

  1. Translation Caching

    • Automatic server-side caching
    • Reduced response time for repeated translations
  2. Session Persistence

    • Sessions survive server restarts
    • Better resource tracking
  3. Improved Rate Limiting

    • Distributed rate limiting across multiple servers
    • More accurate tracking

Performance Considerations

  1. Database Indexes

    • Indexes on session_id, user_id, languages
    • Composite indexes for common queries
  2. Redis Memory Usage

    • Monitor with: redis-cli info memory
    • Configure maxmemory policy
  3. Connection Pooling

    • PostgreSQL: 10 connections default
    • Redis: 50 connections default

Monitoring

PostgreSQL

-- Check database size
SELECT pg_database_size('talk2me');

-- Active connections
SELECT count(*) FROM pg_stat_activity;

-- Slow queries
SELECT * FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;

Redis

# Memory usage
redis-cli info memory

# Connected clients
redis-cli info clients

# Monitor commands
redis-cli monitor

Troubleshooting

Common Issues

  1. PostgreSQL Connection Failed

    • Check if PostgreSQL is running: sudo systemctl status postgresql
    • Verify DATABASE_URL in .env
    • Check pg_hba.conf for authentication
  2. Redis Connection Failed

    • Check if Redis is running: redis-cli ping
    • Verify REDIS_URL in .env
    • Check Redis logs: sudo journalctl -u redis
  3. Migration Errors

    • Drop and recreate database if needed
    • Check migration files in migrations/
    • Run python migrations.py init to reinitialize

Backup and Restore

PostgreSQL Backup

# Backup
pg_dump talk2me > talk2me_backup.sql

# Restore
psql talk2me < talk2me_backup.sql

Redis Backup

# Backup (if persistence enabled)
redis-cli BGSAVE

# Copy dump.rdb file
cp /var/lib/redis/dump.rdb redis_backup.rdb

Security Notes

  1. Database Credentials

    • Never commit .env file
    • Use strong passwords
    • Limit database user permissions
  2. Redis Security

    • Consider enabling Redis AUTH
    • Bind to localhost only
    • Use SSL for remote connections
  3. Data Privacy

    • Translations/transcriptions contain user data
    • Implement data retention policies
    • Consider encryption at rest