Add production WSGI server - Flask dev server unsuitable for production load

This adds a complete production deployment setup using Gunicorn as the WSGI server, replacing Flask's development server. Key components: - Gunicorn configuration with optimized worker settings - Support for sync, threaded, and async (gevent) workers - Automatic worker recycling to prevent memory leaks - Increased timeouts for audio processing - Production-ready logging and monitoring Deployment options: 1. Docker/Docker Compose for containerized deployment 2. Systemd service for traditional deployment 3. Nginx reverse proxy configuration 4. SSL/TLS support Production features: - wsgi.py entry point for WSGI servers - gunicorn_config.py with production settings - Dockerfile with multi-stage build - docker-compose.yml with full stack (Redis, PostgreSQL) - nginx.conf with caching and security headers - systemd service with security hardening - deploy.sh automated deployment script Configuration: - .env.production template with all settings - Support for environment-based configuration - Separate requirements-prod.txt - Prometheus metrics endpoint (/metrics) Monitoring: - Health check endpoints for liveness/readiness - Prometheus-compatible metrics - Structured logging - Memory usage tracking - Request counting Security: - Non-root user in Docker - Systemd security restrictions - Nginx security headers - File permission hardening - Resource limits Documentation: - Comprehensive PRODUCTION_DEPLOYMENT.md - Scaling strategies - Performance tuning guide - Troubleshooting section Also fixed memory_manager.py GC stats collection error. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-03 08:49:32 -06:00
parent 1b9ad03400
commit 92fd390866
13 changed files with 1237 additions and 2 deletions
--- a/talk2me.service
+++ b/talk2me.service
@@ -0,0 +1,66 @@
+[Unit]
+Description=Talk2Me Real-time Translation Service
+Documentation=https://github.com/your-repo/talk2me
+After=network.target
+
+[Service]
+Type=notify
+User=talk2me
+Group=talk2me
+WorkingDirectory=/opt/talk2me
+Environment="PATH=/opt/talk2me/venv/bin"
+Environment="FLASK_ENV=production"
+Environment="PYTHONUNBUFFERED=1"
+
+# Production environment variables
+EnvironmentFile=-/opt/talk2me/.env
+
+# Gunicorn command with production settings
+ExecStart=/opt/talk2me/venv/bin/gunicorn \
+    --config /opt/talk2me/gunicorn_config.py \
+    --error-logfile /var/log/talk2me/gunicorn-error.log \
+    --access-logfile /var/log/talk2me/gunicorn-access.log \
+    --log-level info \
+    wsgi:application
+
+# Reload via SIGHUP
+ExecReload=/bin/kill -s HUP $MAINPID
+
+# Graceful stop
+KillMode=mixed
+TimeoutStopSec=30
+
+# Restart policy
+Restart=always
+RestartSec=10
+StartLimitBurst=3
+StartLimitInterval=60
+
+# Security settings
+NoNewPrivileges=true
+PrivateTmp=true
+ProtectSystem=strict
+ProtectHome=true
+ProtectKernelTunables=true
+ProtectKernelModules=true
+ProtectControlGroups=true
+RestrictRealtime=true
+RestrictSUIDSGID=true
+LockPersonality=true
+
+# Allow writing to specific directories
+ReadWritePaths=/var/log/talk2me /tmp/talk2me_uploads
+
+# Resource limits
+LimitNOFILE=65536
+LimitNPROC=4096
+
+# Memory limits (adjust based on your system)
+MemoryLimit=4G
+MemoryHigh=3G
+
+# CPU limits (optional)
+# CPUQuota=200%
+
+[Install]
+WantedBy=multi-user.target