- Remove hardcoded TTS API key from app.py (major security vulnerability) - Add python-dotenv support for secure environment variable management - Create .env.example with configuration template - Add comprehensive SECURITY.md documentation - Update README with security configuration instructions - Add warning when TTS_API_KEY is not configured - Enhance .gitignore to prevent accidental commits of .env files BREAKING CHANGE: TTS_API_KEY must now be set via environment variable or .env file Security measures: - API keys must be provided via environment variables - Added dotenv support for local development - Clear documentation on secure deployment practices - Multiple .env file patterns in .gitignore 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
2.7 KiB
Voice Language Translator
A mobile-friendly web application that translates spoken language between multiple languages using:
- Gemma 3 open-source LLM via Ollama for translation
- OpenAI Whisper for speech-to-text
- OpenAI Edge TTS for text-to-speech
Supported Languages
- Arabic
- Armenian
- Azerbaijani
- English
- French
- Georgian
- Kazakh
- Mandarin
- Farsi
- Portuguese
- Russian
- Spanish
- Turkish
- Uzbek
Setup Instructions
-
Install the required Python packages:
pip install -r requirements.txt
-
Configure environment variables:
# Copy the example environment file cp .env.example .env # Edit with your actual values nano .env # Or set directly: export TTS_API_KEY="your-tts-api-key" export SECRET_KEY="your-secret-key"
⚠️ Security Note: Never commit API keys or secrets to version control. See SECURITY.md for details.
-
Make sure you have Ollama installed and the Gemma 3 model loaded:
ollama pull gemma3
-
Ensure your OpenAI Edge TTS server is running on port 5050.
-
Run the application:
python app.py
-
Open your browser and navigate to:
http://localhost:8000
Usage
- Select your source language from the dropdown menu
- Press the microphone button and speak
- Press the button again to stop recording
- Wait for the transcription to complete
- Select your target language
- Press the "Translate" button
- Use the play buttons to hear the original or translated text
Technical Details
- The app uses Flask for the web server
- Audio is processed client-side using the MediaRecorder API
- Whisper for speech recognition with language hints
- Ollama provides access to the Gemma 3 model for translation
- OpenAI Edge TTS delivers natural-sounding speech output
CORS Configuration
The application supports Cross-Origin Resource Sharing (CORS) for secure cross-origin usage. See CORS_CONFIG.md for detailed configuration instructions.
Quick setup:
# Development (allow all origins)
export CORS_ORIGINS="*"
# Production (restrict to specific domains)
export CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
export ADMIN_CORS_ORIGINS="https://admin.yourdomain.com"
Connection Retry & Offline Support
Talk2Me handles network interruptions gracefully with automatic retry logic:
- Automatic request queuing during connection loss
- Exponential backoff retry with configurable parameters
- Visual connection status indicators
- Priority-based request processing
See CONNECTION_RETRY.md for detailed documentation.
Mobile Support
The interface is fully responsive and designed to work well on mobile devices.