nvidia/parakeet-tdt-0.6b-v2
keli
viktor742/openapi-parakeet-tdt-0.6b-v2:0.2.1
Updated time: 03 Jun 2025
Speech Transcription API
A FastAPI-based REST API service for speech-to-text transcription using NVIDIA's parakeet-tdt-0.6b-v2 model. This API provides high-quality English speech recognition with automatic punctuation, capitalization, and accurate word-level timestamps.
GitHub Repo: https://github.com/viktor2077/parakeet-tdt-0.6b-v2
Features
- 🎤 High-Quality Transcription: Uses NVIDIA's 600M-parameter parakeet-tdt-0.6b-v2 model
- ⏱️ Accurate Timestamps: Provides word-level timing information
- 📝 Multiple Output Formats: JSON response or SRT subtitle format
- 🔧 Automatic Audio Processing: Handles resampling and channel conversion
- 🚀 Long Audio Support: Optimized settings for audio longer than 8 minutes
- 📊 OpenAPI Compatible: Full Swagger/OpenAPI documentation
- 🛡️ Error Handling: Comprehensive error handling and validation
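To illustrate how word-level timestamps relate to the SRT subtitle format mentioned above, here is a minimal sketch that groups timed words into numbered SRT cues. It is not taken from the service's code: the function names, the `max_words` grouping rule, and the `word`/`start`/`end` field names are assumptions made for illustration, and the API's actual response schema may differ.

```python
from typing import Dict, List


def format_srt_time(seconds: float) -> str:
    """Convert a time in seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_words: int = 8) -> str:
    """Group word-level timestamps into numbered SRT cues.

    Each item in `words` is assumed (hypothetically) to look like
    {"word": str, "start": float, "end": float}, with times in seconds.
    """
    cues = []
    for index, offset in enumerate(range(0, len(words), max_words), 1):
        chunk = words[offset:offset + max_words]
        text = " ".join(w["word"] for w in chunk)
        begin = format_srt_time(chunk[0]["start"])  # first word opens the cue
        end = format_srt_time(chunk[-1]["end"])     # last word closes the cue
        cues.append(f"{index}\n{begin} --> {end}\n{text}\n")
    # A blank line separates consecutive cues in an SRT file.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"word": "Hello", "start": 0.0, "end": 0.5},
        {"word": "world.", "start": 0.5, "end": 1.0},
    ]
    print(words_to_srt(sample))
```

Grouping by a fixed word count keeps the sketch short. A real subtitle generator would more likely split on punctuation or on a maximum cue duration.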