nvidia/parakeet-tdt-0.6b-v2
keli
viktor742/openapi-parakeet-tdt-0.6b-v2:0.2.1
Updated time: 03 Jun 2025

Speech Transcription API

A FastAPI-based REST API service for speech-to-text transcription using NVIDIA's parakeet-tdt-0.6b-v2 model. This API provides high-quality English speech recognition with automatic punctuation, capitalization, and accurate word-level timestamps.

GitHub Repo: https://github.com/viktor2077/parakeet-tdt-0.6b-v2

Features

  • 🎤 High-Quality Transcription: Uses NVIDIA's 600M-parameter parakeet-tdt-0.6b-v2 model
  • ⏱️ Accurate Timestamps: Provides word-level timing information
  • 📝 Multiple Output Formats: JSON response or SRT subtitle format
  • 🔧 Automatic Audio Processing: Handles resampling and channel conversion
  • 🚀 Long Audio Support: Optimized settings for audio longer than 8 minutes
  • 📊 OpenAPI Compatible: Full Swagger/OpenAPI documentation
  • 🛡️ Error Handling: Comprehensive error handling and validation
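
To illustrate how word-level timestamps relate to the SRT output format, here is a minimal sketch that groups timed words into subtitle cues. It is not taken from this service's code: the `Word` class, the `words_to_srt` helper, and the `max_words` grouping rule are all hypothetical names and choices made for the example, and the real API may segment subtitles differently.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with start/end times in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_words: int = 8) -> str:
    """Group words into cues of at most max_words each and render them as SRT."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        index = i // max_words + 1  # SRT cue numbers start at 1
        # A cue spans from its first word's start to its last word's end.
        timing = f"{srt_timestamp(chunk[0].start)} --> {srt_timestamp(chunk[-1].end)}"
        text = " ".join(w.text for w in chunk)
        cues.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [Word("Hello,", 0.0, 0.4), Word("world.", 0.5, 1.25)]
    print(words_to_srt(sample))
```

Grouping by a fixed word count keeps the sketch short; a production service would more likely split cues at sentence boundaries or pauses.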