Robust Speech Recognition via Large-Scale Weak Supervision
Speech-to-text, text-to-speech, and speaker recognition
Audio foundation model excelling in audio understanding
kaldi-asr/kaldi is the official location of the Kaldi project
A PyTorch-based Speech Toolkit
Captcha solver extension for humans
A free, open source, and extensible speech-to-text application
Port of OpenAI's Whisper model in C/C++
Cross-platform AI language practice app
Toolkit for conversational AI
StreamSpeech is a seamless model for offline speech recognition
Multilingual Automatic Speech Recognition with word-level timestamps
OpenVINO™ Toolkit repository
Underthesea - Vietnamese NLP Toolkit
Repo of Qwen2-Audio chat & pretrained large audio language model
Cross-platform software for text translation and recognition
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Speech to Text to Speech, sends text as OSC messages
Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
AzioSpeech Recognition and Translation
Replace OpenAI GPT with another LLM in your app
Training data (data labeling, annotation, workflow) for all data types
End-to-end speech processing toolkit
Capable of understanding text, audio, vision, video
The behavior guidance framework for customer-facing LLM agents