Speech to Text Apps
Speech to Text
- DeepSpeech: simpler to use, although generally considered inferior to the toolkits listed below.
- Kaldi: supports hybrid NN-HMM and lattice-free MMI models. It is widely used in both research and production.
- Lingvo: the open-source version of Google's speech recognition toolkit, with support mostly for end-to-end models.
- ESPnet: good and well known, also for end-to-end models.
- RASR + RETURNN: very good as well, for both end-to-end models and hybrid NN-HMM, but free for non-commercial use only; commercial use requires a commercial licence (disclaimer: I work at the university chair which develops these frameworks).
- Wav2Letter: the speech recognition tool by Facebook.
- Silero Models (snakers4/silero-models): Silero Speech to Text
- Coqui: STT and TTS
- [voice2json: command-line tools for speech and intent recognition on Linux](https://voice2json.org/#supported-languages)
- Vosk: offline speech recognition API
- English: TED-LIUM, LibriSpeech, etc.