Transcribe Audio and Video for Free
As an L&D professional working for a multinational company, I needed a reliable way to generate transcripts for our eLearning videos and translate them into other languages. There are plenty of tools that claim to do this, but most were too expensive, too complicated, or just didn’t work the way I needed them to.
So I built Voxtext.
Voxtext takes your audio or video files and converts them, on your local machine, into text, .srt, .vtt (including styling cues), HTML, Markdown, and JSON. When you’re done, you can use custom AI prompts to clean up transcripts, translate .vtt files, and perform other post-processing tasks. More translation options are already on the roadmap. Best of all, it’s free. Those download buttons below are real.
Using Voxtext is simple. Drag in your file, choose a transcription model (Medium is a good balance of speed and accuracy), select your output format, and click Transcribe.
Processing speed depends on your hardware, but on a modern Windows laptop, an hour of video can typically be transcribed in about 30 minutes. Output files are saved next to the source file, so they are always easy to find!
Mac version coming soon!
v1.5. Double-click to launch the Windows installer.
What Voxtext is good for
- eLearning videos
- Training content
- Webinar recordings
- Podcast transcripts
- Subtitle generation
- VTT file translation
- Accessibility workflows
- Local/offline transcription
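For readers curious how the .srt and .vtt subtitle outputs mentioned above differ, here is a minimal Python sketch of the two formats. It is not Voxtext's code: the function names and the (start, end, text) cue layout are assumptions made for illustration. It shows the two main differences: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a WEBVTT header and uses a period.

```python
# Illustrative sketch only; names and cue layout are hypothetical, not Voxtext's API.

def format_timestamp(seconds: float, separator: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for WebVTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(cues):
    """Render (start, end, text) cues as SRT: numbered cues, comma separator."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues):
    """Render (start, end, text) cues as WebVTT: header line, period separator."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [(0.0, 2.5, "Welcome to the course."), (2.5, 5.0, "Let's get started.")]
    print(to_srt(sample))
    print(to_vtt(sample))
```

Because the two formats share the same timing data, converting between them is mostly a matter of changing the header, the cue numbering, and the millisecond separator.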

