How to transcribe video offline, free, in 14+ languages
To transcribe a video or audio file offline, use a tool that runs the speech model on your own machine instead of a cloud service. PandaStudio does exactly that: drop in an mp4, mov, mp3, wav, or m4a file, and a clean transcript with millisecond timestamps comes back without anything leaving your computer. There is no upload and no per-minute fee, and once the model is cached, it works with no internet connection at all. It runs on macOS and Windows, and it is free to start.
Under the hood, PandaStudio ships two on-device speech engines and routes your audio to the right one based on the language you pick. Here is what each engine covers, and how to run a transcription.
Languages supported on-device
- English plus 25 European languages run on the default engine (Parakeet TDT v3, about 473 MB). It auto-detects the language, keeps filler words like "um" and "uh" for accurate transcript-based editing, and gives word-level timing.
- 14 more languages run on a second engine (Whisper Large-v3-turbo, about 1.1 GB, downloaded once the first time you pick one of them): Chinese, Japanese, Korean, Hindi, Arabic, Thai, Tamil, Telugu, Kannada, Malayalam, Bengali, Marathi, Gujarati, Punjabi.
- Switching is one setting. Open Settings, choose your language, and PandaStudio loads the right model. The choice is saved per workspace, so an English channel and a Tamil channel can live side by side without crosstalk.
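To make the routing above concrete, here is a minimal sketch of how a language choice could map to one of the two engines. The engine names and the list of 14 languages come from this article; the function name, the string labels, and the lowercase matching are assumptions for illustration and are not PandaStudio's actual code or API.

```python
# Hypothetical sketch: only the engine names and the 14-language list come
# from the article. Everything else here is illustrative.
SECOND_ENGINE_LANGUAGES = frozenset({
    "chinese", "japanese", "korean", "hindi", "arabic", "thai", "tamil",
    "telugu", "kannada", "malayalam", "bengali", "marathi", "gujarati",
    "punjabi",
})

DEFAULT_ENGINE = "parakeet-tdt-v3"         # about 473 MB
SECOND_ENGINE = "whisper-large-v3-turbo"   # about 1.1 GB, downloaded on first use


def pick_engine(language: str) -> str:
    """Return the engine label for a language name (case-insensitive).

    Assumption: any language outside the 14 listed falls through to the
    default engine. This sketch does not check that the default engine
    actually supports it.
    """
    if language.strip().lower() in SECOND_ENGINE_LANGUAGES:
        return SECOND_ENGINE
    return DEFAULT_ENGINE


if __name__ == "__main__":
    for name in ("English", "Tamil", "Japanese"):
        print(f"{name} -> {pick_engine(name)}")
```

A set lookup keeps the rule easy to read: the 14 named languages go to the second engine, and everything else stays on the default.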
Why offline and on-device matters
- Nothing is uploaded. Recording, transcription, and editing all happen on your computer. For client calls, interviews, and medical or legal recordings, the file never touches a third-party server.
- No per-minute billing. Cloud transcription services charge by the minute. On-device transcription has no metered cost, so a long back catalogue costs nothing extra to process.
- It works with no connection. Once the model is cached, you can transcribe on a plane or anywhere offline.
Two ways to run it
The simplest way is to ask your AI assistant. PandaStudio ships an official Skill that teaches ChatGPT, Claude, or any MCP agent its full command surface, so you can say "transcribe this interview and give me an SRT" in plain English and get the file back. The transcript carries word-level and segment-level timestamps in milliseconds, which map straight to SRT or VTT.
Prefer to script it? The same actions are available from the command line, and the step-by-step transcription guide walks through wrapping a file, running the job, and reading back the timestamps.
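If you want to see why millisecond timestamps map so directly to subtitles, the sketch below turns a list of timed segments into SRT text. The `Segment` class and its field names (`start_ms`, `end_ms`, `text`) are assumptions made for this example; the article does not specify PandaStudio's transcript format, so adapt the fields to whatever the transcript actually contains.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape for one transcript segment; field names are assumed.
    start_ms: int
    end_ms: int
    text: str


def ms_to_srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def segments_to_srt(segments: list[Segment]) -> str:
    """Build SRT text: a numbered cue, a time range, then the caption line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{ms_to_srt_time(seg.start_ms)} --> {ms_to_srt_time(seg.end_ms)}\n"
            f"{seg.text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0, 1500, "Hello"), Segment(1500, 3000, "and welcome back")]
    print(segments_to_srt(demo))
```

VTT is nearly the same: the file starts with a `WEBVTT` header line, and the millisecond separator is a dot instead of a comma.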
Frequently asked
Can you transcribe video offline, without uploading it?
Yes. PandaStudio runs the speech model on your own computer, so your file is never uploaded. Once the model is cached, transcription works fully offline.
Which languages can it transcribe on-device?
English plus 25 European languages on the default model, with the language auto-detected, plus 14 more on a second model: Chinese, Japanese, Korean, Hindi, Arabic, Thai, Tamil, Telugu, Kannada, Malayalam, Bengali, Marathi, Gujarati, Punjabi.
Is local transcription free?
There is no per-minute charge. The app is free to download with 3 trial exports, and a one-time 99 USD license unlocks unlimited exports forever.
Does it produce timestamps for subtitles?
Yes. Both engines return word-level and segment-level timestamps in milliseconds, which convert directly to SRT or VTT subtitle files.
Transcribe your first file offline in minutes
Download PandaStudio free, and once your file is in, you are one sentence away from the rest of the editor too: cut filler words from the transcript, add captions, and export, all on your own machine.