Parakeet (Recommended Speech Transcription)
OpenFlow supports NVIDIA Parakeet TDT 0.6B v3 through a persistent sherpa-onnx worker. Open Settings > Models and click Install beside Parakeet. OpenFlow then:
- Downloads the pinned Windows
sherpa-onnxruntime and multilingual INT8 model (about 510 MB total download). - Verifies both archives with SHA-256 before extracting them.
- Stores the runtime under
%APPDATA%\openflow\bin\parakeet\. The model is stored under%APPDATA%\openflow\models\parakeet\when space allows. If that drive has less than 2.5 GB free, OpenFlow automatically uses the fixed drive with the most free space (for exampleE:\OpenFlow\models\parakeet\) and remembers that location. - Selects Parakeet and keeps its local server warm between dictations.
Parakeet automatically detects Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, Swedish, Russian, and Ukrainian. If a different language is selected, or if the Parakeet runtime fails, OpenFlow automatically falls back to Whisper.
The automatic Windows installation uses the CPU-compatible runtime for the most reliable setup. Advanced installations can provide another sherpa-onnx runtime with OPENFLOW_PARAKEET_SERVER, select its execution provider with OPENFLOW_PARAKEET_PROVIDER, or point to an existing model directory with OPENFLOW_PARAKEET_MODEL_DIR.
Whisper (Fallback Speech Transcription)
OpenFlow retains its local Whisper engine as a selectable transcription engine and as Parakeet's automatic fallback. Within the Whisper engine, OpenFlow tries whisper.cpp first, then faster-whisper, then the classic whisper CLI.
Default Model: Tiny
The Tiny model (75MB) is the default because it gives the lowest local latency. Use Base or Small if you want more accuracy and can tolerate a slightly slower turnaround.
Fastest Transcription Backend
For near-instant local transcription, package whisper.cpp with OpenFlow or install it locally. OpenFlow looks for whisper-cli / whisper-cpp in bundled resources, next to the app executable, in %APPDATA%\openflow\bin\, and on PATH. Put the matching model file in bundled resources or:
Windows: %APPDATA%\openflow\models\whisper\
OpenFlow looks for names like ggml-base.bin, ggml-small.bin, or ggml-large-v3.bin. To point to an exact model file, start OpenFlow from PowerShell with:
$env:OPENFLOW_WHISPER_CPP_MODEL = "C:\path\to\ggml-base.bin"
npm run tauri dev
faster-whisper is still a good fallback if you already have it installed in the same Python environment available to OpenFlow:
pip install faster-whisper
To force a backend for troubleshooting, set OPENFLOW_WHISPER_BACKEND to whisper.cpp, faster-whisper, or whisper before launching:
$env:OPENFLOW_WHISPER_BACKEND = "whisper.cpp"
npm run tauri dev
Supported Models
| Model | Size | Description |
|---|---|---|
| Tiny | 75 MB | Default, lowest-latency profile |
| Base | 145 MB | Better balance of accuracy and speed |
| Small | 460 MB | Balanced accuracy and latency |
| Medium | 1.5 GB | Exceptional language translation |
| Large-v3 | 2.9 GB | Maximum precision, high resources |
Model Storage Location
Models are stored in your app data directory:
Windows: %APPDATA%\openflow\models\whisper\
Selecting a Model
- Open Settings > Models
- Click "Download" next to the model you want
- Once downloaded, click "Select" to make it active
Cleanup Model (Text Refinement)
OpenFlow uses a local LLM to clean up raw speech transcripts.
Recommended Model: Gemma 4 E2B it (GGUF)
- Hugging Face Repo: unsloth/gemma-4-E2B-it-GGUF
- Recommended Quantization: Q4_K_M or Q5_K_M
- Purpose: Corrects filler words, false starts, and spoken punctuation
Option A: llama.cpp Server (Recommended)
- Install llama.cpp or download the
llama-serverbinary - Download a GGUF model file from the Hugging Face repo above
- Start the server:
llama-server -hf unsloth/gemma-4-E2B-it-GGUF
Or with a local file:
llama-server -m path/to/gemma-4-e2b-it-Q4_K_M.gguf --port 8080
- In OpenFlow, go to Settings > Models
- Set the cleanup model provider to "Local llama.cpp server"
- Set the base URL to
http://127.0.0.1:8080/v1 - The app will automatically connect and use it for text cleanup
Option B: Rule-Based Only (No LLM Required)
If you don't have a local LLM server running, OpenFlow automatically falls back to rule-based cleanup, which handles:
- Filler word removal (um, uh, like, you know)
- Correction handling (I mean, actually, scratch that)
- Spoken punctuation (comma, period, question mark)
- Formatting commands (new line, bullet point)
- Basic capitalization and sentence punctuation
This works entirely offline with no additional setup.
Model Storage Location
Cleanup models are stored in:
Windows: %APPDATA%\openflow\models\cleanup\
Checking Model Status
The sidebar shows "Local Whisper Active" with the engine name when transcription is running. The Settings > Models page shows cleanup model status and connection state.