noScribe: The Free & Open-Source AI Audio Transcription Tool for Researchers and Journalists
What is noScribe?
noScribe is an AI-powered software designed to automate the transcription of interviews and audio recordings for qualitative research or journalistic purposes. Developed by Kai Dröge, a sociology PhD with expertise in computer science, this tool combines cutting-edge AI models (Whisper from OpenAI, Faster-Whisper by Guillaume Klein, and Pyannote from Hervé Bredin) to deliver accurate results while maintaining complete data privacy—processing occurs entirely on your local device without internet transmission.
Key features include:
- Multilingual Support: Recognizes approximately 60 languages (Spanish, English, German perform best).
- Speaker Identification: Uses Pyannote AI to distinguish between multiple speakers.
- Local Processing: No cloud storage; data never leaves your computer.
- Built-In Editor: Review and refine transcripts with synchronized audio playback.
- Open Source: Released under the GPL-3.0 license with transparent development on GitHub.
System Requirements & Installation
Windows (Version 0.6.2)
Standard Version (No GPU)
- Download the standard (non-CUDA) installer.
- Run the setup file (~20 GB download size).
- If Windows security flags the app, choose "Run anyway" to trust it.
- For enterprise deployments: add the /S parameter for a silent, unattended install across multiple machines (see the example below).
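A minimal sketch of a silent install from the command line, assuming the downloaded installer is named noScribe_setup_0.6.2.exe (substitute the actual filename of your download):

noScribe_setup_0.6.2.exe /S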
CUDA Accelerated Version (NVIDIA GPU ≥6GB VRAM)
- Download the CUDA installer.
- Install the CUDA Toolkit (restart required).
- Follow the same installation steps as the standard version.
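Before installing, you can check that the NVIDIA driver sees your GPU and that it offers at least 6 GB of VRAM; nvidia-smi ships with the driver and prints both:

nvidia-smi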
macOS (Version 0.6.2)
Apple Silicon (M1-M4)
- Download the ARM64 dmg.
- Drag noScribe and noScribeEdit to the Applications folder.
- Install Rosetta 2 via the terminal: softwareupdate --install-rosetta
- Launch both apps from the Applications folder.
Intel Architecture (Experimental)
- For Big Sur, Monterey, or Ventura, use the stable v0.5 release.
- Sonoma/Sequoia: use the Intel x86_64 build.
- Older macOS versions: use the Legacy x86_64 build.
- Bypass Gatekeeper warnings by manually allowing the apps to run in System Settings under Privacy & Security (a terminal alternative is sketched below).
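As an alternative to clicking through the warnings, the quarantine attribute can be removed in the terminal; this is a general macOS technique rather than a documented noScribe step, and it assumes both apps were copied into /Applications:

xattr -dr com.apple.quarantine /Applications/noScribe.app /Applications/noScribeEdit.app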
Linux (Version 0.6.2)
Precompiled Binary Installation
- CPU version:
  wget "https://drive.switch.ch/s/HtKDKYRZRNaYBeI?path=%2FLinux/noScribe_0.6.2_cpu_linux_amd64.tar.gz" && tar -xzvf noScribe_0.6.2_cpu_linux_amd64.tar.gz && ./noScribe
- CUDA version (requires NVIDIA drivers):
  wget "https://drive.switch.ch/s/HtKDKYRZRNaYBeI?path=%2FLinux/noScribe_0.6.2_cuda_linux_amd64.tar.gz" && tar -xzvf noScribe_0.6.2_cuda_linux_amd64.tar.gz && ./noScribe
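If the extracted noScribe binary is not marked executable on your system (the commands above assume it unpacks into the current directory), you can set the flag and launch it manually:

chmod +x ./noScribe && ./noScribe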
From Source Code (Recommended)
- Clone the repository and change into it:
  git clone https://github.com/kaixxx/noScribe && cd noScribe
- Create a virtual environment:
  python3 -m venv .venv && source .venv/bin/activate
- Install dependencies:
  pip install -r requirements_linux.txt
- Download the pretrained models (see the Git LFS note after this list):
  git clone https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo models/precise
- Run:
  python3 ./noScribe.py
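Hugging Face model repositories typically store their large weight files with Git LFS; if the clone above leaves you with small pointer files instead of the actual model, install Git LFS first and clone again (a sketch, assuming a Debian/Ubuntu-style package manager):

sudo apt-get install git-lfs
git lfs install
git clone https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo models/precise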
Core Features & Workflow
Transcription Process
- Select Audio File: Supports MP3, WAV, AVI, MP4, and other formats.
- Configure Settings:
  - Output Format: Choose HTML (default), VTT (video subtitles), or plain text (TXT).
  - Time Range: Define start/end times (e.g., 00:00:00 to 00:15:00) for partial transcriptions (a trimming sketch follows this list).
  - Language: Auto-detect or specify "multilingual" for mixed audio (experimental).
  - Quality Mode: "Precise" (high accuracy) vs. "Fast" (quicker, but requires more manual review).
- Start Transcription: Processing time varies with audio length and hardware; a 1-hour interview may take around 3 hours on mid-range devices.
- Auto-Save: Transcripts are saved every few seconds during processing.
- Review with Editor: Open noScribeEdit to compare the text with the audio, fix errors, and adjust speaker labels.
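The time-range option covers most cases, but if you prefer to feed noScribe only the excerpt you need, you can cut the clip beforehand with ffmpeg; this is generic pre-processing rather than a noScribe feature, and the filenames are placeholders:

ffmpeg -i interview.mp4 -ss 00:00:00 -to 00:15:00 -vn -c:a copy interview_clip.m4a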
Advanced Features
- Pause Detection: Marks silences longer than 1 second as (XX sec pause).
- Overlapping Speech: Demarcates simultaneous speaking with double slashes (//).
- Disfluencies: Transcribes filler words like "um" and "uh" as well as unfinished sentences.
- Timestamp Integration: Optionally add timecodes at speaker changes or at fixed intervals (disabled by default). An illustrative excerpt follows below.
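A hypothetical excerpt showing how these markers can look in a finished transcript (speaker labels and wording invented for illustration):

S01: We started the project in, um, early 2021 (3 sec pause) and the funding only arrived later.
S02: //Right, that was// after the first interviews were already done.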
Performance & Quality Considerations
Influencing Factors
- Audio Quality: Clear voices with minimal background noise improve accuracy significantly.
- Language Support: While Whisper handles ~60 languages, dialects (e.g., Swiss German) may require manual editing.
- Hardware Capabilities: Faster CPUs/GPUs reduce processing time; older machines may benefit from overnight runs.
- Model Choice: "Precise" mode uses larger models for better accuracy but slower speeds.
Common Challenges & Fixes
Advanced Customization Options
After initial use, config files are saved in:
- Windows: C:\Users\<username>\AppData\Local\noScribe\noScribe\config.yml
- macOS: ~/Library/Application Support/noscribe/config.yml
Key customizable settings include:
- UI Language: Change via the "locale" setting in the config file (see the sketch after this list).
- Custom Models: Add fine-tuned Whisper variants for specialized tasks (instructions are available on the Wiki).
- Logging: Access detailed transcript logs in ~/Library/Application Support/noscribe/log for debugging errors.
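For example, to find and change the UI language on macOS you could locate the locale entry and then open the file in a text editor; the exact key name may differ between versions, so check your own config.yml:

grep -n "locale" ~/Library/Application\ Support/noscribe/config.yml
open -e ~/Library/Application\ Support/noscribe/config.yml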
Comparison with Alternative Tools
Development & Community Contributions
- Contribution Guidelines: Submit pull requests for bug fixes or translation updates via the GitHub repository.
- Translation Status: UI strings are translated into more than 15 languages; submit corrections via the translation issue template.
- Testing Needed: Volunteers are actively sought to test v0.6.2 on Intel Macs (see the GitHub Discussion).
Frequently Asked Questions (FAQ)
Q: Does noScribe work with Chinese audio?
A: Yes! While Whisper’s Chinese recognition is strong, dialects or technical jargon may require minor edits in the Editor. Try downloading region-specific models from Hugging Face for improved accuracy.
Q: Can I export transcripts to ATLAS.ti?
A: Yes! Save the transcript in VTT format (*.vtt) and import it into ATLAS.ti directly; it is also compatible with MAXQDA and QualCoder via HTML export.
Q: Why does my transcript show weird characters?
A: Ensure your system's font encoding matches the expected character set (e.g., Japanese uses MS Gothic, Korean uses Malgun Gothic). The Editor's "Replace Special Chars" feature can help clean up irregularities.
Future Plans & Roadmap
- Model Updates: Integrate Whisper v4 for enhanced small-language support and reduced hallucination rates.
- Cross-Platform Mobile App: Port core functionality to iOS/Android via the Electron framework.
- Collaboration Features: Add real-time sync capabilities for team transcription projects.
- Plugin Ecosystem: Allow third-party integrations with qualitative analysis tools like EXMARaLDA or NVivo.