noScribe: The Free & Open-Source AI Audio Transcription Tool for Researchers and Journalists


What is noScribe?

noScribe is free, open-source software that uses AI to automate the transcription of interviews and audio recordings for qualitative research and journalism. Developed by Kai Dröge, a sociologist (PhD) with a background in computer science, it combines state-of-the-art AI models (OpenAI's Whisper, Faster-Whisper by Guillaume Klein, and Pyannote by Hervé Bredin) to deliver accurate results while maintaining complete data privacy: all processing happens locally on your device, and nothing is sent over the internet.

Key features include:


  • Multilingual Support: Recognizes approximately 60 languages (Spanish, English, German perform best).

  • Speaker Identification: Uses Pyannote AI to distinguish between multiple speakers.

  • Local Processing: No cloud storage; data never leaves your computer.

  • Built-In Editor: Review and refine transcripts with synchronized audio playback.

  • Open Source: Released under GPL-3.0 license with transparent development on GitHub.

System Requirements & Installation

Windows (Version 0.6.2)

Standard Version (No GPU)

  1. Download the standard (non-CUDA) installer.
  2. Run the setup file (~20 GB download).
  3. If Windows security flags the unrecognized app, click "Run anyway" to trust it.
  4. For enterprise deployments, add the /S parameter for a silent installation across multiple machines (see the sketch below).
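A minimal sketch of the silent rollout from step 4, assuming the installer filename below matches the file you downloaded (run from an elevated PowerShell prompt):

    # Unattended installation; /S suppresses all setup dialogs.
    .\noScribe_0.6.2_setup.exe /S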

CUDA-Accelerated Version (NVIDIA GPU with at least 6 GB VRAM)

  1. Download the CUDA installer.
  2. Install the CUDA Toolkit (a restart is required).
  3. Follow the same installation steps as the standard version.
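Before installing the CUDA build, you can check that the GPU meets the 6 GB VRAM requirement from a terminal (assumes the NVIDIA driver, which ships with nvidia-smi, is already installed):

    # Report the GPU model and its total video memory.
    nvidia-smi --query-gpu=name,memory.total --format=csv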

macOS (Version 0.6.2)

Apple Silicon (M1–M4)

  1. Download the ARM64 .dmg.
  2. Drag noScribe and noScribeEdit into the Applications folder.
  3. Install Rosetta 2 from the Terminal: softwareupdate --install-rosetta
  4. Launch both apps from the Applications folder.

Intel Architecture (Experimental)


  • Use the stable v0.5 release on Big Sur, Monterey, or Ventura.

  • Bypass Gatekeeper warnings by manually allowing the apps to run under Privacy & Security in System Settings (or see the Terminal sketch below).
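If the Privacy & Security pane does not offer an option to allow the apps, one common alternative (an assumption here, not an official noScribe instruction) is to clear the Gatekeeper quarantine flag from the Terminal, provided you trust the download; the .app bundle names below are assumed from the drag-and-drop step above:

    # Remove the quarantine attribute so macOS allows the apps to launch.
    xattr -dr com.apple.quarantine /Applications/noScribe.app
    xattr -dr com.apple.quarantine /Applications/noScribeEdit.app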

Linux (Version 0.6.2)

Precompiled Binary Installation

  1. CPU version:
    wget -O noScribe_0.6.2_cpu_linux_amd64.tar.gz "https://drive.switch.ch/s/HtKDKYRZRNaYBeI?path=%2FLinux/noScribe_0.6.2_cpu_linux_amd64.tar.gz" && tar -xzvf noScribe_0.6.2_cpu_linux_amd64.tar.gz && ./noScribe
  2. CUDA version (requires NVIDIA drivers):
    wget -O noScribe_0.6.2_cuda_linux_amd64.tar.gz "https://drive.switch.ch/s/HtKDKYRZRNaYBeI?path=%2FLinux/noScribe_0.6.2_cuda_linux_amd64.tar.gz" && tar -xzvf noScribe_0.6.2_cuda_linux_amd64.tar.gz && ./noScribe

From Source Code (Recommended)

  1. Clone repository:
    git clone https://github.com/kaixxx/noScribe
  2. Create virtual environment:
    python3 -m venv .venv && source .venv/bin/activate
  3. Install dependencies:
    pip install -r requirements_linux.txt
  4. Download the pretrained model (see the Git LFS note after these steps):
    git clone https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo models/precise
  5. Run: python3 ./noScribe.py
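A note on step 4: Hugging Face model repositories store their weights via Git LFS, so make sure LFS is set up before cloning, otherwise the clone contains only small pointer files instead of the actual model (the package name below is for Debian/Ubuntu; adjust for your distribution):

    # Install and enable Git LFS, then clone the model into noScribe's models folder.
    sudo apt install git-lfs
    git lfs install
    git clone https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo models/precise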

Core Features & Workflow

Transcription Process

  1. Select Audio File: Supports MP3, WAV, AVI, MP4, and other formats.
  2. Configure Settings:


    • Output Format: Choose HTML (default), VTT (video subtitles), or plain text (TXT).

    • Time Range: Define start/end times (e.g., 00:00:00 to 00:15:00) for partial transcriptions.

    • Language: Auto-detect or specify “multilingual” for mixed audio (experimental).

    • Quality Mode: “Precise” (high accuracy) vs “Fast” (quicker but requires manual review).
  3. Start Transcription: Processing time varies by audio length and hardware; a 1-hour interview may take 3 hours on mid-range devices.
  4. Auto-Save: Transcripts are saved every few seconds during processing.
  5. Review with Editor: Open noScribeEdit to compare text with audio, fix errors, and adjust speaker labels.

Advanced Features


  • Pause Detection: Marks silences longer than 1 second as (XX sec pause).

  • Overlapping Speech: Demarcates simultaneous speaking with double slashes (//).

  • Disfluencies: Transcribes filler words like “um,” “uh,” or unfinished sentences.

  • Timestamp Integration: Optionally add timecodes at speaker changes or intervals (disabled by default).

Performance & Quality Considerations

Influencing Factors

  1. Audio Quality: Clear voices with minimal background noise improve accuracy significantly (see the pre-processing sketch after this list).
  2. Language Support: While Whisper handles ~60 languages, dialects (e.g., Swiss German) may require manual editing.
  3. Hardware Capabilities: Faster CPUs/GPUs reduce processing time; older machines may benefit from overnight runs.
  4. Model Choice: “Precise” mode uses larger models for better accuracy but slower speeds.
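For noisy recordings (factor 1 above), some pre-processing before transcription can help. A minimal sketch using ffmpeg, which is not part of noScribe; the filter values are illustrative assumptions rather than recommended settings:

    # Convert to mono 16 kHz WAV, cut low-frequency rumble, and normalize loudness.
    ffmpeg -i interview.mp3 -ac 1 -ar 16000 -af "highpass=f=100,loudnorm" interview_clean.wav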

Common Challenges & Fixes

  • AI looping on long audio: split the recording into segments of under 15 minutes each (see the splitting sketch below).

  • Speaker misidentification: manually verify and correct the labels in the Editor.

  • Hallucinations: check the transcript against the original audio, especially around silent passages.

  • Large model downloads: schedule the download for off-peak hours when bandwidth is less contested.

  • Intel Mac compatibility: use the stable v0.5 release, or help test the experimental v0.6.2.
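One way to produce the sub-15-minute segments mentioned above is ffmpeg's segment muxer; a sketch, assuming ffmpeg is installed and the input file is named interview.mp3:

    # Split into 15-minute chunks without re-encoding; outputs part_000.mp3, part_001.mp3, ...
    ffmpeg -i interview.mp3 -f segment -segment_time 900 -c copy part_%03d.mp3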

Advanced Customization Options

After initial use, config files are saved in:


  • Windows: C:\Users\<username>\AppData\Local\noScribe\noScribe\config.yml

  • MacOS: ~/Library/Application Support/noscribe/config.yml

Key customizable settings include (a short Terminal sketch follows this list):


  • UI Language: Change via “locale” setting in config file.

  • Custom Models: Add fine-tuned Whisper variants for specialized tasks (instructions available on Wiki).

  • Logging: Access detailed transcript logs in ~/Library/Application Support/noscribe/log for debugging errors.
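A short Terminal sketch for the macOS paths listed above (Windows users can open the corresponding config.yml from the path shown earlier); the exact keys inside the file can vary between versions, so check before editing:

    # Open the config file in the default text editor (change "locale" here), then list the logs.
    open -t "$HOME/Library/Application Support/noscribe/config.yml"
    ls "$HOME/Library/Application Support/noscribe/log"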

Comparison with Alternative Tools

  • Local Processing: noScribe processes everything on-device; Otter.ai and Descript are cloud-based.

  • Multilingual Support: noScribe handles 60+ languages; Otter.ai supports 12+; Descript is English-only.

  • Speaker Recognition: noScribe uses the Pyannote model; Otter.ai uses acoustic fingerprinting; Descript offers none.

  • Price: noScribe is free; Otter.ai Pro is $15/month; Descript Pro is $99/month.

  • Hardware Requirements: noScribe runs on a CPU (GPU optional); Otter.ai relies on cloud scalability; Descript requires a high-end GPU.

  • Open Source: noScribe is GPL-3.0; Otter.ai and Descript are proprietary.

  • Editor Integration: noScribe has a native built-in editor; Otter.ai and Descript use web-based editors.

Development & Community Contributions


  • Contribution Guidelines: Submit pull requests for bug fixes or translation updates (GitHub repository).

  • Translation Status: UI strings translated into >15 languages; submit corrections via translation issue template.

  • Testing Needed: Volunteers are actively sought to test v0.6.2 on Intel Macs (see the GitHub discussion).

Frequently Asked Questions (FAQ)

Q: Does noScribe work with Chinese audio?

A: Yes! While Whisper’s Chinese recognition is strong, dialects or technical jargon may require minor edits in the Editor. Try downloading region-specific models from Hugging Face for improved accuracy.

Q: Can I export transcripts to ATLAS.ti?

A: Yes! Save as VTT format (*.vtt) and import into ATLAS.ti directly; also compatible with MAXQDA and QualCoder via HTML export.

Q: Why does my transcript show weird characters?

A: Ensure your system has a font that covers the expected character set (e.g., MS Gothic for Japanese, Malgun Gothic for Korean). The Editor's "Replace Special Chars" feature can help clean up irregularities.


Future Plans & Roadmap

  1. Model Updates: Integrate Whisper v4 for enhanced small-language support and reduced hallucination rates.
  2. Cross-Platform Mobile App: Port core functionality to iOS/Android via Electron framework.
  3. Collaboration Features: Add real-time sync capabilities for team transcription projects.
  4. Plugin Ecosystem: Allow third-party integrations with qualitative analysis tools like EXMARaLDA or NVivo.