Introduction
In academic writing, technical documentation, or educational materials, you often encounter the need to convert Markdown documents—containing mathematical formulas, chemical equations, and flowcharts—into polished Word files. This guide presents a Docker + Pandoc workflow that packages all dependencies in a container, isolates your environment, and ensures consistent, repeatable results across Windows, macOS, and Linux.
Whether you are new to these tools or an experienced professional, this step-by-step tutorial will help you:
- Install and configure Docker and its components.
- Build a customized Pandoc image with support for LaTeX math, mhchem chemistry, Mermaid diagrams, and Chinese fonts.
- Prepare sample Markdown files with formulas and charts.
- Create an automated script to perform one-click conversion.
- Tackle common issues and fine-tune advanced options.
By the end, you will be able to generate high-quality Word documents that faithfully render complex content, all without installing Pandoc, LaTeX, Node.js, or Python on your host machine.
1. Prerequisites and System Setup
1.1 Hardware and OS Requirements
- 「Operating System」: Windows 10/11 (64-bit, Pro/Enterprise/Education, version 2004 or later), macOS, or any modern Linux distribution.
- 「CPU Virtualization」: Ensure Intel VT-x or AMD-V is enabled in your BIOS/UEFI.
- 「Memory」: Minimum 4 GB RAM (8 GB+ recommended).
- 「Storage」: At least 20 GB free disk space for Docker images and containers.
1.2 Enabling Virtualization on Windows
Open PowerShell as Administrator and run:
# Enable Hyper-V
dism.exe /Online /Enable-Feature /FeatureName:Microsoft-Hyper-V /All
# Enable WSL 2
dism.exe /Online /Enable-Feature /Featurename:Microsoft-Windows-Subsystem-Linux /All /norestart
dism.exe /Online /Enable-Feature /Featurename:VirtualMachinePlatform /All /norestart
Reboot your PC, then set WSL 2 as default:
wsl --set-default-version 2
1.3 Installing Docker Desktop
- Download Docker Desktop from the official site.
- During installation, enable 「WSL 2」 integration.
- Launch Docker Desktop and complete initial setup.
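Once setup finishes, a quick sanity check can confirm that the client reaches the engine and that WSL is on version 2. The commands below are a minimal sketch; hello-world is Docker's official test image, and the wsl line applies to Windows only.

```powershell
# Client and server versions; an error in the "Server" section means the engine is not running
docker version

# Pull and run Docker's hello-world test image, then remove the container
docker run --rm hello-world

# List WSL distributions; the VERSION column should show 2 (Windows only)
wsl --list --verbose
```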
2. Configuring Registry Mirrors for Faster Pulls
In regions with network constraints, pulling large Docker images can be slow. Configure mirror endpoints:
「Via Docker GUI」:
- Open Docker Desktop → Settings → Docker Engine.
- Insert:
{
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "https://hub-mirror.c.163.com"
  ],
  "dns": ["8.8.8.8", "114.114.114.114"]
}
- Click 「Apply & Restart」.
「Via PowerShell」:
mkdir $env:USERPROFILE\.docker -Force
@"
{
"registry-mirrors": [
"https://docker.mirrors.ustc.edu.cn",
"https://hub-mirror.c.163.com"
],
"dns": ["8.8.8.8", "114.114.114.114"]
}
"@ | Set-Content "$env:USERPROFILE\.docker\daemon.json"
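# Restart-Service applies when the engine runs as the Windows service named "docker";
# with Docker Desktop, restart it from the tray menu or Settings instead
Restart-Service docker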
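The PowerShell snippet above overwrites any existing daemon.json. If the file already holds other settings, merging may be safer. The following is a minimal sketch of that idea; Add-RegistryMirror is a hypothetical helper written for this guide, not a Docker command, and it works on JSON text so it can be tried without touching the real file.

```powershell
# Hypothetical helper (not part of Docker): merges mirror URLs into daemon.json text
function Add-RegistryMirror {
    param(
        [string]$Json = '{}',
        [Parameter(Mandatory = $true)] [string[]]$Mirror
    )
    $config = $Json | ConvertFrom-Json

    # Collect mirrors that are already configured, if any
    $existing = @()
    if ($config.PSObject.Properties.Name -contains 'registry-mirrors') {
        $existing = @($config.'registry-mirrors')
    }

    # Keep existing entries, append new ones, drop duplicates
    $merged = @(($existing + $Mirror) | Select-Object -Unique)
    $config | Add-Member -NotePropertyName 'registry-mirrors' -NotePropertyValue $merged -Force

    return ($config | ConvertTo-Json -Depth 10)
}

# Try it on sample text; other keys such as "dns" are preserved
$current = '{ "dns": ["8.8.8.8"] }'
Add-RegistryMirror -Json $current -Mirror 'https://docker.mirrors.ustc.edu.cn'
```

To apply it to the real file, read the text with Get-Content -Raw, pass it through the helper, and write the result back with Set-Content. Keeping a backup copy of daemon.json first is a sensible precaution.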
3. Crafting a Custom Pandoc Docker Image
To support math formulas, chemical notation, diagrams, and Chinese text, extend the official Pandoc + LaTeX image.
# Base image: Pandoc with LaTeX. The apt-get commands below need a Debian/Ubuntu base;
# the default pandoc/latex tag is Alpine-based, so the -ubuntu variant is assumed here.
FROM pandoc/latex:3.1.13-ubuntu
# Install Node.js and Mermaid CLI
RUN apt-get update && \
    apt-get install -y nodejs npm && \
    npm install -g @mermaid-js/mermaid-cli@10.0.2
# Install Python filters
RUN apt-get install -y python3-pip && \
    pip3 install pandoc-mermaid-filter pandoc-mhchem
# Add Chinese font support
RUN apt-get install -y fonts-noto-cjk
WORKDIR /data
Save as 「Dockerfile」, then build:
docker build -t pandoc-advanced .
This image, named pandoc-advanced, bundles all required tools.
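Before converting anything, it can help to confirm that the tools actually landed in the image. The checks below are a sketch under two assumptions: the image keeps the base image's pandoc entrypoint (so other programs need --entrypoint), and it is Debian/Ubuntu-based, as the apt-get lines in the Dockerfile imply.

```powershell
# Pandoc is the entrypoint, so arguments go straight to it
docker run --rm pandoc-advanced --version

# Other tools need the entrypoint overridden
docker run --rm --entrypoint mmdc pandoc-advanced --version
docker run --rm --entrypoint pip3 pandoc-advanced show pandoc-mermaid-filter

# Confirm the CJK font package is installed (dpkg exists only on Debian/Ubuntu bases)
docker run --rm --entrypoint dpkg pandoc-advanced -s fonts-noto-cjk
```

If any of these fail, rebuilding with docker build --no-cache and reading the build log is usually the quickest way to find the step that broke.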
4. Preparing Sample Markdown Content
Create 「input.md」 with the following:
---
mainfont: Noto Sans CJK SC
mathfont: Noto Sans CJK SC
---
# Scientific Document Demo
## 1. Chemical Equation
\ce{2H2 + O2 -> 2H2O}
## 2. Physics Formula
$$
F = \frac{G m_1 m_2}{r^2}
$$
## 3. Process Flow
```mermaid
graph TD
A[Start] --> B[Process Data]
B --> C{Decision}
C -->|Yes| D[Option 1]
C -->|No| E[Option 2]
D --> F[End]
E --> F
```
## 4. Complex Chemistry
\ce{Zn^2+ <=>[+ 2OH-][+ 2H+] Zn(OH)2 v}
This file includes metadata for Chinese fonts, inline chemistry, LaTeX math, and a Mermaid diagram.
5. One-Click Conversion Script
Create 「convert.ps1」 in the same folder:
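param(
    [Parameter(Mandatory=$true)] [string]$InputFile,
    [string]$OutputFile = [System.IO.Path]::ChangeExtension($InputFile, 'docx')
)
# Create the output folder if it does not exist yet
$outputDir = 'output'
if (-not (Test-Path $outputDir)) {
    New-Item -ItemType Directory -Path $outputDir | Out-Null
}
# Stop early if the input file is missing
if (-not (Test-Path $InputFile)) {
    Write-Error "File not found: $InputFile"
    exit 1
}
# Run conversion in Docker.
# PowerShell continues lines with a backtick, not a backslash.
# The pandoc/latex base image already uses pandoc as its entrypoint,
# so only the arguments follow the image name.
# tex_mhchem is not a Pandoc extension; \ce{} is left to the pandoc-mhchem filter.
docker run --rm `
    -v "${PWD}:/data" `
    pandoc-advanced `
    "/data/$InputFile" `
    --from=markdown+tex_math_single_backslash `
    --to=docx `
    --filter pandoc-mermaid `
    --filter pandoc-mhchem `
    --resource-path="/data" `
    --standalone `
    --output="/data/$outputDir/$OutputFile"
# Report the result based on Docker's exit code and the presence of the output file
if ($LASTEXITCODE -eq 0 -and (Test-Path "$outputDir/$OutputFile")) {
    Write-Host "Conversion succeeded: $outputDir\$OutputFile"
} else {
    Write-Error "Conversion failed"
}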
「Usage」:
Set-ExecutionPolicy Bypass -Scope Process -Force
.\convert.ps1 -InputFile 'input.md'
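When a folder holds several Markdown files, it can be useful to preview the exact Docker command for each one before running anything. The sketch below assumes the image's entrypoint is pandoc; Get-PandocDockerArgs and its parameters are hypothetical names introduced for illustration, not part of Docker or Pandoc.

```powershell
# Hypothetical helper: builds the docker argument list for one Markdown file
function Get-PandocDockerArgs {
    param(
        [Parameter(Mandatory = $true)] [string]$InputFile,
        [string]$OutputDir = 'output',
        [string]$Image = 'pandoc-advanced',
        [string]$HostDir = (Get-Location).Path
    )
    $outName = [System.IO.Path]::ChangeExtension($InputFile, 'docx')
    return @(
        'run', '--rm',
        '-v', "${HostDir}:/data",
        $Image,
        "/data/$InputFile",
        '--from=markdown+tex_math_single_backslash',
        '--to=docx',
        '--filter', 'pandoc-mermaid',
        '--filter', 'pandoc-mhchem',
        '--resource-path=/data',
        '--standalone',
        "--output=/data/$OutputDir/$outName"
    )
}

# Preview the command for every Markdown file in the current folder
foreach ($file in Get-ChildItem -Filter *.md) {
    $dockerArgs = Get-PandocDockerArgs -InputFile $file.Name
    Write-Host "docker $($dockerArgs -join ' ')"
    # To run the conversion instead of previewing it: & docker @dockerArgs
}
```

Passing the arguments as an array (splatted with @dockerArgs) sidesteps most quoting problems with paths that contain spaces. The output folder must exist before the conversion runs.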
6. Verifying the Output
Open 「output/input.docx」. You should see:
- High-quality rendering of the chemical equation.
- Clear LaTeX math formulas.
- Embedded Mermaid flowchart.
- Correct display of complex chemical notation.
If something is missing, move to the troubleshooting section.
7. Advanced Options
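- 「Custom Reference Document」: Generate Pandoc's default reference document inside the container. Use -o rather than shell redirection, which can corrupt the binary file in PowerShell:
docker run --rm -v "${PWD}:/data" pandoc-advanced -o /data/custom-reference.docx --print-default-data-file reference.docx
Edit its styles in Word, then add --reference-doc='/data/custom-reference.docx' to your command.
- 「Table of Contents & Numbering」: Append --toc --number-sections.
- 「Font Metadata」: Already shown in the sample. Note that mainfont and mathfont apply to LaTeX-based (PDF) output; for Word output, fonts come from the styles in the reference document.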
8. Troubleshooting
8.1 Image Pull Timeout
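- Test connectivity:
Test-NetConnection registry-1.docker.io -Port 443
- Use offline load. On a machine that can pull the base image, save it to a tar file, copy the file over, then load it (use the same tag as the FROM line of your Dockerfile):
docker save -o pandoc.tar pandoc/latex:3.1.13
docker load -i pandoc.tar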
8.2 pandoc: withBinaryFile: does not exist
This error means Pandoc cannot see the input file inside the container.
- Ensure correct volume mapping: "${PWD}:/data".
- Verify Docker disk sharing settings.
8.3 Mermaid Diagram Not Rendering
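Check that the filter is installed and that Mermaid CLI can render a diagram on its own. The image's entrypoint is pandoc, so other tools need --entrypoint:
docker run --rm --entrypoint pip3 pandoc-advanced show pandoc-mermaid-filter
docker run --rm --entrypoint mmdc -v "${PWD}:/data" pandoc-advanced -i /data/input.mmd -o /data/output.png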
8.4 Font or Encoding Issues
- Confirm installation of fonts-noto-cjk.
- Save Markdown as UTF-8 without BOM.
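Some Windows editors add a byte order mark (BOM) when saving as UTF-8. The sketch below shows one way to detect and strip it using .NET file APIs; Test-Utf8Bom and Remove-Utf8Bom are hypothetical helper names introduced here, and the rewrite step assumes the file is UTF-8 text.

```powershell
# Hypothetical helper: returns $true when the file starts with the UTF-8 BOM (EF BB BF)
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)] [string]$Path)
    $full = (Resolve-Path $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($full)
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF)
}

# Hypothetical helper: rewrites the file as UTF-8 without a BOM
function Remove-Utf8Bom {
    param([Parameter(Mandatory = $true)] [string]$Path)
    $full = (Resolve-Path $Path).ProviderPath
    # ReadAllText detects the BOM and excludes it from the returned string
    $text = [System.IO.File]::ReadAllText($full)
    # UTF8Encoding($false) writes UTF-8 with no BOM
    [System.IO.File]::WriteAllText($full, $text, (New-Object System.Text.UTF8Encoding($false)))
}

# Usage: strip the BOM from input.md only when one is present
if ((Test-Path 'input.md') -and (Test-Utf8Bom -Path 'input.md')) {
    Remove-Utf8Bom -Path 'input.md'
}
```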
9. Conclusion
This Docker + Pandoc workflow empowers you to produce professional Word documents from Markdown sources containing formulas, chemistry, and diagrams. By encapsulating all tools in a container, you avoid local installation headaches, enjoy cross-platform consistency, and gain a scalable, repeatable pipeline suitable for academic, technical, or educational publishing.
Embrace this streamlined approach to focus on content creation rather than environment setup. Happy writing and documenting!