PDF3MD converts PDFs to Markdown and Markdown to Word via a local Docker web app.
The Ultimate Guide to PDF3MD: Deploying a Professional PDF to Markdown and Word Converter
In the modern workflow of content creation and documentation, the ability to fluidly move between PDF documents, Markdown text, and Microsoft Word files is invaluable. Whether you are archiving technical papers, preparing documentation for a static site generator, or drafting reports in a collaborative environment, the rigidity of file formats can often halt productivity.
Enter PDF3MD, a web application designed specifically for high-fidelity document conversion. Built with a React-based frontend and a Python Flask backend, it does more than convert files: the interface offers real-time progress tracking, multi-file processing, and drag-and-drop uploads.
This guide serves as a definitive resource for understanding, deploying, and optimizing PDF3MD. We will walk through every aspect of the application, from the underlying technology to detailed deployment strategies using Docker and manual setup, ensuring you have the expertise to integrate this tool into your production environment.
Core Capabilities and Technical Architecture
Before diving into the installation, it is crucial to understand what makes PDF3MD tick. The application is engineered to handle two primary conversion tasks with high precision:
- PDF to Markdown Conversion: It transforms complex PDF documents into clean, readable Markdown. This process preserves structural elements, ensuring that headings, paragraphs, and lists remain intact.
- Markdown to Word (DOCX) Conversion: It reverses the process, converting user-provided Markdown text into DOCX format. This leverages the power of Pandoc to ensure the output maintains high fidelity to the original styling.
The Technology Stack Behind the Efficiency
The reliability of PDF3MD stems from its carefully selected technology stack:
- Frontend: The user interface is built using React and scaffolded with Vite. This combination ensures a highly responsive, modern, and intuitive user interface that feels like a native desktop application.
- Backend: The server-side logic is powered by Python using the Flask microframework. This choice allows for lightweight yet powerful handling of file uploads and background processing tasks.
- PDF Processing: The core engine for PDF extraction is PyMuPDF4LLM. This library is responsible for parsing PDF content accurately, converting it into structured Markdown without losing formatting nuances.
- Markdown to DOCX Conversion: For generating Word documents, the application utilizes Pandoc. Pandoc is the gold standard for document conversion, ensuring that your resulting .docx files are professional and ready for editing.
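To make the role of the two conversion engines concrete, here is a minimal Python sketch of how they are commonly driven. It is an illustration, not PDF3MD's actual backend code: the helper names `pdf_to_markdown`, `pandoc_docx_command`, and `markdown_to_docx` are hypothetical, and the sketch assumes the `pymupdf4llm` package is installed and the `pandoc` executable is on your PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def pdf_to_markdown(pdf_path: str) -> str:
    """Extract Markdown text from a PDF with PyMuPDF4LLM (hypothetical helper)."""
    # Imported lazily so the rest of this sketch loads without the package.
    import pymupdf4llm

    return pymupdf4llm.to_markdown(pdf_path)


def pandoc_docx_command(md_path: str, docx_path: str) -> List[str]:
    """Build the Pandoc command line for Markdown -> DOCX (hypothetical helper)."""
    return ["pandoc", md_path, "--from", "markdown", "--to", "docx",
            "--output", docx_path]


def markdown_to_docx(md_path: str) -> Path:
    """Run Pandoc and return the path of the generated .docx file."""
    docx_path = Path(md_path).with_suffix(".docx")
    # check=True raises CalledProcessError if Pandoc exits with an error.
    subprocess.run(pandoc_docx_command(md_path, str(docx_path)), check=True)
    return docx_path
```

Calling `pdf_to_markdown("paper.pdf")` would return the Markdown as a string, and `markdown_to_docx("notes.md")` would write `notes.docx` next to the input file. Both file names are placeholders.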
Key Features That Enhance Productivity
Beyond simple conversion, PDF3MD integrates several features designed for professional workflows:
- Multi-File Upload: You are not limited to converting one file at a time. The interface supports uploading and processing multiple PDF files simultaneously, significantly speeding up bulk tasks.
- Real-Time Progress Tracking: For each file processed, the interface provides detailed status updates. You no longer have to guess if the conversion is hanging or completed; the progress bars keep you informed.
- Drag & Drop Interface: The user experience is streamlined with a drag-and-drop zone, alongside traditional file selection buttons.
- File Metadata: Upon completion, the system displays essential information such as the original filename, file size, page count, and the exact timestamp of the conversion.
Deployment Strategy 1: Docker (Recommended)
For most users, especially those aiming for a production-ready environment or a quick local test, utilizing the pre-built Docker images is the most efficient path. This method isolates dependencies and ensures consistency across different machines.
Prerequisites
To proceed with the Docker deployment, ensure your system has Docker Engine and Docker Compose installed. Docker Compose is typically included with Docker Desktop installations.
Step 1: Preparing the Configuration Files
The deployment requires two specific files to be placed in a dedicated directory.
- Create a Directory: Open your terminal and create a clean workspace for the application:
mkdir pdf3md-app && cd pdf3md-app
- Create docker-compose.yml: Inside this directory, create a file named docker-compose.yml. This YAML file orchestrates the services. Paste the following configuration into it:
services:
  backend:
    image: docker.io/learnedmachine/pdf3md-backend:latest
    container_name: pdf3md-backend
    ports:
      - "6201:6201"
    environment:
      - PYTHONUNBUFFERED=1
      - FLASK_ENV=production
      - TZ=America/Chicago
    volumes:
      - ./pdf3md/temp:/app/temp # Creates a local temp folder for backend processing if needed
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:6201/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
  frontend:
    image: docker.io/learnedmachine/pdf3md-frontend:latest
    container_name: pdf3md-frontend
    ports:
      - "3000:3000"
    depends_on:
      - backend
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
networks:
  default:
    name: pdf3md-network
Configuration Breakdown:
  - Backend Service: This service maps port 6201 on the host to port 6201 in the container. It sets the environment to production and defines a health check that curls the localhost endpoint every 30 seconds, with a 10-second timeout per attempt. If the check fails 3 consecutive times, the container is marked unhealthy. The restart: unless-stopped policy ensures resilience.
  - Frontend Service: This maps port 3000 and depends on the backend, which controls startup order. It uses wget to check its own health on port 3000.
  - Volumes: The backend mounts a local directory ./pdf3md/temp to /app/temp inside the container, preserving temporary files locally if needed.
- Download docker-start.sh: You need the management script to easily control the application. Download the docker-start.sh script directly from the pdf3md GitHub repository’s main branch and place it in your pdf3md-app directory. Once downloaded, you must make it executable:
chmod +x ./docker-start.sh
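The health check settings in the compose file (retries: 3 in particular) are easier to reason about with a small model. The following Python sketch is a simplified illustration of the rule that a container becomes unhealthy only after a run of consecutive failed probes. The function name `health_states` is hypothetical, and the sketch deliberately ignores timing (interval, timeout) and the start_period grace window, during which Docker does not count failures.

```python
from typing import Iterable, List


def health_states(probe_results: Iterable[bool], retries: int = 3) -> List[str]:
    """Return the container health state after each probe.

    The state flips to "unhealthy" once `retries` consecutive probes fail;
    any successful probe resets the counter and marks the container "healthy".
    """
    states = []
    state = "starting"
    consecutive_failures = 0
    for succeeded in probe_results:
        if succeeded:
            consecutive_failures = 0
            state = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                state = "unhealthy"
        states.append(state)
    return states
```

For example, two failed probes followed by a success leave the container healthy, because the failure counter resets before it reaches three.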
Step 2: Launching the Application
With the files in place, starting the application is straightforward.
- Standard Production Start: Run the following command in the directory containing your files:
./docker-start.sh start
This command pulls the latest images from Docker Hub and initializes the services defined in your compose file.
- Custom Domain/IP Start: If you need to access the application from other devices on your Local Area Network (LAN) or via a specific domain:
./docker-start.sh start example.com
This is particularly useful when the frontend needs to communicate with the backend using a specific hostname visible to other clients on the network.
Step 3: Accessing PDF3MD
Once the containers are up and running, you can access the services via your browser:
- Default Settings:
  - Frontend Interface: http://localhost:3000
  - Backend API: http://localhost:6201
- Custom Domain (e.g., using example.com):
  - Frontend Interface: http://example.com:3000
  - Backend API: http://example.com:6201
Managing the Docker Services
The docker-start.sh script provides several utility commands to manage your deployment lifecycle:
- ./docker-start.sh status: Check the current status of the running containers.
- ./docker-start.sh stop: Gracefully stop all services.
- ./docker-start.sh logs: View the output logs from the frontend and backend, which is essential for debugging.
- ./docker-start.sh rebuild dev example.com: Rebuild the development environment with a specified domain.
- ./docker-start.sh help: Display all available command options.
Deployment Strategy 2: Development Mode (Docker with Hot-Reload)
For developers who intend to modify the source code, a development mode is available. This mode utilizes the docker-compose.dev.yml file and enables hot-reloading, meaning changes to your code are reflected immediately without restarting containers.
1. Clone the Repository
Development requires full access to the source code:
git clone https://github.com/murtaza-nasir/pdf3md.git
cd pdf3md
Note that cloning the repository includes the necessary docker-compose.dev.yml file.
2. Start Development Environment
Execute the script with the dev flag:
./docker-start.sh dev
This mounts your local source code into the containers and starts the Vite dev server.
- Accessing Dev Mode:
  - Frontend (with Hot-Reload): http://localhost:5173
  - Backend API: http://localhost:6201
You can also specify a custom IP for LAN testing in dev mode:
./docker-start.sh dev 192.168.1.100
Deployment Strategy 3: Manual Setup (Without Docker)
If you prefer to run the services natively on your machine or Docker is not an option, you can set up the frontend and backend manually. This path is often chosen for deep debugging or preference for native environments.
Prerequisites
- Python 3.8 or higher
- Node.js 16 or higher
- Git
Step 1: Clone the Repository
git clone https://github.com/murtaza-nasir/pdf3md.git
cd pdf3md
Step 2: Backend Setup (Terminal 1)
Navigate to the backend application directory and install the required Python dependencies:
cd pdf3md # If you are in the root, enter the subdirectory
pip install -r requirements.txt
Start the Flask server:
python app.py
The backend will listen on http://localhost:6201.
Step 3: Frontend Setup (Terminal 2)
Open a new terminal window. Navigate to the frontend directory and install Node.js dependencies:
cd path/to/your/cloned/pdf3md/pdf3md # Adjust path as necessary
npm install
Start the development server:
npm run dev
The frontend will be available at http://localhost:5173.
Convenience Scripts
To simplify manual management, the pdf3md sub-directory contains convenience scripts:
- ./start_server.sh: Starts both frontend and backend simultaneously.
- ./stop_server.sh: Stops both services.
Ensure these scripts are executable before use:
chmod +x ./start_server.sh ./stop_server.sh
How to Use PDF3MD
Once your application is running—whether on port 3000 (production) or 5173 (development)—using the tool is intuitive.
Converting PDF to Markdown
- Upload: Open the application. Drag and drop one or multiple PDF files into the designated upload zone, or click to select files from your file system.
- Monitor: Watch the real-time progress bars as the application processes each file.
- Export: Once conversion is complete, the Markdown text is displayed in the interface. Click the Copy button to transfer the text to your clipboard.
Converting Markdown to Word
- Switch Mode: Locate and toggle the “MD → Word” mode within the application interface.
- Input: Paste or type your Markdown content into the provided text area.
- Download: Click the Download as Word button. The application processes the text using Pandoc and generates a .docx file for you to download.
Advanced Configuration: Networking and Reverse Proxy
For production deployments, simply running on localhost is often insufficient. You may need to serve the application via a domain name or expose it to your Local Area Network (LAN).
Network Configuration Assumptions
The Docker Compose setup is designed with specific networking behaviors:
- Same Host Access: When accessed from the host machine, the frontend at http://localhost:3000 attempts to reach the backend at http://localhost:6201.
- LAN Access: If accessed from another device (e.g., http://192.168.1.100:3000), the frontend attempts to connect to the backend at http://192.168.1.100:6201.
  - Crucial Note: This requires your host’s firewall to allow incoming connections on port 6201 from other LAN devices.
Using a Reverse Proxy (e.g., Nginx)
If you are deploying behind a web server like Nginx, Apache, Caddy, or Traefik, you must route traffic correctly. The frontend is designed to detect domain usage and will make API requests to /api/... on that domain. Your proxy must strip /api and forward the request to the backend.
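The addressing behavior described in this section can be summarized in a few lines of Python. This is a sketch under a stated assumption, not the frontend's real logic (which lives in its JavaScript code): it assumes the deciding factor is whether the page URL carries an explicit port. The function name `backend_base_url` is hypothetical.

```python
from urllib.parse import urlsplit

BACKEND_PORT = 6201  # backend port used throughout this guide


def backend_base_url(page_url: str) -> str:
    """Guess the backend base URL the frontend would call (illustrative rule)."""
    parts = urlsplit(page_url)
    if parts.port is not None:
        # Direct access such as http://localhost:3000: same host, backend port.
        return f"{parts.scheme}://{parts.hostname}:{BACKEND_PORT}"
    # No explicit port: assume a reverse proxy exposing the backend under /api.
    return f"{parts.scheme}://{parts.hostname}/api"
```

Under this assumed rule, http://192.168.1.100:3000 resolves to http://192.168.1.100:6201, while a proxied domain such as http://pdf3md.local/ resolves to http://pdf3md.local/api.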
Nginx Configuration Example
Assume your domain is http://pdf3md.local/. You need two location blocks:
- Route / to the frontend service (port 3000).
- Route /api/ to the backend service (port 6201).
location / {
    proxy_pass http://localhost:3000/;
    # Standard proxy headers
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}
location /api/ {
    # Note the trailing slash on proxy_pass, which strips the '/api' prefix
    proxy_pass http://localhost:6201/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}
Key Proxy Considerations:
- Path Stripping: Ensure the proxy_pass directive handles the /api removal correctly.
- SSL/TLS: It is highly recommended to configure SSL termination at your reverse proxy for secure data transmission.
- CORS: The backend includes permissive CORS headers (Access-Control-Allow-Origin: *). However, if you encounter issues, verify that your proxy is not stripping these headers.
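The path stripping mentioned above follows from how Nginx treats a proxy_pass that includes a URI part: the portion of the request path that matched the location is replaced by that URI. The Python sketch below mimics this substitution so you can predict what the backend receives. The function name `upstream_path` is hypothetical, and `/api/convert` is a made-up example path, not a documented PDF3MD endpoint.

```python
def upstream_path(request_path: str, location: str = "/api/",
                  proxy_uri: str = "/") -> str:
    """Mimic Nginx's path rewrite when proxy_pass carries a URI part.

    The prefix matched by `location` is replaced with `proxy_uri`.
    """
    if not request_path.startswith(location):
        raise ValueError(
            f"{request_path!r} is not handled by location {location!r}"
        )
    return proxy_uri + request_path[len(location):]
```

With the configuration shown earlier, a request for /api/convert would reach the backend as /convert. Without the trailing slash on proxy_pass, Nginx would forward the original path unchanged, including the /api prefix.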
Troubleshooting and FAQs
Even with the best documentation, issues can arise. Here are solutions to the most common questions and problems encountered when deploying PDF3MD.
How do I resolve port conflicts?
Ensure ports 3000 (frontend), 5173 (dev frontend), and 6201 (backend) are not occupied by other applications. If you suspect existing PDF3MD containers are holding the ports, run:
docker compose down
This stops and removes the containers, freeing the ports.
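If you are unsure which port is occupied, a quick check from Python can tell you before you start the stack. This is a minimal sketch using only the standard library; the helper names `port_in_use` and `busy_ports` are hypothetical, and the check only detects services that accept TCP connections on the loopback address.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports=(3000, 5173, 6201)):
    """List which of the PDF3MD default ports are already taken."""
    return [port for port in ports if port_in_use(port)]
```

Calling `busy_ports()` returns the subset of 3000, 5173, and 6201 that already has a listener, which tells you which mapping to free up or change.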
Why do the manual setup scripts fail?
If start_server.sh or stop_server.sh fail to execute, it is almost certainly a permissions issue. Fix this by making the scripts executable:
chmod +x pdf3md/start_server.sh pdf3md/stop_server.sh
What if Docker images won’t start?
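First, verify that the Docker Engine is running. If it is, try rebuilding the images to ensure you have the latest layers and no cached corruption:
docker compose up --build
Note that --build only affects services that define a build section. For a compose file that references pre-built images only, such as the one in this guide, run docker compose pull to fetch the latest versions instead.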
How can I verify backend connectivity?
If the frontend loads but conversions fail, the backend might be unreachable. Check the browser console for errors. Ensure the Flask backend is running and listening on port 6201. You can test this directly in your browser by navigating to http://localhost:6201/.
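The same check can be scripted, which is handy on a headless server. The sketch below uses only the Python standard library; the function name `backend_reachable` is hypothetical, and it treats any HTTP response, including an error status, as proof that the server is listening.

```python
import urllib.error
import urllib.request


def backend_reachable(url: str = "http://localhost:6201/",
                      timeout: float = 5.0) -> bool:
    """Return True if the backend answers the request with any HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 404), so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar problems.
        return False
```

This is more lenient than the curl -f probe in the compose health check, which fails on HTTP error statuses. If `backend_reachable()` returns False, the problem is at the network or process level rather than in the conversion logic.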
Can I change the default backend port?
Yes. The backend defaults to port 6201, configurable in pdf3md/app.py. If you change this, you must also update the docker-compose.yml port mapping and the frontend’s Vite proxy configuration in pdf3md/vite.config.js.
How do I configure the timezone?
By default, the Docker container is set to America/Chicago. You can change this in the docker-compose.yml file by modifying the TZ environment variable to your specific region (e.g., America/New_York or Europe/London).
Licensing and Legal Compliance
PDF3MD is distributed under a dual-licensing model. It is vital to understand which license applies to your usage.
- GNU Affero General Public License v3.0 (AGPLv3): This is the open-source license. You are free to use, study, modify, and distribute the software. However, if you run a modified version on a network server and provide users access to its functionality, you must also make the source code of your modified version available to them under the AGPLv3. You must include a LICENSE file in your repository with the full AGPLv3 text.
- Commercial License: If you cannot or do not wish to comply with the AGPLv3 requirements (for instance, integrating PDF3MD into a proprietary commercial product without open-sourcing your modifications), a commercial license is available. You must contact the PDF3MD maintainers to obtain this license.
You must explicitly choose one of these licenses to use the software legally. Absence of a commercial license agreement implies strict adherence to AGPLv3 terms.
Contributing to the Project
While the project currently values feedback, bug reports, and feature suggestions via GitHub Issues, future code contributions from external developers will require signing a Contributor License Agreement (CLA). This ensures the maintainers have the rights to distribute contributions under both licensing models.
Conclusion
PDF3MD stands out as a professional, efficient, and user-friendly solution for the common yet complex problem of document conversion. By leveraging the power of Docker for easy deployment or manual setup for customization, it fits into any development workflow. Whether you need to extract Markdown from PDFs for a documentation site or generate Word reports from Markdown, the combination of a React frontend and Flask backend ensures a smooth, high-performance experience.
By following the steps outlined in this guide—configuring your docker-compose.yml, setting up your reverse proxy, and understanding the licensing requirements—you can deploy a robust document conversion service in minutes. Happy converting!
