PDF3MD: Deploy a Local Docker Web App for PDF to Markdown & Word Conversion

高效码农

2 months ago

PDF3MD converts PDFs to Markdown and Markdown to Word via a local Docker web app.

The Ultimate Guide to PDF3MD: Deploying a Professional PDF to Markdown and Word Converter

Moving content between PDF documents, Markdown text, and Microsoft Word files is a routine part of content creation and documentation work. Whether you are archiving technical papers, preparing documentation for a static site generator, or drafting reports with collaborators, incompatible file formats can stall the work.
PDF3MD is a web application built for this kind of document conversion. It pairs a React frontend with a Python Flask backend, and its interface offers drag-and-drop upload, multi-file processing, and real-time progress tracking.
This guide covers how PDF3MD works and how to deploy it: the underlying technology, deployment with Docker, manual setup, and the networking details you need before exposing it beyond localhost.

Core Capabilities and Technical Architecture

Before diving into the installation, it is crucial to understand what makes PDF3MD tick. The application is engineered to handle two primary conversion tasks with high precision:

PDF to Markdown Conversion: It transforms complex PDF documents into clean, readable Markdown. This process preserves structural elements, ensuring that headings, paragraphs, and lists remain intact.
Markdown to Word (DOCX) Conversion: It also converts user-provided Markdown text into DOCX format. This step uses Pandoc, so the output keeps the structure and styling of the Markdown source.

The Technology Stack Behind the Efficiency

The reliability of PDF3MD stems from its carefully selected technology stack:

Frontend: The user interface is built using React and scaffolded with Vite, giving a responsive single-page interface.
Backend: The server-side logic is powered by Python using the Flask microframework. This choice allows for lightweight yet powerful handling of file uploads and background processing tasks.
PDF Processing: PDF extraction is handled by PyMuPDF4LLM. This library parses the PDF content and emits structured Markdown, aiming to keep headings, lists, and other formatting.
Markdown to DOCX Conversion: Word documents are generated with Pandoc, a widely used document converter, so the resulting .docx files are ready for editing in Word.

Key Features That Enhance Productivity

Beyond simple conversion, PDF3MD integrates several features designed for professional workflows:

Multi-File Upload: You are not limited to converting one file at a time. The interface supports uploading and processing multiple PDF files simultaneously, significantly speeding up bulk tasks.
Real-Time Progress Tracking: For each file processed, the interface provides detailed status updates. You no longer have to guess if the conversion is hanging or completed; the progress bars keep you informed.
Drag & Drop Interface: The user experience is streamlined with a drag-and-drop zone, alongside traditional file selection buttons.
File Metadata: Upon completion, the system displays essential information such as the original filename, file size, page count, and the exact timestamp of the conversion.

Deployment Strategy 1: Docker (Recommended)

For most users, especially those aiming for a production-ready environment or a quick local test, utilizing the pre-built Docker images is the most efficient path. This method isolates dependencies and ensures consistency across different machines.

Prerequisites

To proceed with the Docker deployment, ensure your system has Docker Engine and Docker Compose installed. Docker Compose is typically included with Docker Desktop installations.

Step 1: Preparing the Configuration Files

The deployment requires two specific files to be placed in a dedicated directory.

Create a Directory:
Open your terminal and create a clean workspace for the application:
```
mkdir pdf3md-app && cd pdf3md-app
```

Create docker-compose.yml:
Inside this directory, create a file named docker-compose.yml. This YAML file orchestrates the services. Paste the following configuration into it:

services:
  backend:
    image: docker.io/learnedmachine/pdf3md-backend:latest 
    container_name: pdf3md-backend
    ports:
      - "6201:6201"
    environment:
      - PYTHONUNBUFFERED=1
      - FLASK_ENV=production
      - TZ=America/Chicago
    volumes:
      - ./pdf3md/temp:/app/temp # Creates a local temp folder for backend processing if needed
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:6201/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
  frontend:
    image: docker.io/learnedmachine/pdf3md-frontend:latest 
    container_name: pdf3md-frontend
    ports:
      - "3000:3000"
    depends_on:
      - backend
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
networks:
  default:
    name: pdf3md-network

Configuration Breakdown:

Backend Service: This service maps port 6201 on the host to port 6201 in the container and sets the Flask environment to production. Its health check runs curl against http://localhost:6201/ every 30 seconds with a 10-second timeout; after 3 consecutive failures the container is marked unhealthy. Failures during the initial 40-second start_period do not count toward that limit. The restart: unless-stopped policy restarts the container automatically unless you stop it yourself.
Frontend Service: This maps port 3000 and lists the backend under depends_on, so the backend container is started first. In this form depends_on controls start order only; it does not wait for the backend to pass its health check. The frontend checks its own health with wget on port 3000.
Volumes: The backend mounts the local directory ./pdf3md/temp at /app/temp inside the container, so temporary files from backend processing are kept on the host if needed.
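
To make the retries setting concrete, here is a minimal Python sketch of how a sequence of probe results turns into a health status. It is an illustration of the rule described above, not Docker's implementation: the function name health_status is hypothetical, and the sketch ignores interval, timeout, and start_period.

```python
from typing import Iterable


def health_status(probe_results: Iterable[bool], retries: int = 3) -> str:
    """Fold a sequence of probe outcomes into a container health status.

    A success resets the failure counter and marks the container healthy.
    `retries` consecutive failures mark it unhealthy.
    """
    status = "starting"
    consecutive_failures = 0
    for succeeded in probe_results:
        if succeeded:
            consecutive_failures = 0
            status = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                status = "unhealthy"
    return status


if __name__ == "__main__":
    # Two failures followed by a success: still healthy.
    print(health_status([False, False, True]))
    # Three failures in a row after a success: unhealthy.
    print(health_status([True, False, False, False]))
```

The point of the counter reset is that isolated failures, such as a single slow response, do not flip the container to unhealthy.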

Download docker-start.sh:
You need the management script to easily control the application. Download the docker-start.sh script directly from the pdf3md GitHub repository’s main branch and place it in your pdf3md-app directory.

Once downloaded, you must make it executable:
```
chmod +x ./docker-start.sh
```

Step 2: Launching the Application

With the files in place, starting the application is straightforward.

Standard Production Start:
Run the following command in the directory containing your files:
```
./docker-start.sh start
```
This command pulls the latest images from Docker Hub and initializes the services defined in your compose file.

Custom Domain/IP Start:
If you need to access the application from other devices on your Local Area Network (LAN) or via a specific domain:
```
./docker-start.sh start example.com
```
This is particularly useful when the frontend needs to communicate with the backend using a specific hostname visible to other clients on the network.

Step 3: Accessing PDF3MD

Once the containers are up and running, you can access the services via your browser:

Default Settings:
- Frontend Interface: http://localhost:3000
- Backend API: http://localhost:6201

Custom Domain (e.g., using example.com):
- Frontend Interface: http://example.com:3000
- Backend API: http://example.com:6201

Managing the Docker Services

The docker-start.sh script provides several utility commands to manage your deployment lifecycle:

./docker-start.sh status: Check the current status of the running containers.
./docker-start.sh stop: Gracefully stop all services.
./docker-start.sh logs: View the output logs from the frontend and backend, which is essential for debugging.
./docker-start.sh rebuild dev example.com: Rebuild the development environment with a specified domain.
./docker-start.sh help: Display all available command options.

Deployment Strategy 2: Development Mode (Docker with Hot-Reload)

For developers who intend to modify the source code, a development mode is available. This mode utilizes the docker-compose.dev.yml file and enables hot-reloading, meaning changes to your code are reflected immediately without restarting containers.

1. Clone the Repository

Development requires full access to the source code:

git clone https://github.com/murtaza-nasir/pdf3md.git
cd pdf3md

Note that cloning the repository includes the necessary docker-compose.dev.yml file.

2. Start Development Environment

Execute the script with the dev flag:

./docker-start.sh dev

This mounts your local source code into the containers and starts the Vite dev server.

Accessing Dev Mode:
- Frontend (with Hot-Reload): http://localhost:5173
- Backend API: http://localhost:6201

You can also specify a custom IP for LAN testing in dev mode:

./docker-start.sh dev 192.168.1.100

Deployment Strategy 3: Manual Setup (Without Docker)

If you prefer to run the services natively on your machine or Docker is not an option, you can set up the frontend and backend manually. This path is often chosen for deep debugging or preference for native environments.

Prerequisites

Python 3.8 or higher
Node.js 16 or higher
Git
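
If you want to check these prerequisites before starting, a short Python sketch along the following lines may help. The helper names (python_ok, missing_tools, parse_node_major) are hypothetical and not part of PDF3MD; the sketch assumes node and git are expected on PATH under those names.

```python
import shutil
import subprocess
import sys
from typing import List, Sequence, Tuple


def python_ok(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_tools(tools: Sequence[str] = ("node", "git")) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def parse_node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v18.17.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    missing = missing_tools()
    print("Missing tools:", missing or "none")
    if "node" not in missing:
        output = subprocess.run(
            ["node", "--version"], capture_output=True, text=True, check=True
        ).stdout
        print("Node.js 16+:", parse_node_major(output) >= 16)
```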

Step 1: Clone the Repository

git clone https://github.com/murtaza-nasir/pdf3md.git
cd pdf3md

Step 2: Backend Setup (Terminal 1)

Navigate to the backend application directory and install the required Python dependencies:

cd pdf3md # If you are in the root, enter the subdirectory
pip install -r requirements.txt

Start the Flask server:

python app.py

The backend will listen on http://localhost:6201.

Step 3: Frontend Setup (Terminal 2)

Open a new terminal window. Navigate to the frontend directory and install Node.js dependencies:

cd path/to/your/cloned/pdf3md/pdf3md # Adjust path as necessary
npm install

Start the development server:

npm run dev

The frontend will be available at http://localhost:5173.

Convenience Scripts

To simplify manual management, the pdf3md sub-directory contains convenience scripts:

./start_server.sh: Starts both frontend and backend simultaneously.
./stop_server.sh: Stops both services.

Ensure these scripts are executable before use:

chmod +x ./start_server.sh ./stop_server.sh

How to Use PDF3MD

Once your application is running, whether on port 3000 (production) or 5173 (development), using the tool is straightforward.

Converting PDF to Markdown

Upload: Open the application. Drag and drop one or multiple PDF files into the designated upload zone, or click to select files from your file system.
Monitor: Watch the real-time progress bars as the application processes each file.
Export: Once conversion is complete, the Markdown text is displayed in the interface. Click the Copy button to transfer the text to your clipboard.

Converting Markdown to Word

Switch Mode: Locate and toggle the “MD → Word” mode within the application interface.
Input: Paste or type your Markdown content into the provided text area.
Download: Click the Download as Word button. The application processes the text using Pandoc and generates a .docx file for you to download.
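
For readers curious what the Pandoc step can look like, the following Python sketch converts Markdown text to a .docx file by calling the pandoc command-line tool. How PDF3MD invokes Pandoc internally is not shown in this guide, so treat this as an assumption-laden illustration: the function names are hypothetical, and the sketch requires pandoc to be installed and on PATH.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Union


def pandoc_args(docx_path: Union[str, Path]) -> List[str]:
    """Build a pandoc command that reads Markdown from stdin and writes DOCX."""
    return ["pandoc", "--from", "markdown", "--to", "docx", "--output", str(docx_path)]


def markdown_to_docx(markdown_text: str, docx_path: Union[str, Path]) -> Path:
    """Convert Markdown text to a .docx file using the pandoc CLI."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    # pandoc reads from standard input when no input file is given.
    subprocess.run(
        pandoc_args(docx_path),
        input=markdown_text.encode("utf-8"),
        check=True,
    )
    return Path(docx_path)


if __name__ == "__main__":
    result = markdown_to_docx("# Report\n\nGenerated from **Markdown**.\n", "report.docx")
    print("Wrote", result)
```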

Advanced Configuration: Networking and Reverse Proxy

For production deployments, simply running on localhost is often insufficient. You may need to serve the application via a domain name or expose it to your Local Area Network (LAN).

Network Configuration Assumptions

The Docker Compose setup is designed with specific networking behaviors:

Same Host Access: When accessed from the host machine, the frontend at http://localhost:3000 attempts to reach the backend at http://localhost:6201.
LAN Access: If accessed from another device (e.g., http://192.168.1.100:3000), the frontend attempts to connect to the backend at http://192.168.1.100:6201.
- Crucial Note: This requires your host’s firewall to allow incoming connections on port 6201 from other LAN devices.
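
The rule above (same host, backend port 6201) can be sketched in Python as follows. The real logic lives in the frontend's JavaScript and is not reproduced in this guide, so the function name backend_url_for and its exact behavior are assumptions made for illustration; the sketch does not handle IPv6 literals.

```python
from urllib.parse import urlsplit

BACKEND_PORT = 6201  # backend port used throughout this guide


def backend_url_for(frontend_url: str) -> str:
    """Derive the backend base URL from the URL the frontend was loaded from."""
    parts = urlsplit(frontend_url)
    if not parts.scheme or parts.hostname is None:
        raise ValueError(f"not an absolute URL: {frontend_url!r}")
    # Keep the scheme and host, swap in the backend port.
    return f"{parts.scheme}://{parts.hostname}:{BACKEND_PORT}"


if __name__ == "__main__":
    print(backend_url_for("http://localhost:3000"))
    print(backend_url_for("http://192.168.1.100:3000/"))
```

This is why the firewall note matters: a browser on another LAN device connects to port 6201 on the host directly, not through the frontend container.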

Using a Reverse Proxy (e.g., Nginx)

If you are deploying behind a reverse proxy such as Nginx, Apache, Caddy, or Traefik, you must route traffic correctly. The frontend is designed to detect domain usage and will make API requests to /api/... on that domain. Your proxy must strip /api and forward the request to the backend.

Nginx Configuration Example

Assume your domain is http://pdf3md.local/. You need two location blocks:

Route / to the frontend service (port 3000).
Route /api/ to the backend service (port 6201).

location / {
    proxy_pass http://localhost:3000/;
    # Standard proxy headers
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}
location /api/ {
    # Note the trailing slash on proxy_pass, which strips the '/api' prefix
    proxy_pass http://localhost:6201/; 
    
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}

Key Proxy Considerations:

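Path Stripping: The backend expects requests without the /api prefix. In Nginx, the trailing slash on proxy_pass inside the /api/ location performs this rewrite; other proxies need an equivalent strip-prefix rule.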
SSL/TLS: It is highly recommended to configure SSL termination at your reverse proxy for secure data transmission.
CORS: The backend includes permissive CORS headers (Access-Control-Allow-Origin: *). However, if you encounter issues, verify that your proxy is not stripping these headers.
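
To see what the path stripping amounts to, this small Python sketch mimics the rewrite that the /api/ location with a trailing-slash proxy_pass performs. It only models the path mapping. The endpoint name /convert in the example is hypothetical and is not taken from the PDF3MD API.

```python
def strip_api_prefix(request_path: str, prefix: str = "/api/") -> str:
    """Map a public path under /api/ to the path the backend receives."""
    if not request_path.startswith(prefix):
        raise ValueError(f"{request_path!r} is not under {prefix!r}")
    # The matched prefix is replaced by "/", as with proxy_pass http://host/;
    return "/" + request_path[len(prefix):]


if __name__ == "__main__":
    # /convert is a made-up endpoint used only to show the mapping.
    print(strip_api_prefix("/api/convert"))
    print(strip_api_prefix("/api/"))
```

If you test your proxy and the backend logs show paths that still begin with /api, the prefix is not being stripped.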

Troubleshooting and FAQs

Even with the best documentation, issues can arise. Here are solutions to the most common questions and problems encountered when deploying PDF3MD.

How do I resolve port conflicts?

Ensure ports 3000 (frontend), 5173 (dev frontend), and 6201 (backend) are not occupied by other applications. If you suspect existing PDF3MD containers are holding the ports, run:

docker compose down

This stops and removes the containers, freeing the ports.
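
To find out which of the three ports is already taken, you can probe them from Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and it only checks for listeners reachable on 127.0.0.1.

```python
import socket
from typing import List, Sequence


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: Sequence[int] = (3000, 5173, 6201)) -> List[int]:
    """Return the subset of PDF3MD's default ports that are already occupied."""
    return [port for port in ports if port_in_use(port)]


if __name__ == "__main__":
    occupied = busy_ports()
    print("Occupied ports:", occupied or "none")
```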

Why do the manual setup scripts fail?

If start_server.sh or stop_server.sh fail to execute, the most likely cause is missing execute permission. From the repository root, make the scripts executable:

chmod +x pdf3md/start_server.sh pdf3md/stop_server.sh

What if Docker images won’t start?

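First, verify that the Docker Engine is running. If it is and your compose file builds the images locally (as in a cloned repository), rebuild them so that stale cached layers are not reused:

docker compose up --build

If you use the pre-built images from Deployment Strategy 1, the compose file has no build step, so run docker compose pull followed by docker compose up -d to fetch and start the latest images.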

How can I verify backend connectivity?

If the frontend loads but conversions fail, the backend might be unreachable. Check the browser console for errors. Ensure the Flask backend is running and listening on port 6201. You can test this directly in your browser by navigating to http://localhost:6201/.
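
The same check can be scripted. The sketch below is a minimal standard-library example, with a hypothetical function name, that reports whether anything answers HTTP at the backend address. It treats any HTTP response, including an error status, as proof that the server is reachable, which is looser than the curl -f health check in the compose file.

```python
import http.client
import urllib.error
import urllib.request


def backend_reachable(base_url: str = "http://localhost:6201/", timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status (4xx/5xx).
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, or a non-HTTP reply.
        return False


if __name__ == "__main__":
    if backend_reachable():
        print("Backend is answering on port 6201")
    else:
        print("Backend is not reachable; check that it is running and the port is open")
```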

Can I change the default backend port?

Yes. The backend defaults to port 6201, configurable in pdf3md/app.py. If you change this, you must also update the docker-compose.yml port mapping and the frontend’s Vite proxy configuration in pdf3md/vite.config.js.

How do I configure the timezone?

By default, the Docker container is set to `America/Chicago`. You can change this in the `docker-compose.yml` file by modifying the `TZ` environment variable to your specific region (e.g., `America/New_York` or `Europe/London`).

Licensing and Legal Compliance

PDF3MD is distributed under a dual-licensing model. It is vital to understand which license applies to your usage.

GNU Affero General Public License v3.0 (AGPLv3): This is the open-source license. You are free to use, study, modify, and distribute the software. However, if you run a modified version on a network server and provide users access to its functionality, you must also make the source code of your modified version available to them under the AGPLv3. You must include a LICENSE file in your repository with the full AGPLv3 text.
Commercial License: If you cannot or do not wish to comply with the AGPLv3 requirements (for instance, integrating PDF3MD into a proprietary commercial product without open-sourcing your modifications), a commercial license is available. You must contact the PDF3MD maintainers to obtain this license.

You must explicitly choose one of these licenses to use the software legally. Absence of a commercial license agreement implies strict adherence to AGPLv3 terms.

Contributing to the Project

The project currently welcomes feedback, bug reports, and feature suggestions via GitHub Issues. Future code contributions from external developers will require signing a Contributor License Agreement (CLA), which ensures the maintainers have the rights to distribute contributions under both licensing models.

Conclusion

PDF3MD addresses a common document-conversion need with a React frontend, a Flask backend, and two deployment paths: Docker for quick, reproducible setup and a manual install for customization. It covers both directions, extracting Markdown from PDFs for a documentation site and generating Word reports from Markdown.
By following the steps in this guide (configuring your docker-compose.yml, setting up your reverse proxy, and understanding the licensing requirements), you can have a working document conversion service running locally. Happy converting!
