Understanding Fooocus: An Open-Source Tool for Image Generation Based on Stable Diffusion XL

Have you ever wondered how to create stunning images from simple text descriptions without getting bogged down in technical settings? Fooocus is a software tool that makes this possible. It’s built on the Stable Diffusion XL framework and focuses on ease of use. As someone who works with technology and content creation, I find Fooocus appealing because it lets users concentrate on their ideas rather than complicated adjustments. In this post, we’ll explore what Fooocus offers, how to set it up, and its various features. Whether you’re a designer, hobbyist, or just curious about AI image tools, this guide will walk you through the essentials in a straightforward way.

Fooocus rethinks how image generators work. It’s offline, open source, and free, much like some online services but without the need for constant tweaks. You can generate images with just a few clicks after download, and it runs on hardware with at least 4GB of Nvidia GPU memory. Be cautious, though—there are fake sites popping up when searching for “Fooocus.” Stick to official sources to avoid issues.

What Makes Fooocus Stand Out in Image Generation?

At its core, Fooocus simplifies creating images from text. It uses an internal system based on GPT-2 to process prompts, ensuring good results even if your description is brief, like “house in garden,” or detailed up to 1000 words. This means you don’t need expert knowledge in prompt engineering or parameter tuning to get high-quality outputs.

The project is now in a limited long-term support phase, meaning updates will only fix bugs. The features are solid, thanks to contributions like those from mashb1t. There’s no plan to switch to newer models like Flux right now, but that could change with community developments. If you’re interested in newer options, consider tools like WebUI Forge or ComfyUI/SwarmUI. Several forks of Fooocus also exist for those wanting to experiment.

Fooocus is strictly non-commercial and runs offline. It has no official website; domains such as fooocus.com are not connected to the project.

Exploring Fooocus Features Through Comparisons

To understand Fooocus better, let’s look at how it matches up with popular tools like Midjourney and LeonardoAI. These comparisons highlight its strengths in text-to-image generation and editing.

Here’s a table comparing Fooocus to Midjourney features:

| Midjourney Feature | Fooocus Equivalent |
| --- | --- |
| High-quality text-to-image without much prompt engineering or tuning (method unknown) | High-quality text-to-image without much prompt engineering or tuning (uses an offline GPT-2 prompt processor and sampling enhancements for beautiful results from short or long prompts) |
| V1 V2 V3 V4 | Input Image -> Upscale or Variation -> Vary (Subtle) / Vary (Strong) |
| U1 U2 U3 U4 | Input Image -> Upscale or Variation -> Upscale (1.5x) / Upscale (2x) |
| Inpaint / Up / Down / Left / Right (Pan) | Input Image -> Inpaint or Outpaint -> Inpaint / Up / Down / Left / Right (uses a custom inpaint algorithm and models for better results than standard SDXL methods) |
| Image Prompt | Input Image -> Image Prompt (custom algorithm for better quality and prompt understanding than standard SDXL methods like IP-Adapters or Revisions) |
| --style | Advanced -> Style |
| --stylize | Advanced -> Advanced -> Guidance |
| --niji | Multiple launchers: “run.bat”, “run_anime.bat”, “run_realistic.bat” (supports SDXL models from Civitai) |
| --quality | Advanced -> Quality |
| --repeat | Advanced -> Image Number |
| Multi Prompts (::) | Use multiple lines of prompts |
| Prompt Weights | Use “I am (happy:1.5)”. Applies A1111’s reweighting for better results than ComfyUI when copying prompts from Civitai; for embeddings, use “(embedding:file_name:1.1)” |
| --no | Advanced -> Negative Prompt |
| --ar | Advanced -> Aspect Ratios |
| InsightFace | Input Image -> Image Prompt -> Advanced -> FaceSwap |
| Describe | Input Image -> Describe |
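The “(happy:1.5)” weight syntax in the comparison above follows the A1111 convention. As a rough, hypothetical sketch of how such weights can be read out of a prompt (Fooocus’s actual parser also handles nesting, escaping, and normalized emphasis):

```python
import re

# Hypothetical sketch of A1111-style weight parsing; not Fooocus's real code.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt: str):
    """Return (cleaned_prompt, [(phrase, weight), ...])."""
    weights = [(m.group(1), float(m.group(2))) for m in WEIGHT_RE.finditer(prompt)]
    # Strip the parentheses and weight, keeping only the phrase text.
    cleaned = WEIGHT_RE.sub(lambda m: m.group(1), prompt)
    return cleaned, weights

print(parse_weights("I am (happy:1.5)"))
# ('I am happy', [('happy', 1.5)])
```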

And for LeonardoAI:

| LeonardoAI Feature | Fooocus Equivalent |
| --- | --- |
| Prompt Magic | Advanced -> Style -> Fooocus V2 |
| Advanced Sampler Parameters (like Contrast/Sharpness/etc.) | Advanced -> Advanced -> Sampling Sharpness / etc. |
| User-friendly ControlNets | Input Image -> Image Prompt -> Advanced |

These tables show how Fooocus adapts common functions into a user-friendly interface. For more advanced options, check the project’s discussion sections.

Fooocus Example Interface

Step-by-Step Guide to Downloading and Installing Fooocus

Getting started with Fooocus is straightforward, especially on Windows. Here’s how to do it across different systems.

Installation on Windows

  1. Download the file from the official release: Fooocus_win64_2-5-0.7z.
  2. Unzip the file.
  3. Run “run.bat”.

On first launch, it downloads models automatically:

  • Default models go to “Fooocus\models\checkpoints” based on presets.
  • For inpainting, it downloads a 1.28GB control model to “Fooocus\models\inpaint\inpaint_v26.fooocus.patch”.

From version 2.1.60, you’ll also find “run_anime.bat” and “run_realistic.bat” for different presets, with automatic model downloads. Version 2.3.0 adds preset switching in the browser. Add flags like --disable-preset-selection to turn off browser selection, or --always-download-new-model to download models when switching.
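For example, a “run.bat” edited to disable browser preset selection might look like the following sketch (the launch line assumes the layout of the standard Windows 7z package; verify against your own copy before editing):

```bat
REM Modified run.bat: extra flags go on the launch line.
REM Path layout assumed from the standard Windows package.
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --disable-preset-selection
pause
```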

If you have models already, copy them to speed things up. If you see errors like “MetadataIncompleteBuffer” or “PytorchStreamReader,” redownload the files.

Testing on a laptop with 16GB RAM and 6GB VRAM (Nvidia 3060) shows about 1.35 seconds per iteration. If slow, try Nvidia Driver 531 for laptops or desktops.

Minimum setup: 4GB Nvidia VRAM and 8GB RAM. It uses Windows’ Virtual Swap, usually on by default. If you get “RuntimeError: CPUAllocator,” enable it manually and ensure 40GB free space per drive.

Virtual Swap Setup Guide

Using Fooocus on Colab

For cloud-based use (tested up to Aug 12, 2024), open the Colab notebook and modify the launch command, e.g., !python entry_with_update.py --share --always-high-vram, or add --preset anime for anime mode.

Free Colab limits resources, disabling some features like the refiner, but text-to-image works. --always-high-vram optimizes for T4 instances.

Linux Installation with Anaconda

  1. Clone the repo: git clone https://github.com/lllyasviel/Fooocus.git
  2. Navigate: cd Fooocus
  3. Create environment: conda env create -f environment.yaml
  4. Activate: conda activate fooocus
  5. Install requirements: pip install -r requirements_versions.txt

Run with python entry_with_update.py, or add --listen for remote access. Use --preset realistic for the realistic preset.

Linux with Python Venv

Requires Python 3.10.

  1. Clone and cd as above.
  2. Create venv: python3 -m venv fooocus_env
  3. Activate: source fooocus_env/bin/activate
  4. Install: pip install -r requirements_versions.txt
  5. Run: python entry_with_update.py

Linux Native Python

Similar, but use pip3 and python3 directly.

Linux on AMD GPUs

Uninstall the torch packages and reinstall the ROCm builds: pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6. Support is in beta.

Windows on AMD GPUs

Edit run.bat to uninstall the torch packages, install torch-directml, and add the --directml flag. Support is in beta.
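A sketch of the edited “run.bat” for DirectML, assuming the standard Windows package layout (package names here follow common DirectML setups; confirm against the official instructions):

```bat
REM Modified run.bat for AMD GPUs on Windows (beta, DirectML backend).
.\python_embeded\python.exe -m pip uninstall torch torchvision torchaudio torchtext functorch xformers -y
.\python_embeded\python.exe -m pip install torch-directml
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --directml
pause
```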

Mac Installation

For M1/M2 on macOS Catalina or later.

  1. Set up conda and PyTorch with MPS support.
  2. Clone and cd.
  3. Create and activate environment.
  4. Install requirements.
  5. Run python entry_with_update.py (some need --disable-offload-from-vram).

Processing is slower without dedicated GPU.

Docker Setup

Refer to docker.md for details.

Getting Older Versions

See the discussion guidelines for downloads.

Hardware Requirements for Running Fooocus

Fooocus prioritizes quality, so hardware matters. Here’s a breakdown:

| OS | GPU | Min GPU Memory | Min System Memory | Swap Needed | Notes |
| --- | --- | --- | --- | --- | --- |
| Windows/Linux | Nvidia RTX 4XXX | 4GB | 8GB | Yes | Fastest |
| Windows/Linux | Nvidia RTX 3XXX | 4GB | 8GB | Yes | Faster than RTX 2XXX |
| Windows/Linux | Nvidia RTX 2XXX | 4GB | 8GB | Yes | Faster than GTX 1XXX |
| Windows/Linux | Nvidia GTX 1XXX | 8GB (6GB uncertain) | 8GB | Yes | Slightly faster than CPU |
| Windows/Linux | Nvidia GTX 9XX | 8GB | 8GB | Yes | Varies vs CPU |
| Windows/Linux | Nvidia GTX < 9XX | N/A | N/A | N/A | Not supported |
| Windows | AMD GPU | 8GB | 8GB | Yes | 3x slower than Nvidia RTX 3XXX via DirectML |
| Linux | AMD GPU | 8GB | 8GB | Yes | 1.5x slower than Nvidia RTX 3XXX via ROCm |
| Mac | M1/M2 (MPS) | Shared | Shared | Shared | 9x slower than Nvidia RTX 3XXX |
| All | CPU only | 0GB | 32GB | Yes | 17x slower than Nvidia RTX 3XXX |

Fooocus won’t use smaller models to lower requirements, focusing on high-quality output.

Default Models and How Presets Work

Presets tailor Fooocus for tasks:

| Task | Windows Batch | Linux Flag | Main Model | Refiner | Config File |
| --- | --- | --- | --- | --- | --- |
| General | run.bat | (none) | juggernautXL_v8Rundiffusion | Not used | default.json |
| Realistic | run_realistic.bat | --preset realistic | realisticStockPhoto_v20 | Not used | realistic.json |
| Anime | run_anime.bat | --preset anime | animaPencilXL_v500 | Not used | anime.json |

Models download automatically or manually.

Preset Switching in Browser

Accessing the UI and Adding Security

Run locally, or use --listen (optionally with --port 8888) for local-network access, or --share for a public link.

For security, create auth.json with user/pass entries.
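A minimal auth.json might look like the following (the user/pass field names follow the project’s example file; treat this as a sketch and choose a strong password):

```json
[
    {"user": "admin", "pass": "change-this-password"}
]
```

With this file in place, the UI can require these credentials when exposed beyond localhost.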

Hidden Techniques in Fooocus

Fooocus has clever built-in methods for better results, based on SDXL:

  1. Prompt expansion using GPT-2 as “Fooocus V2” style.
  2. Refiner swap in one sampler, reusing base model data for coherence.
  3. Negative ADM guidance to add contrast in high-resolution levels.
  4. Adapted self-attention guidance to avoid smooth or plastic looks.
  5. Tweaked styles including “cinematic-default.”
  6. LoRA weights under 0.5 often improve over plain XL.
  7. Fine-tuned sampler parameters.
  8. Fixed resolutions for better positional encoding.
  9. No need for separate prompts for encoders.
  10. DPM samplers balance XL’s textures.
  11. System for multiple styles and expansions.
  12. Normalized emphasizing from Automatic1111.
  13. Refiner support for img2img and upscale.
  14. Corrections for CFG over 10.

These enhance output without user effort.

Customizing Your Fooocus Setup

After first run, edit config.txt for paths, defaults like model, LoRAs, CFG scale (3.0), sampler (dpmpp_2m), etc.

Example:

```json
{
    "path_checkpoints": "D:\\Fooocus\\models\\checkpoints",
    "default_model": "realisticStockPhoto_v10.safetensors",
    "default_loras": [["lora_filename_1.safetensors", 0.5], ["lora_filename_2.safetensors", 0.5]],
    "default_styles": ["Fooocus V2", "Fooocus Photograph", "Fooocus Negative"]
}
```

For more options, see the bundled tutorial file. Delete config.txt to reset everything. If you prefer simplicity, stick with the preset batch files.

The full flag list includes --listen, --port, --gpu-device-id, and many others.

Inline Features for Prompts

Wildcards

Example: “__color__ flower” picks a random entry from wildcards/color.txt for each generation. Ordered (instead of random) reading can be enabled in debug mode.
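As a rough, hypothetical sketch of the substitution (not Fooocus’s implementation), each “__name__” token is replaced by a random entry from the matching wildcard list:

```python
import random
import re

# Hypothetical sketch: Fooocus reads these entries from files under
# wildcards/ (e.g. color.txt); here they are hard-coded for illustration.
WILDCARDS = {"color": ["red", "blue", "golden"]}

def expand_wildcards(prompt: str, rng: random.Random) -> str:
    # Replace every __name__ token with a random entry from its list.
    return re.sub(
        r"__(\w+)__",
        lambda m: rng.choice(WILDCARDS[m.group(1)]),
        prompt,
    )

print(expand_wildcards("__color__ flower", random.Random(0)))
```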

Arrays

Example: “[[red, green, blue]] flower” generates one image per array element; set the image number to match the array length.
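A hypothetical sketch of that expansion (Fooocus’s real behavior may differ in details), turning one array prompt into one prompt per element:

```python
import re

# Hypothetical sketch: [[a, b, c]] in a prompt yields one prompt per element.
def expand_array(prompt: str) -> list[str]:
    match = re.search(r"\[\[(.+?)\]\]", prompt)
    if not match:
        return [prompt]
    items = [item.strip() for item in match.group(1).split(",")]
    # Substitute each element for the whole [[...]] block.
    return [prompt.replace(match.group(0), item, 1) for item in items]

print(expand_array("[[red, green, blue]] flower"))
# ['red flower', 'green flower', 'blue flower']
```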

Inline LoRAs

Example: “flower <lora:sunflowers:1.2>” applies the LoRA file “sunflowers” from the loras folder at weight 1.2.
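A hypothetical sketch of how such tags can be separated from the prompt text (not Fooocus’s actual code):

```python
import re

# Hypothetical sketch: pull <lora:name:weight> tags out of a prompt and
# return the cleaned prompt plus the LoRAs to load from the loras folder.
LORA_RE = re.compile(r"<lora:([^:>]+):([0-9.]+)>")

def extract_loras(prompt: str):
    loras = [(m.group(1), float(m.group(2))) for m in LORA_RE.finditer(prompt)]
    cleaned = LORA_RE.sub("", prompt).strip()
    return cleaned, loras

print(extract_loras("flower <lora:sunflowers:1.2>"))
# ('flower', [('sunflowers', 1.2)])
```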

Forks and Acknowledgments

Forks include Fooocus-Control, RuinedFooocus, Fooocus-MRE, mashb1t’s version.

Thanks to the contributors of the styles and the Canvas Zoom feature. Fooocus builds on Stable Diffusion WebUI and ComfyUI.

Updates in update_log.md.

Making Fooocus Multilingual

To translate the UI, add a JSON translation file to the language folder.

Use the --language flag to select it. A default translation loads automatically if present.
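For illustration, a hypothetical German translation file saved as language/de.json might begin like this (keys must match the UI strings exactly; these two entries are made-up examples):

```json
{
    "Generate": "Generieren",
    "Input Image": "Eingabebild"
}
```

It would then be selected with, e.g., python entry_with_update.py --language de.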

Frequently Asked Questions About Fooocus

What models does Fooocus support?

Stable Diffusion XL-based, including Civitai SDXL. Defaults like juggernautXL_v8Rundiffusion.

How to fix corrupted model files?

Errors like “MetadataIncompleteBuffer” or “PytorchStreamReader” usually indicate a corrupted download; redownload the affected files.

Can Fooocus run on basic hardware?

Check requirements table; below specs may not work.

Dealing with memory errors?

Enable the system swap file and make sure each drive has enough free space.

Is there a mobile version?

No, but apps like Draw Things support related features.

Switching presets in browser?

Supported from 2.3.0, with potential reloads.

Is Fooocus for business use?

No, fully non-commercial and open source.

For troubleshooting, see troubleshoot.md.

This overview gives a complete picture of Fooocus, from basics to advanced tweaks. It’s a tool that balances simplicity and power, ideal for creative work. If something isn’t clear, revisit the sections—it’s all about practical use.