Step-by-Step Guide to Fine-Tuning Your Own LLM on Windows 10 Using CPU Only with LLaMA-Factory
Introduction
Large Language Models (LLMs) have revolutionized AI applications, but accessing GPU resources for fine-tuning remains a barrier for many developers. This guide provides a detailed walkthrough for fine-tuning LLMs using only a CPU on Windows 10 with LLaMA-Factory 0.9.2. Whether you’re customizing models for niche tasks or experimenting with lightweight AI solutions, this tutorial ensures accessibility without compromising technical rigor.
Prerequisites and Setup
1. Install Python 3.12.9
Download the Python 3.12.9 installer from the official website. After installation, optionally clear pip's download cache:
pip cache purge
2. Create a Project Directory
Organize your workspace on a drive (e.g., Drive D:):
D:
mkdir lafa
3. Clone and Install LLaMA-Factory
Use Git to clone the repository and install dependencies:
cd D:\lafa
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
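To confirm the editable install succeeded before moving on, you can query the installed package from Python. This is a minimal sanity check; it assumes the project registers itself under the distribution name llamafactory, so adjust the name if pip lists it differently:
# Sanity check after the editable install: print the installed LLaMA-Factory version.
# Assumption: the distribution name is "llamafactory".
import importlib.metadata
print(importlib.metadata.version("llamafactory"))  # e.g. 0.9.2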
Dataset Preparation and Configuration
4. Set Up Custom Datasets
Create a datasets_mymodels folder in D:\lafa and copy in these required files (available in LLaMA-Factory's data folder):
- identity.json
- dataset_info.json (registers the datasets LLaMA-Factory can load; see the sketch below)
- c4_demo.jsonl
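If you later add your own training file to datasets_mymodels, it must also be registered in dataset_info.json before LLaMA-Factory will list it. The following is a minimal sketch of such a registration for an Alpaca-style file; the dataset name, file name, and column names are placeholders to adapt:
# Sketch: register a custom Alpaca-style dataset in dataset_info.json.
# "my_data" / "my_data.json" and the column names are placeholders, not files shipped with LLaMA-Factory.
import json
from pathlib import Path

info_path = Path(r"D:\lafa\datasets_mymodels\dataset_info.json")
info = json.loads(info_path.read_text(encoding="utf-8"))
info["my_data"] = {
    "file_name": "my_data.json",
    "columns": {"prompt": "instruction", "query": "input", "response": "output"},
}
info_path.write_text(json.dumps(info, indent=2, ensure_ascii=False), encoding="utf-8")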
5. Configure Default Paths
Edit LLaMA-Factory\src\llamafactory\webui\common.py to define your directories:
DEFAULT_DATA_DIR = "D:/lafa/datasets_mymodels" # Dataset storage
DEFAULT_SAVE_DIR = "D:/lafa/lafa_llms_created" # Model output
6. Install Dependencies
Navigate to the LLaMA-Factory root folder and run:
pip install -r requirements.txt
CPU-Specific Adjustments
7. Install CPU-Compatible PyTorch
Replace GPU-dependent PyTorch with the CPU version:
pip uninstall -y torch torchvision torchaudio
pip install torch==2.2.2+cpu torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cpu
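Before editing any scripts, it is worth confirming that the CPU-only build is the one Python actually picks up; a quick check:
# Verify the CPU-only PyTorch build: the version should report "+cpu" and CUDA should be unavailable.
import torch
print(torch.__version__)          # expected: 2.2.2+cpu
print(torch.cuda.is_available())  # expected: False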
8. Modify Training Scripts
Edit LLaMA-Factory\src\train.py to enforce CPU usage by adding this line before the def main(): definition:
device = torch.device('cpu')
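For context, the top of the edited file would look roughly like the sketch below (not a verbatim copy of the shipped script; the import torch line is needed if the script does not already import it):
# Sketch of the relevant part of src\train.py after the edit.
import torch  # required for torch.device

device = torch.device('cpu')  # force CPU before main() is defined

def main():
    ...  # original training entry point, unchanged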
9. Adjust WebUI Launch Parameters
In LLaMA-Factory\src\llamafactory\webui\runner.py, update the Popen command to include shell=True:
self.trainer = Popen(["llamafactory-cli", "train", save_cmd(args)], env=env, shell=True)
Launching Training and Monitoring
10. Start the WebUI Interface
From LLaMA-Factory\src, run:
webui --device cpu
Ignore the “CUDA environment not detected” warning. Monitor real-time progress in the command prompt.
11. Troubleshooting Common Issues
- Training Fails to Start: Verify the llamafactory-cli installation. Re-clone the repository if needed.
- Path Errors: Ensure paths in common.py use forward slashes (e.g., D:/lafa).
Model Export and Format Conversion
12. Export as Safetensors
Trained models are saved to LLaMA-Factory\src\saves\Custom\lora\ by default. To customize the output path, update DEFAULT_SAVE_DIR in common.py.
13. Merge Adapter with Base Model
Copy a configuration file (e.g., qwen2vl_lora_sft.yaml from LLaMA-Factory\examples\train_lora) to your model folder. Modify these entries:
model_name_or_path: "Path_to_HuggingFace_Base_Model"
adapter_name_or_path: "Path_to_Fine-Tuned_Adapter"
export_dir: "Output_Path_for_Merged_Model"
Run the export command:
llamafactory-cli export [your_config_file.yaml]
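If you prefer generating the config programmatically instead of hand-editing the copied YAML, the sketch below writes the three entries above plus fields the example merge configs typically carry. All paths are placeholders, the template value must match the base model you fine-tuned, and PyYAML is assumed to be available:
# Sketch: write a LoRA merge/export config for llamafactory-cli export. Paths are placeholders.
import yaml

config = {
    "model_name_or_path": "Path_to_HuggingFace_Base_Model",
    "adapter_name_or_path": "Path_to_Fine-Tuned_Adapter",
    "template": "qwen",          # assumption: use the template that matches your base model
    "finetuning_type": "lora",
    "export_dir": "Output_Path_for_Merged_Model",
    "export_device": "cpu",      # keep the merge on CPU, consistent with this guide
}

with open("my_merge_config.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(config, f, sort_keys=False)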
14. Convert to GGUF Format
Install llama.cpp and execute:
python llama.cpp/convert_hf_to_gguf.py [input_model_path] --outfile [output.gguf] --outtype q8_0
Note: If conversion fails, validate the config.json file for architecture compatibility.
Model Testing and Deployment
15. Load GGUF into LM Studio
Copy the GGUF file to LM Studio's model directory (e.g., D:\llm_for_lmstudio\lmstudio_models). The model will appear under "My Models" upon relaunching the software.
16. Validate Model Performance
Test domain-specific knowledge by querying the model. For example, if fine-tuned for medical QA, compare responses before and after training.
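Beyond interactive chatting, you can script before/after comparisons through LM Studio's OpenAI-compatible local server. The sketch below assumes the local server is enabled and listening on its default port 1234, and that the model identifier is whatever LM Studio reports for your loaded GGUF:
# Sketch: query a GGUF model served by LM Studio's local server (assumed default http://localhost:1234/v1).
import requests

payload = {
    "model": "my-finetuned-model",  # placeholder: use the identifier LM Studio shows for your model
    "messages": [{"role": "user", "content": "What are the first-line treatments for hypertension?"}],
    "temperature": 0.2,
}
resp = requests.post("http://localhost:1234/v1/chat/completions", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])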
Conclusion
This guide demonstrates that CPU-based LLM fine-tuning is not only feasible but also practical for resource-constrained environments. Key takeaways:
- Precision in Configuration: Path formatting and dependency versions are critical.
- Iterative Validation: Test workflows with small datasets before scaling.
- Future-Proofing: Monitor updates to LLaMA-Factory and llama.cpp for efficiency improvements.