The Future of Large Files in Git is Git
If Git had an arch-enemy, it would undoubtedly be large files. These unwieldy digital behemoths cause all sorts of headaches: they bloat Git’s storage, slow the git clone command to a crawl, and create all kinds of problems for the platforms that host Git repositories (known as Git forges).
Back in 2015, GitHub tried to solve this problem by releasing Git LFS—a special extension for Git that worked around the issues caused by large files. But while Git LFS helped, it also introduced new complications and added extra storage costs.
Meanwhile, the team behind the Git project itself has been quietly working on a better solution for handling large files. And even though Git LFS isn’t going away anytime soon, the latest updates to Git show a clear path forward—a future where, eventually, Git LFS will no longer be necessary.
What You Can Do Today: Replace Git LFS with Git Partial Clone
To understand why Git’s built-in solutions are gaining ground, let’s first look at how Git LFS works. Git LFS stores large files outside of your main repository. When you clone a project that uses Git LFS, you get the repository’s history and all the small files right away, but the large files are left out. Instead, Git LFS only downloads the large files you actually need for the version of the project you’re working on (your “working copy”).
In 2017, the Git project introduced a feature called partial clones that offers the same basic benefits as Git LFS. As the official Git documentation explains: “Partial clone allows us to avoid downloading [large binary assets] in advance during clone and fetch operations and thereby reduce download times and disk usage.”
Both Git’s partial clone and Git LFS deliver three key advantages:
- Smaller checkouts: When you clone a repository, you only get the most recent version of large files, not every single version that’s ever existed in the project’s history.
- Faster clones: By skipping large files during the initial clone, the process is much quicker.
- Easy setup: Unlike “shallow clones” (which only download part of the project’s history), partial clones give you the entire history of the project. This means you can start working right away without missing important context.
What Exactly Is a Partial Clone?
A Git partial clone is simply a clone operation that uses a --filter option. This filter tells Git which files to skip during the initial download.
For example, if you want to avoid downloading any files larger than 100KB, you’d use this command:
git clone --filter=blob:limit=100k <repo-url>
Later, if you need any of those large files (the ones over 100KB), Git will automatically download them “lazily”—meaning it only gets them when you actually need them for your work.
The Difference a Partial Clone Makes: A Real Example
To see why this matters, let’s look at a real scenario. Suppose there’s a repository with many different versions of a 25MB PNG image. If you clone this repository the usual way (without any filters), the process is slow, and the resulting folder takes up a lot of space:
$ time git clone https://github.com/thcipriani/noise-over-git
Cloning into '/tmp/noise-over-git'...
...
Receiving objects: 100% (153/153), 1.19 GiB
real 3m49.052s
In this case, it took almost four minutes just to clone a repository that’s mostly a single 25MB file! And the storage usage is even more frustrating:
$ du --max-depth=0 --human-readable noise-over-git/.
1.3G noise-over-git/.
Why 1.3GB? Because there are 50 different versions of that 25MB PNG, and since PNG data is already compressed, Git’s delta compression can’t shrink it, so each version is stored essentially in full. That’s a lot of wasted space for a single image.
But with a partial clone, things get much better. Let’s set up a simple shortcut (an alias) to make partial clones easier. We’ll call it pclone and set it to skip files larger than 100KB by default:
$ git config --global alias.pclone 'clone --filter=blob:limit=100k'
Now, using this alias to clone the same repository:
$ time git pclone https://github.com/thcipriani/noise-over-git
Cloning into '/tmp/noise-over-git'...
...
Receiving objects: 100% (1/1), 24.03 MiB
real 0m6.132s
The clone time dropped from almost four minutes to just 6 seconds—a 97% improvement! And the storage usage is dramatically better too:
$ du --max-depth=0 --human-readable noise-over-git/.
49M noise-over-git/.
That’s 49MB instead of 1.3GB—a 96% reduction in size. And this matches the size you’d get with a Git LFS checkout. Not bad for a built-in Git feature!
Are There Any Catches?
Like most solutions, partial clones have a few caveats. If you run a command that needs a file you filtered out (like git diff to compare two versions, git blame to see who changed a file, or git checkout to switch to an older version), Git will need to connect to the server to download that file.
But here’s the thing: this is exactly how Git LFS works too. And let’s be honest—when was the last time you needed to run git blame on a PNG image? For most large files (like images, videos, or big datasets), you rarely need to dig into their history, so this extra step is rarely an issue.
Why Bother? What’s Wrong with Git LFS?
Git LFS was a good idea when it came out, but it solves Git’s large file problems by passing the headaches on to users. Let’s break down the most significant issues:
1. High Vendor Lock-In
When GitHub created Git LFS, there were other tools for handling large files in Git—like Git Fat, Git Annex, and Git Media. What made these different was that they worked with any server. But GitHub designed Git LFS to work best with their own proprietary server setup, and they charged users to use it.
Over time, other Git forges (like GitLab) built their own LFS servers, and you can now use tools to push to multiple servers or use a special “transfer agent” to work around the lock-in. But all of this makes things more complicated for people contributing to your project. Unless you put in extra work to set things up, you’re effectively stuck with whichever platform you started using for LFS.
2. It’s Costly
GitHub became popular because it let people host repositories for free, but Git LFS started as a paid feature. These days, there’s a free tier, but you’re at the mercy of GitHub (or whatever platform you use) when it comes to pricing.
For example, storing a 50GB repository with Git LFS on GitHub costs around $13 per year. Over time, those costs add up—especially for projects with lots of large files.
3. Hard to Undo
Once you start using Git LFS in a repository, there’s no easy way to stop. If you decide you want to go back to storing large files directly in Git, you’d have to rewrite the repository’s entire history. This is risky because it can confuse collaborators, break old links, and make it hard to reference previous versions of the project.
4. Ongoing Setup Hassles
For a project using Git LFS, every person who works on it needs to install Git LFS on their computer. If they don’t, something confusing happens: instead of getting the actual large files, they get small text files filled with metadata (like pointers to where the real files are stored). This leads to confusion, wasted time, and frustration—especially for new contributors who might not know about the LFS requirement.
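For illustration, here is a sketch of what one of those pointer files contains, per the Git LFS pointer format (the filename, oid value, and size below are made-up placeholders):

```shell
# What a contributor without Git LFS installed sees in place of a large
# file: a tiny text pointer, not the real content. The oid below is a
# made-up placeholder, not a real hash.
cat <<'EOF' > big-video.mp4
version https://git-lfs.github.com/spec/v1
oid sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
size 26214400
EOF

wc -c big-video.mp4   # the "25MB video" is only ~130 bytes of metadata
```

A media player trying to open this “video” fails outright, which is usually the moment a new contributor discovers the project requires Git LFS.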
The Future: Git Large Object Promisors
Large files aren’t just a problem for users—they’re a headache for Git forges too. Platforms like GitHub and GitLab limit how big individual files can be (usually 100MB) because storing and serving large files costs more money. Git LFS helps reduce these costs by moving large files to content delivery networks (CDNs), but it shifts the work to users.
The Git project has a new solution in the works: large object promisors. These aim to give servers the same benefits as Git LFS but without making users jump through hoops. As the official documentation puts it: “This effort aims to especially improve things on the server side, and especially for large blobs that are already compressed in a binary format. This effort aims to provide an alternative to Git LFS.”
What Is a Large Object Promisor?
In simple terms, large object promisors are special Git “remotes” (servers) that are designed to store only large files. They work behind the scenes to handle the heavy lifting of managing large files, so users don’t have to do anything extra.
How Will It Work in the Future?
Here’s what the future could look like with large object promisors:
- You push a large file to your usual Git host (like GitHub or GitLab).
- In the background, your Git host automatically moves that large file to a large object promisor.
- When someone clones the repository, the Git host tells their Git client about the promisor.
- The client clones the main part of the repository from the Git host, and whenever it needs a large file, it automatically fetches it from the promisor—no extra setup required.
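Some of the plumbing for this flow exists in Git today: a partial clone already records its origin as a “promisor remote,” and you can configure additional promisor remotes by hand. A minimal sketch, assuming a hypothetical remote named lob and a placeholder URL:

```shell
set -e
repo=$(mktemp -d)/repo
git init -q "$repo"

# Hypothetical second remote dedicated to large objects (URL is a placeholder).
git -C "$repo" remote add lob https://example.com/large-objects.git

# Mark it as a promisor remote: Git may fetch missing objects from it lazily.
git -C "$repo" config remote.lob.promisor true
git -C "$repo" config remote.lob.partialCloneFilter blob:limit=100k

git -C "$repo" config --get remote.lob.promisor   # -> true
```

What the large object promisor work adds on top of this manual setup is the negotiation step: the server advertises the promisor to clients automatically, so nobody has to run these config commands themselves.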
It’s Still a Work in Progress
We’re not quite there yet. Git’s large object promisors are still being developed. Some parts of the feature were added to Git in March 2025, but there’s more work to do. For example, GitLab has ongoing projects to implement and refine the feature, and there are still open questions about how to handle certain edge cases.
For now, if you’re dealing with really large files (bigger than the 100MB limit most platforms set), you’re still stuck using Git LFS. But once large object promisors are widely adopted, maybe GitHub and other platforms will lift those limits—letting you push files bigger than 100MB without needing special tools.
The Future of Large Files in Git Is Git
The team behind Git is putting a lot of thought into solving the large file problem, so you don’t have to. Today, we’re still relying on Git LFS, with all its limitations. But that won’t last forever.
Soon, the biggest obstacle to using large files in Git might just be that little voice in your head—the one that says, “Is it really a good idea to put my entire MP3 collection in a Git repository?”
Git is evolving to handle large files natively, making it easier than ever to manage all kinds of projects—whether they’re filled with small text files or big binary assets. And that’s a future worth looking forward to.