Why Mercurial needs to become a great Git client ¶
For developers who find Git’s complexity overwhelming, Mercurial offers a more approachable alternative with a cleaner and safer interface. Yet despite its advantages, Mercurial faces a fundamental challenge: in today’s development landscape, dominated by Git, GitHub, and GitLab, any version control system must excel at working with Git repositories.
This post explores why Git compatibility is essential for Mercurial’s future, examines the current state of Git integration, and proposes a technical path forward.
The version control dilemma ¶
When teaching version control to non-professional developers, the obvious answer is Git—it’s the industry standard. However, Git’s complexity means many users learn only the bare minimum, leaving them confused when problems arise. While graphical interfaces help, they’re inherently limited in what they can do.
Alternative tools like Mercurial offer superior user experiences. Modern Mercurial, along with newer projects like Sapling and Jujutsu (jj), can be genuinely better than Git for certain workflows. But there’s a catch: developers still need to interact with Git repositories. Until Mercurial becomes excellent at this, it can’t serve as a true alternative for most users.
The Git compatibility imperative ¶
The centrality of Git in modern development cannot be overstated. Any competing version control system must provide robust Git compatibility to be viable. The newer alternatives understand this:
Sapling explicitly integrates with Git: “Sapling client supports cloning and interacting with Git repositories and can be used by individual developers to work with GitHub and other Git hosting services.”
Jujutsu describes itself as “A Git-compatible VCS that is both simple and powerful,” noting that “we use Git repositories as a storage layer to serve and track content, making it compatible with many of your favorite Git-based tools, right now!”
Mercurial’s situation is less promising.
Current solutions: not good enough ¶
Mercurial currently offers two ways to interact with Git repositories, both with significant limitations:
The core extension: hgext.git ¶
The hgext.git extension ships with Mercurial but requires cloning, pulling, and pushing through Git itself. Mercurial interacts with the local Git repository using the Python package pygit2.
While recent improvements have been made, the extension remains incompatible with recent pygit2 versions and is severely limited: it doesn’t even implement hg status!
The external extension: hg-git ¶
hg-git is what I currently use for Git interactions, and it works reasonably well. Using the Python package Dulwich, it maintains a Git repository inside the .hg directory.
However, this dual-repository approach creates significant problems:
- Performance and memory overhead: maintaining two local repositories is inefficient
- Scaling issues: large repositories expose performance problems
- Incompatibility with modern Mercurial: hg-git doesn’t work with the topics and evolve extensions that represent Mercurial’s modern workflow
- Poor branch mapping: Git branches become Mercurial bookmarks rather than Mercurial topics or named branches
- Unstable conversions: pull results aren’t consistent, preventing effective collaboration on the Mercurial side
- Maintenance burden: new Mercurial and Dulwich versions frequently break hg-git
These limitations create a vicious cycle: hg-git isn’t competitive, keeping it a niche solution with a small community and limited maintainer motivation. In my view, hg-git is doomed to stay a niche project with maintenance issues.
A path forward ¶
Currently, no solution exists for seamlessly working with remote Git repositories using modern Mercurial. But there could be a better approach: on-the-fly conversion of Git packfiles.
Instead of maintaining separate repositories, Mercurial could directly consume and create Git’s core data structures used for communications (“packfiles”). This approach could be:
- Memory efficient: no repository duplication
- Compatible: works with modern Mercurial features like topics and evolve
- Fast: direct packfile manipulation avoids the overhead of writing and re-reading an intermediate Git repository
- Relatively simple: only the exchange commands are impacted (mostly clone, pull and push)
We need a proof of concept to validate this approach, but the potential is clear.
Making it happen ¶
Git repository interaction must become native to modern Mercurial—fast and robust for the standard open-source development cases that matter most.
A medium-to-long-term goal would be that a session like
# note: no hg-git
uv tool install mercurial --with hg-evolve
# reasonably fast compared to Git
hg clone git@github.com:paugier/cpython.git
hg topic my-topic
# pure Mercurial, i.e. fast
hg st
hg commit
# push (Git packfiles)
hg push
# later...
# upstream is git@github.com:python/cpython.git
hg pull upstream
# modern Mercurial using evolve
hg amend
hg rebase
# -f needed because we interact with a Git repository
hg push -f
should work natively on every operating system (in particular, Windows wheels should finally become usable, which is another interesting subject).
The path from concept to reality could start with a Python prototype as a Mercurial extension. An internship project supervised by a Mercurial core developer could kickstart development. Modern tools like large language models could accelerate prototype implementation.
For Mercurial to thrive as more than a niche tool, it must become an excellent Git client. The technical foundation seems within reach; what’s needed now is commitment and execution.
I (Pierre Augier) would be motivated to help develop this prototype. The goal would be to implement enough functionality to determine whether this packfile-based approach is truly viable—or whether Mercurial’s Git integration should continue to be based on hg-git. A working proof of concept would give us the data needed to make an informed decision about Mercurial’s future direction.
Appendix: A technical summary about the packfile-based approach ¶
1. Git Objects and Packfiles ¶
Git Objects ¶
- Blob: Represents file contents (raw data).
- Tree: Represents directories (lists of blobs and subtrees).
- Commit: Points to a tree, parent commits, author/committer info, and a message.
- Tag: An annotated tag object, a named reference to a specific object (usually a commit).
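As a rough illustration of how these objects are identified, the sketch below computes Git object IDs with the standard library only: an ID is the SHA-1 of a small header (type and payload length) followed by the payload. The helper names git_object_id and tree_payload are mine, and the tree serializer is simplified (it sorts entries by raw name, which matches Git only for trees without subdirectories).

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to an object of the given type."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def tree_payload(entries: dict) -> bytes:
    """Serialize tree entries given as {name: (mode, hex_object_id)}.

    Simplification: entries are sorted by raw name, which matches Git's
    ordering only when the tree contains no subdirectories.
    """
    chunks = []
    for name in sorted(entries):
        mode, hex_id = entries[name]
        chunks.append(mode + b" " + name + b"\x00" + bytes.fromhex(hex_id))
    return b"".join(chunks)


blob_id = git_object_id("blob", b"hello\n")
tree_id = git_object_id("tree", tree_payload({b"hello.txt": (b"100644", blob_id)}))
empty_tree_id = git_object_id("tree", b"")
print(blob_id)        # same value as: echo hello | git hash-object --stdin
print(tree_id)
print(empty_tree_id)  # the well-known ID of the empty tree
```

Because the ID depends on the type header as well as the content, the same bytes stored by Mercurial get a different identifier, which is why a conversion layer has to keep a mapping between the two.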
Packfiles ¶
- What they are: Compressed binary files storing Git objects (blobs, trees, commits, tags) using delta encoding.
- Purpose: Save space and improve performance during clone/pull/push.
- Location: Stored in .git/objects/pack/ as .pack (data) and .idx (index) files.
- When created: During clone, pull, fetch, and push, and when Git repacks a repository. Received packs holding at least transfer.unpackLimit objects (100 by default) are not unpacked: Git reads objects directly from them.
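To make the packfile layout more concrete, here is a minimal sketch that builds and parses a tiny version-2 pack in pure Python. It assumes a pack containing only undeltified objects; build_pack and read_pack are hypothetical helper names, and a real reader would also have to handle the two delta entry types and the separate .idx file.

```python
import hashlib
import struct
import zlib

OBJ_TYPES = {1: "commit", 2: "tree", 3: "blob", 4: "tag"}


def encode_entry_header(type_code: int, size: int) -> bytes:
    """Encode the variable-length type/size header preceding each packed object."""
    first = (type_code << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(first | 0x80)  # continuation bit: more size bytes follow
        first = size & 0x7F
        size >>= 7
    out.append(first)
    return bytes(out)


def build_pack(blobs: list) -> bytes:
    """Build a tiny version-2 packfile holding only undeltified blobs."""
    body = b"PACK" + struct.pack(">II", 2, len(blobs))
    for data in blobs:
        body += encode_entry_header(3, len(data)) + zlib.compress(data)
    return body + hashlib.sha1(body).digest()  # trailer: SHA-1 of all preceding bytes


def read_pack(pack: bytes) -> list:
    """Parse a pack of undeltified objects into (type_name, payload) pairs."""
    if pack[:4] != b"PACK":
        raise ValueError("not a packfile")
    version, count = struct.unpack(">II", pack[4:12])
    if version not in (2, 3):
        raise ValueError(f"unsupported pack version {version}")
    if hashlib.sha1(pack[:-20]).digest() != pack[-20:]:
        raise ValueError("checksum mismatch")
    pos, objects = 12, []
    for _ in range(count):
        byte = pack[pos]
        pos += 1
        type_code, size, shift = (byte >> 4) & 0x07, byte & 0x0F, 4
        while byte & 0x80:
            byte = pack[pos]
            pos += 1
            size |= (byte & 0x7F) << shift
            shift += 7
        if type_code not in OBJ_TYPES:
            raise NotImplementedError("delta entries are not handled in this sketch")
        inflater = zlib.decompressobj()
        payload = inflater.decompress(pack[pos:-20])
        # unused_data holds whatever follows this object's zlib stream.
        pos = len(pack) - 20 - len(inflater.unused_data)
        if len(payload) != size:
            raise ValueError("size mismatch")
        objects.append((OBJ_TYPES[type_code], payload))
    return objects


pack = build_pack([b"hello\n", b"x" * 300])
print(read_pack(pack))
```

Since each entry is just a header followed by a zlib stream, a converter can in principle stream through a received pack and feed objects to Mercurial without first storing them in a Git repository.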
2. Mercurial Objects and Storage ¶
Mercurial Objects ¶
- File Revision: Represents file contents at a specific revision (stored in a revlog).
- Manifest: Represents the state of the entire repository (all files and directories) at a revision.
- Changeset: Represents a commit, pointing to a manifest, parent changesets, author info, and a message.
Revlogs ¶
- What they are: Files storing all revisions of a file, manifest, or changelog.
- Purpose: Efficient storage and access to historical data.
- Location: Stored in .hg/store/ as .i (index) and .d (data) files.
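For comparison with Git, the following sketch shows how, to my understanding, a SHA-1 revlog node ID is derived: the hash covers the two parent IDs in sorted order followed by the revision text. The function names are mine and the sketch ignores details such as copy metadata stored in file revisions, so treat it as an approximation rather than Mercurial’s actual code.

```python
import hashlib

NULL_ID = b"\x00" * 20  # parent ID used when a parent is absent


def hg_node_id(text: bytes, p1: bytes = NULL_ID, p2: bytes = NULL_ID) -> bytes:
    """Node ID of a revlog revision: SHA-1 over sorted parent IDs, then the text."""
    first, second = sorted((p1, p2))
    return hashlib.sha1(first + second + text).digest()


def git_blob_id(data: bytes) -> bytes:
    """Git's ID for the same content, shown only for comparison."""
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).digest()


root = hg_node_id(b"hello\n")
child = hg_node_id(b"hello world\n", p1=root)
print(root.hex(), child.hex(), git_blob_id(b"hello\n").hex())
```

The node ID depends on the history (the parents), not only on the content, so identical file contents can have several Mercurial IDs but only one Git blob ID. This is one reason the conversion cannot be a pure function of the object bytes.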
3. Branches in Git and Mercurial ¶
Git Branches ¶
- Representation: Lightweight pointers to commits, stored in .git/refs/heads/ or .git/packed-refs.
- Default branch: Typically main or master, specified in .git/HEAD.
- Feature branches: Short-lived branches for features or fixes.
Mercurial Branches ¶
- Named Branches: Permanent branches, part of changeset metadata (e.g., main).
- Topics: Lightweight, short-lived branches (similar to Git feature branches).
- Bookmarks: Movable pointers to changesets (similar to Git branches, but not part of metadata).
4. Mapping Git to Mercurial ¶
| Git Concept | Mercurial Equivalent | Notes |
|---|---|---|
| Blob | File Revision | Both represent file contents. Mercurial stores all revisions in a revlog. |
| Tree | Manifest | Git trees represent directories; Mercurial manifests represent the entire repository. |
| Commit | Changeset | Both represent commits. Mercurial changesets point to manifests. |
| Tag | Bookmark/Tag | Mercurial uses bookmarks (mutable) and tags (immutable) for named references. |
| Default branch | Named Branch | Use named branches for long-lived branches such as the default branch. |
| Feature branches | Topics | Use topics for short-lived feature branches. |
| Remote branches | Bookmarks | Use bookmarks to track remote branches. |
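The branch rows of this mapping could be expressed as a small policy function. The sketch below is only one possible policy: HgTarget and map_git_ref are hypothetical names, and the default branch name is passed in as a parameter because it differs between repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HgTarget:
    kind: str  # "named-branch", "topic", "bookmark", or "tag"
    name: str


def map_git_ref(ref: str, default_branch: str = "main") -> HgTarget:
    """Map a Git ref name to a Mercurial target following the table above."""
    if ref.startswith("refs/heads/"):
        branch = ref[len("refs/heads/"):]
        if branch == default_branch:
            return HgTarget("named-branch", branch)  # long-lived branch
        return HgTarget("topic", branch)  # short-lived feature branch
    if ref.startswith("refs/remotes/"):
        return HgTarget("bookmark", ref[len("refs/remotes/"):])
    if ref.startswith("refs/tags/"):
        return HgTarget("tag", ref[len("refs/tags/"):])
    raise ValueError(f"unsupported ref: {ref}")


for ref in ("refs/heads/main", "refs/heads/fix-typo", "refs/remotes/origin/main"):
    print(ref, "->", map_git_ref(ref))
```

Keeping the policy in one function makes it easy to experiment, for instance by treating every long-lived release branch as a named branch.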
5. Packfiles and Refs in Practice ¶
Packfiles ¶
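- Saved to disk: Packs received during clone and pull are stored in .git/objects/pack/ unless they are small enough to be unpacked into loose objects; during push, the pack is generated and streamed to the remote.
- Not unpacked: Git reads objects directly from stored packfiles using the index.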
6. Conversion Process ¶
Steps ¶
- Extract Git objects: Read packfiles to extract blobs, trees, commits, and tags.
- Map objects: Convert Git objects to Mercurial objects (blobs → file revisions, trees → manifests, commits → changesets).
- Map branches: Convert Git branches to Mercurial named branches/topics/bookmarks.
- Update Mercurial repo: Write converted objects to Mercurial’s revlogs and update refs.
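The "trees → manifests" part of the object mapping can be sketched as a recursive flattening: Git stores one tree per directory, while a Mercurial manifest lists every file path of the repository. The in-memory trees dictionary below is a hypothetical stand-in for parsed tree objects, and the IDs are placeholders.

```python
# Hypothetical in-memory store: tree ID -> list of (mode, name, object ID).
def flatten_tree(trees: dict, tree_id: str, prefix: str = "") -> dict:
    """Flatten nested Git trees into a manifest-like {path: (blob ID, flags)} dict."""
    manifest = {}
    for mode, name, obj_id in trees[tree_id]:
        path = prefix + name
        if mode == "40000":  # subdirectory: recurse into the subtree
            manifest.update(flatten_tree(trees, obj_id, path + "/"))
        elif mode == "160000":  # submodule link: no direct manifest equivalent
            continue
        else:
            # Mercurial manifest flags: "x" for executable, "l" for symlink.
            flags = {"100755": "x", "120000": "l"}.get(mode, "")
            manifest[path] = (obj_id, flags)
    return manifest


trees = {
    "root": [("100644", "README.md", "b1"), ("40000", "src", "sub")],
    "sub": [("100755", "run.sh", "b2"), ("120000", "latest", "b3")],
}
print(flatten_tree(trees, "root"))
```

A real converter would additionally translate each blob ID into the corresponding Mercurial file node and would avoid re-flattening subtrees that did not change between commits.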
Challenges ¶
- Delta reconstruction: Git packfiles use delta compression; reconstruct full objects before conversion.
- Mapping maintenance: Maintain a bidirectional mapping between Git SHA-1s and Mercurial node IDs.
- Branch handling: Map Git branches to Mercurial branches/topics/bookmarks.
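Delta reconstruction is the most mechanical of these challenges. A Git delta starts with the base and target sizes, followed by instructions that either copy a range from the base or insert literal bytes. The sketch below applies such a delta in pure Python; it is my reading of the pack format and the function names are mine, so it should be checked against the Git documentation before being relied upon.

```python
def read_varint(buf: bytes, pos: int) -> tuple:
    """Read a little-endian base-128 integer; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild a full object from its base object and a Git delta."""
    base_size, pos = read_varint(delta, 0)
    target_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta does not match base")
    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:  # copy a range from the base object
            offset = size = 0
            for i in range(4):  # bits 0-3 flag which offset bytes are present
                if cmd & (1 << i):
                    offset |= delta[pos] << (8 * i)
                    pos += 1
            for i in range(3):  # bits 4-6 flag which size bytes are present
                if cmd & (0x10 << i):
                    size |= delta[pos] << (8 * i)
                    pos += 1
            size = size or 0x10000  # a size of zero encodes 65536
            out += base[offset:offset + size]
        elif cmd:  # insert the next `cmd` literal bytes
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("reserved delta opcode 0")
    if len(out) != target_size:
        raise ValueError("unexpected target size")
    return bytes(out)


base = b"hello world\n"
# copy 6 bytes at offset 0, insert b"brave ", copy 6 bytes at offset 6
delta = bytes([12, 18, 0x90, 6, 6]) + b"brave " + bytes([0x91, 6, 6])
print(apply_delta(base, delta))
```

Revlogs also store deltas, but in a different format, so a first prototype would probably rebuild full texts as above and let Mercurial compute its own deltas.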
7. Example Workflow ¶
Cloning a Git Repository ¶
- Clone: Fetch Git packfiles and refs.
- Convert objects: Extract objects from packfiles and convert to Mercurial format.
- Create branches: Map Git branches to Mercurial named branches/topics.
- Update repo: Write objects to Mercurial revlogs and update refs.
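One detail hidden in the "convert objects" step is ordering: a changeset can only be written once the Mercurial IDs of its parents are known, so commits have to be converted parents first. A minimal sketch of such an ordering, assuming a hypothetical parents dictionary extracted from the commit objects:

```python
def parents_first(parents: dict, heads: list) -> list:
    """Order commits reachable from `heads` so each one follows all its parents."""
    ordered, done = [], set()
    stack = [(head, False) for head in reversed(heads)]
    while stack:  # iterative depth-first walk, safe for very long histories
        commit, expanded = stack.pop()
        if commit in done:
            continue
        if expanded:  # all parents were emitted, so emit the commit itself
            done.add(commit)
            ordered.append(commit)
        else:
            stack.append((commit, True))
            stack.extend(
                (p, False) for p in reversed(parents.get(commit, [])) if p not in done
            )
    return ordered


# a <- b, a <- c, and d merges b and c
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(parents_first(history, ["d"]))
```

The walk is iterative rather than recursive because repositories like CPython have histories far deeper than Python’s default recursion limit.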
Pulling Updates ¶
- Negotiate: Compare local and remote refs to determine missing objects.
- Fetch packfile: Download new packfile from Git server.
- Convert objects: Extract and convert new objects.
- Update branches: Update Mercurial branches/topics to match Git refs.
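The negotiation step can be approximated as a comparison of ref dictionaries. In the sketch below, plan_pull is a hypothetical helper: it returns the commits to request ("wants"), the commits the client can advertise as already present ("haves"), and the refs that moved. The short IDs are placeholders for full SHA-1s, and the real protocol exchange is more involved than this.

```python
def plan_pull(remote_refs: dict, local_refs: dict, known_git_ids: set) -> tuple:
    """Return (wants, haves, changed_refs) for a pull from a Git remote."""
    # Refs that are new or point somewhere else than our last known state.
    changed = {ref: sha for ref, sha in remote_refs.items() if local_refs.get(ref) != sha}
    # Only ask for tips that were never converted before.
    wants = sorted({sha for sha in changed.values() if sha not in known_git_ids})
    # Advertise converted tips so the server can send a minimal pack.
    haves = sorted({sha for sha in local_refs.values() if sha in known_git_ids})
    return wants, haves, changed


remote = {"refs/heads/main": "c3", "refs/heads/feature": "b2", "refs/tags/v1": "a1"}
local = {"refs/heads/main": "a1", "refs/tags/v1": "a1"}
print(plan_pull(remote, local, known_git_ids={"a1", "b2"}))
```

Note that deciding what is already known requires the Git IDs of previously converted changesets, which is where the persistent mapping comes in.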
8. Tools and Libraries ¶
| Tool/Library | Purpose |
|---|---|
| Dulwich | Full Git implementation in Python. Useful for handling Git objects and packfiles. |
| GitPython | Python library for interacting with Git repos (requires Git CLI). |
| gitdb | Pure-Python library for reading Git objects and packfiles. |
| Mercurial API | Use Mercurial’s internal APIs to write revlogs, manifests, and changesets. |
9. Key Takeaways ¶
- Packfiles are central: Git uses packfiles for efficient storage and transfer of objects.
- Refs are separate: Branch and tag names are stored separately from packfiles.
- Mapping is critical: Maintain a mapping between Git SHA-1s and Mercurial node IDs for incremental updates.
- Topics for features: Use Mercurial topics for Git feature branches and named branches for the default branch.
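Since the mapping between Git SHA-1s and Mercurial node IDs must survive between pulls and pushes, a prototype needs some persistent store for it. One simple option, sketched here with the standard sqlite3 module, is a two-column table with a uniqueness constraint on each side. NodeMap is a hypothetical helper, not an existing Mercurial or hg-git API, and a real implementation might prefer a format closer to Mercurial’s own storage.

```python
import sqlite3
from typing import Optional


class NodeMap:
    """Bidirectional Git SHA-1 <-> Mercurial node ID map (hypothetical helper)."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS nodemap ("
            "git_id TEXT PRIMARY KEY, hg_id TEXT NOT NULL UNIQUE)"
        )

    def add(self, git_id: str, hg_id: str) -> None:
        with self.db:  # commits on success, rolls back on error
            self.db.execute("INSERT INTO nodemap VALUES (?, ?)", (git_id, hg_id))

    def to_hg(self, git_id: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT hg_id FROM nodemap WHERE git_id = ?", (git_id,)
        ).fetchone()
        return row[0] if row else None

    def to_git(self, hg_id: str) -> Optional[str]:
        row = self.db.execute(
            "SELECT git_id FROM nodemap WHERE hg_id = ?", (hg_id,)
        ).fetchone()
        return row[0] if row else None


nodemap = NodeMap()
nodemap.add("a" * 40, "b" * 40)
print(nodemap.to_hg("a" * 40), nodemap.to_git("b" * 40))
```

The uniqueness constraints make the map strictly one-to-one, which is exactly the property that unstable conversions would violate.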