Bazaar, Git or Mercurial? Some thoughts

It has been a few years since I learned about DVCS (Distributed Version Control Systems), and there are always some battles between the three main contenders. The centralized VCS war was won by Subversion, but the DVCS war is far from over. I have had the chance to use all three tools for work, free-time and open source projects. I do not claim that my time using them is enough to reach a solid conclusion, but for me, there is a clear winner.

Bazaar

Bazaar was the first DVCS I tried. It has Windows integration, as well as several standalone GUIs. Integration in Eclipse was rudimentary at best the last time I checked.

With Bazaar, each repository can only have one branch. So when you are working on different things, you must have as many local checkouts as you have features. That is not very efficient when you have several tickets assigned to you on different versions of a product.

Nevertheless, it was a good start for understanding a DVCS. I suppose it will never get the momentum of Git or Mercurial because of some limitations (branches, but also the various changes to the repository model).

Git

Git is the latest DVCS I’ve tried, for different scikits (learn and optimization). The default GUI available on Linux is based on Tk, but it’s not very usable (I didn’t find a way of saying that I want to include some changes and not others). I never tried Git for Eclipse, but I suppose it may be better.

When you pull changes from a remote repository, you can’t get everything, especially if your local repository is not a fork of this particular repository. You have to check out the branch you want to pull and then pull it. Then, it will not be directly visible, and if you have local changes, you have to merge the remote branch inside your repository. Indeed, Git does not allow several branches inside one branch.

Git allows you to “cherry-pick” changesets to apply them somewhere else. It’s a nice feature, but I prefer bundles. I’ve used them with Mercurial, and they are especially useful when someone can’t access the main repository. With contractors, this happens all the time: you send the repository to your contractor, they make some changes and bundle them in one small file that they send you. You can then pull it inside your repository.
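
To make the bundle workflow concrete, here is a minimal Python sketch. It is a toy model and not how Mercurial stores or transfers changesets: a repository is just a dict mapping each changeset to its parents, and the names (make_bundle, apply_bundle, the changeset ids) are made up for illustration. The idea is simply that a bundle holds the changesets the receiver does not have yet.

```python
def ancestors(graph, heads):
    """All changesets reachable from the given heads, heads included."""
    seen, stack = set(), list(heads)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def make_bundle(graph, base_heads):
    """Changesets the receiver lacks, given the heads the receiver already has."""
    known = ancestors(graph, base_heads)
    return {node: graph[node] for node in graph if node not in known}


def apply_bundle(graph, bundle):
    """Pull a bundle into a repository; refuse it if a parent is missing."""
    merged = dict(graph)
    merged.update(bundle)
    for node, parents in bundle.items():
        for parent in parents:
            if parent not in merged:
                raise ValueError(f"bundle needs unknown parent {parent}")
    return merged


# Hypothetical histories: changeset -> list of parents.
mine = {"a": [], "b": ["a"]}
contractor = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}

# The contractor bundles everything I do not have (I stopped at "b").
bundle = make_bundle(contractor, base_heads=["b"])
updated = apply_bundle(mine, bundle)

print(sorted(bundle))
print(sorted(updated))
```

In this sketch the bundle only carries c and d, which is why the file stays small compared to the whole repository.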

Git can also do N-way merges. You take several branches or repositories and you can merge everything in one click. I don’t find this very smart, because you won’t be able to find which merge was problematic if everything falls apart. I’ve already seen havoc happen in several 2-way merges, and I was glad to be able to find the guilty changeset/merge.

Mercurial

Mercurial is the second DVCS I tried, this time only for my work. I’ve never used Bitbucket, but it seems very neat. There are also several GUIs, either for just exploring the repository, like hgview (very cool BTW), or for integration inside Eclipse (MercurialEclipse does a really good everyday job for me).

I tried Mercurial because it allows you to change the current branch you’re working with, and because you can have several heads for a branch (they are called unnamed branches, I think). Although this adds a level of complexity, I don’t always want to create a new branch for a small task, but I still want to keep things in separate branches. For instance, I’ve created a branch for my contractors and added some modifications to their code; I won’t create a new branch for that.
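
As a rough illustration of several heads inside one branch, the following Python sketch uses a toy changeset graph. Everything in it is hypothetical (the changeset ids, the gpu-fix branch name, the branch_heads helper), and it simplifies the notion of a branch head to “a changeset with no child on the same named branch”.

```python
from collections import defaultdict

# Toy changeset graph: changeset -> (named branch, list of parents).
# Ids and branch names are made up for illustration only.
changesets = {
    "c0": ("default", []),
    "c1": ("gpu-fix", ["c0"]),
    "c2": ("gpu-fix", ["c1"]),  # my own modification
    "c3": ("gpu-fix", ["c1"]),  # contractor's change on the same named branch
    "c4": ("gpu-fix", ["c3"]),  # contractor's follow-up
}


def branch_heads(changesets):
    """Per named branch, list the changesets that have no child on that branch."""
    has_child = set()
    for branch, parents in changesets.values():
        for parent in parents:
            if changesets[parent][0] == branch:
                has_child.add(parent)
    heads = defaultdict(list)
    for name, (branch, _) in changesets.items():
        if name not in has_child:
            heads[branch].append(name)
    return {branch: sorted(names) for branch, names in heads.items()}


heads = branch_heads(changesets)
print(heads["gpu-fix"])  # two heads: mine (c2) and the contractor's (c4)
```

Both c2 and c4 live in the same named branch without being merged, so either one can be checked out until I decide to merge them.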

Mercurial can also create bundles. The only drawback is that you cannot easily see what is inside one, apart from running hg incoming on it, and that only displays what is available in the bundle, which is not nearly enough (I don’t know if Git is capable of doing this). It seems to be possible to cherry-pick changesets, but I never had to do this.

Conclusion

A few days ago, I had real trouble with Git, trouble I would not have had with Mercurial. I tried to pull changes, but I could not get them inside my master. No error messages, nothing. With Mercurial, I would have got 2 heads, and I would have merged them afterwards (there is a fetch command to pull and merge, but you seriously don’t want to do that with code you don’t know).
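
The difference between pulling and merging can be sketched with a small Python model. This is only a toy under loud assumptions (a repository is a dict from changeset to parents, and heads, pull and merge are hypothetical helpers, not Mercurial or Git APIs): pulling only copies the missing changesets and leaves two heads, and the merge is a separate, explicit step.

```python
def heads(graph):
    """Changesets that are nobody's parent."""
    parents = {p for ps in graph.values() for p in ps}
    return sorted(node for node in graph if node not in parents)


def pull(local, remote):
    """Copy the changesets missing locally; nothing is merged."""
    result = dict(local)
    for node, node_parents in remote.items():
        result.setdefault(node, node_parents)
    return result


def merge(graph, left, right, name):
    """Record a merge changeset that has the two given heads as parents."""
    result = dict(graph)
    result[name] = [left, right]
    return result


# Hypothetical histories sharing one common ancestor.
local = {"base": [], "mine": ["base"]}
remote = {"base": [], "theirs": ["base"]}

after_pull = pull(local, remote)
after_merge = merge(after_pull, "mine", "theirs", "merge")

print(heads(after_pull))   # two heads, to be reviewed before merging
print(heads(after_merge))  # a single head once the merge is recorded
```

Keeping the two steps apart is what gives you time to look at the incoming head before committing to a merge.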

IMHO, Git is very powerful, but also too powerful. It allows you to do anything, and you have to keep in mind that you have all this power and that you may bring doom to your project. Mercurial is far simpler, and a command does what it is expected to do (hg pull pulls all changesets from a remote repository inside yours, git doesn’t do that). You may see some pages on the Internet about features Mercurial lacks, but with a standard DVCS workflow, nothing is missing.

Will Git win the DVCS war? I don’t think so. In fact, there may never be a clear winner. Git will be the favorite tool for hackers, but developers like me will favor Mercurial. Perhaps the main point is rather that GitHub has some nice features that Bitbucket may (or may not) lack? Yesterday, I talked with a colleague of mine about SVN, Git and Mercurial. We agreed on Mercurial as an efficient and easy-to-use VCS.

Some additional links I do not always agree with (some rants are just a workflow issue they have on their own project):

18 thoughts on “Bazaar, Git or Mercurial? Some thoughts”

  1. I don’t know Mercurial, but if I understand it well, “hg pull” downloads the changes (leading to possibly several heads), which you would “hg merge” afterwards, is that it? And “hg fetch” == “hg pull ; hg merge”…
    Well, then I understand your git problems better, because with git, it’s exactly the opposite! “git fetch” downloads the changes, leading to several branches (what you call ‘heads’, I guess… it seems that ‘head’ in git also has a different meaning than in hg): for example, your master branch, and a remote branch origin/master. And “git pull” == “git fetch ; git merge”!
    Actually, I never use git pull, and find it a very misleading command, because (1) as you said, you don’t seriously want to merge something that you just fetched (or pulled :p), and (2) you may want to use “git rebase” instead of “git merge” to avoid messy history (you could use “git pull --rebase”, but (1) is a good reason in itself). So my personal recommendation: never ever use “git pull”, unless you’re aware that it’s a shortcut for “git fetch ; git merge”.
    One more comment: when you “git fetch”, you *do get everything* from the remote repository. All remote branches are updated (but of course, not merged, so that’s maybe why you didn’t see them). Just run “git branch -a” to list all branches including remote ones, and “gitk --all” to see remote branches in gitk.

    Btw, personal message: it’s been ages! Lycée Kléber, 10 years ago!

    1. François > Thanks for the needed clarifications. I see that although we do not use the same DVCS, we do agree on being cautious 😉

      And nice to have a message from you 😉 Indeed, 10 years!

    1. Timmie > It’s not an issue 😉 It’s a fact. Bazaar at its core wasn’t meant to have several branches sharing a repository. OK, it will be fixed, but it may also imply a new repository upgrade, something Mercurial and Git do not need. I still see Bazaar as a DVCS that is trying to catch up, and it may already be too far behind.

  2. I am not sure what your concerns with GIT are, especially those described in the 2nd paragraph of the GIT section…

    GIT fetch fetches EVERYTHING from the remote repository which is connected to any remote branch (i.e. if there is just a tag sitting not within any branch, you would need to add -t to force fetching those orphans; but that situation is not common, nor encouraged to appear). For “directly visible”: “git branch -a” would show all those remote branches, and “gitk --all” would allow you to traverse/explore/filter/… all local and remote branches, even if they had no common ancestor (which is what you meant with “is not a fork of this particular repository”).

    gitk GUI is indeed somewhat ugly, but heck very functional and allows for filtering/search/whatever — just look into View -> Edit View for how you could filter out your “View” to meet your “include some changes and not others” requirement.

    “Git does not allow several branches inside one branch.” requirement is not clear to me, thus I cannot comment on it… maybe you wanted multiple branches to appear from a common point in some branch, in which case you are welcome to do that.

    “if you have local changes, you have to merge the remote branch inside your repository”: I guess if you want to get those changes, you would need to “merge” them. Actually you could “rebase” them as well (or conveniently just “git pull --rebase”) before making them publicly available on some remote if you want a clean linear history.

  3. Hi Yaroslav,
    As François pointed out, I confused pull and fetch. It seems that Git can retrieve all branches; I had missed this command.

    As for the branches inside a branch, I want unnamed branches inside a branch. This way, I can avoid rebasing everything each time a contractor gives me something. For instance, we are fixing some bugs in some specific GPU kernels at the moment. They prefer using their latest changeset instead of mine to minimize the probability of me adding a bug. I want to include their changesets inside our GPU fix branch without having to rebase it or merge it. I want them to be in my repository in the GPU fix branch, and be able to check out either their head or my head before actually merging them.
    Linear history is not of interest to me in this case, I’d rather avoid it on such complicated code.

  4. Hi Matt,

    sorry — just now ran across to follow up.

    Could you please point me to a description somewhere of what you described as “unnamed branches”? I am still not sure what is going on, i.e. how could anyone “include their changesets […] without having to rebase it or merge it”? Is the only other possibility to cherry-pick (GIT term), HG’s bundle, or something else?

    P.S. I am only slowly learning HG due to limited exposure/necessity, don’t hit me hard 😉

    Altogether, from what I have seen so far, and from the podcast from HG benevolent dictator Matt Mackall, I can only agree with him — both HG and GIT provide nearly identical sets of functionality, with just slightly different trade-offs/accents; and the choice of HG vs GIT simply depends on the “ecosystem” of your development — if most people use X, stick to X — and taste. If one of them came out to become somewhat used a few months earlier than another, we would not have both now. As a summary, both are great and the only useful distill from the comparisons of the two is the exposure of the functionality to the users using the other one 😉

    Actually today I have run into a feature which is not readily available among GIT utensils 😉 It seems possible to accomplish but is not just a one-liner, and I have not looked into grafts yet (with which it might become a two-liner)…

    In HG, could you rebase an entire history with all derived branches and tags from one parent commit to another?

    1. Hi Yaroslav,

      I agree with you, the two tools are very close in terms of capabilities.

      Unnamed branches are the fact that there can be several branches inside one branch. Instead of being forced to merge/rebase/…, you can live with several heads inside the repository. With Git, you have a name for each branch (origin/something, otherplace/something for the “same” something branch). It’s different than rebase/bundle/cherrypick (all can be also done by hg and git).

      For your last question, I have recently done this quite well, but without the tags (didn’t have at the time). I had a complex history that had to be split in two repositories, and in one of them, a lot of the code was obsolete. I rebased the history from a moment where I removed some base folders, and the whole history was rebased. Well I don’t know if it was “rebased”, I think I used MQ or at least part of MQ to do this. In the end, all commits were applied after the new commit, and I could strip the old history.

      What is the feature you ran into? Usually git seems to have more features than hg 😉

  5. “What is the feature you ran into?” — well it was that missing ‘rebase everything derived’ feature 😉

    MQ [note for myself, it is MercurialQueues] — nice and I guess somewhat missing feature from GIT. It feels native to quilt users and avoids forgetting ‘quilt add’ 😉 although from what I see, publishing MQs seems to be quite an ad-hoc…. For GIT I only know few: gbp-pq (from git-buildpackage) which simply converts quilt patch series to/from a patch queue, so the patch queue is kept in the same repository (usually used for debian/patches). Its simplicity and minimalistic interface is kinda appealing. Also there is topgit (http://repo.or.cz/w/topgit.git) which is larger scale patch queues implementation on GIT.

    For your recent endeavor of massive rebasing: did you have multiple derived branches? Was MQ able to branch out nicely?

    As for unnamed branches — I am still lost 😉 sorry
    “several branches inside one branch” does not shine the light upon me, and as for “several heads inside the repository”: do you mean several local branches (i.e. you do not have to have only origin/something, otherplace/something, you could have simply something and something2)? Maybe there is some example or a part of the manual which could help me comprehend?

  6. topgit topgit… and forgotten about stgit — now you can see that I haven’t used any of those two (using pure quilt primarily for that purpose atm since Debian source package v.3 has a dedicated quilt flavor)

  7. Thanks for the link! Beginning of that page sounds indeed interesting/readable but then unfortunately I start asking ‘WTF?’ 😉 could be just an evening effect.

    In any case, now I think I know what was meant by ‘unnamed’ branch, although I see not much use for them, besides, as hinted by that article, for the short term: if I am doing some rebasing (or resetting to roll back) in GIT, whenever previous “HEADS” become “unnamed” after rebase, I just fire up “gitk --all” before starting to do something evil. Upon any rebase/reset command, HEAD (and the pointer for the current branch) might move away into a new tree or just to some previous point in history. BUT in the gitk window, when you refresh, previous HEADs remain visible, just without any branch associated with them. So, I keep the window open and then consult those unnamed HEADS for differences from the current new HEADs or just to reset if I screwed up.

    WTF#1: back to that article, “Remote-tracking using separate repositories”: yikes, why would I want to create a clone in GIT to track some remotes, especially if I am going to have no checkout (suggested clone -n)? To me it is a plain waste of space and pollution of the file system 😉 Why not just add a remote to my current GIT? IFF I want not only to track, but also to work in some other state (either tracking a remote, or just a local “stable” point with a present active ‘build’), then yes, I would create a local clone, and “git clone -l” is my friend to share objects between the two, thus not wasting too much space. Or have I missed the point?

    WTF#2: “Permanent branches”: I simply could not understand what this is all about… The only possible explanation, hinted at by “storing a name in every changeset”, is the fact that sometimes it is difficult to tell in retrospect (in GIT at least) in which branch any given changeset was created: was it ‘master’ or some development feature branch? If that changeset was merged into some other branches, it becomes listed as belonging to those, and GIT itself has no clue where it was committed originally. Once I hacked up a tool to provide a basic heuristic to deduce to which branch any given part of the tree belonged. Often it is possible thanks to descriptive automatic merge messages, such as “Merge branch ‘maint/0.4’ into yoh/master”, which allow you to mark the left and right sides of the tree correspondingly 😉

  8. I’ve already used the unnamed branches feature. It’s not just about two heads, it’s about two heads having several changesets behind each of them.

    WTF#1: I guess it’s because Mercurial retrieves everything in the remote repository, and not the branch you want. You may pollute a production repository with inappropriate content. This is an issue and we have to use additional steps to circumvent the issue (although I would never import anything inside my repository if it is not tested on a fresh clone)

    WTF#2: I think this is exactly what you think it is 😉

  9. “two heads having several changesets behind each of them”, sounds to me like it is worth naming those to avoid confusion then, thus arriving to the regular branches?

    somewhat similar pollution in GIT occurs only with TAGs — if you fetch a remote to track, you would automatically fetch all the tags which are pointing to commits children to any of the branches (for the dangling tags, you would need explicit ‘-t’ for fetch).

    WTF#1: so it is a HG feature and was not worth equating it to anything in GIT. WTF resolved 😉
    WTF#2: thanks 😉

  10. I agree with the end choice of this article (Mercurial), but overall this article is full of grammar and spelling issues, making it slightly difficult to follow, not to mention that it seems to become more vague as it proceeds.

    One suggestion that would improve the readability of this article would be to say that you had multiple branches (or heads) downloaded into your *repository* (or clone, or checkout, *not* “branch”, as saying you have multiple branches within your branch becomes a bit confusing).

    1. Hi Jon,

      Thanks for the feedback, and sorry for the English issues. Not always easy when English is not your mother tongue, and I hope my posts will get better over time!
