Keep your git history clean using rebase

Thanks to a new colleague of mine, I have learned how to make my git history cleaner and more understandable.

The principle is simple: rebase your branch before you merge it. But this technique also has weaknesses. In this article, I will explain what a rebase and a merge really do and what the implications of this technique are.

Basically, here is an example of my git history before and after I used this technique.

Git history

Stay focused, rebase and merge are no joke! :)

What is the goal of a rebase or a merge?

Rebase and merge both aim at integrating changes that happened on another branch into your branch.

What happens during a merge?

First of all, there are two types of merge:

  • Fast-forward merge
  • 3-way merge

Fast-forward merge

A fast-forward merge happens when the most recent shared commit between the two branches is also the tip of the branch into which you are merging.

The following drawing shows what happens during a fast-forward merge and how it is shown in a graphical git client.

A: the branch into which you are merging

B: the branch from which you get the modifications

  • git checkout A
  • git merge B

Merge fast-forward (1)

As you can see, git simply moves branch A forward to include the new commits. After a fast-forward merge, branches A and B point to exactly the same commit.

Notes:

  • With git checkout A followed by git rebase B, you would have had the exact same result!
  • git checkout B, git merge A would have left the branches in the “before” situation, since branch A has no new commits for branch B.
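The fast-forward case can be reproduced in a throwaway repository. The following is a minimal sketch, assuming git is installed; the temporary directory, the "Demo" identity and the commit messages are placeholders chosen for illustration.

```shell
set -eu

# Throwaway repository; identity and messages are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# One shared commit, then name the current branch A.
git commit -q --allow-empty -m "shared commit"
git branch -M A

# B starts from A and gains two commits; A does not move.
git checkout -q -b B
git commit -q --allow-empty -m "B1"
git commit -q --allow-empty -m "B2"

# The tip of A is the last shared commit, so this merge is a fast-forward.
git checkout -q A
git merge -q B

a_tip=$(git rev-parse A)
b_tip=$(git rev-parse B)
merge_commits=$(git rev-list --merges --count A)
echo "A=$a_tip B=$b_tip merge commits=$merge_commits"
```

Both branch names resolve to the same commit afterwards, and no merge commit appears in the history of A.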

3-way merge

A 3-way merge happens when both branches have had new commits since the last shared commit.

The following drawing shows what happens during a 3-way merge and how it is shown in a graphical git client.

A: the branch into which you are merging

B: the branch from which you get the modifications

  • git checkout A
  • git merge B

3-way Merge (1)

During a 3-way merge, git creates a new commit called a “merge commit” (in orange) that contains:

  • All the modifications brought by the three commits from B (in purple)
  • The possible conflict resolutions

Git will keep all information about the commits of the merged branch B even if you delete it. In a graphical git client, a small loop also remains to represent the merge.

The default behaviour of git is to try a fast-forward merge first. If it’s not possible, that is to say if both branches have had changes since the last shared commit, it will be a 3-way merge.
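A small experiment can make the merge commit visible. This is a sketch under the same assumptions as before (git installed, throwaway repository, placeholder identity and file names): both branches receive a commit, so git cannot fast-forward and records a commit with two parents.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "shared commit"
git branch -M A

# B gets one commit of its own...
git checkout -q -b B
echo "from B" > b.txt
git add b.txt
git commit -q -m "B1"
b_tip=$(git rev-parse B)

# ...and so does A, so the branches have diverged.
git checkout -q A
echo "from A" > a.txt
git add a.txt
git commit -q -m "A1"

# Fast-forward is impossible here, so git creates a merge commit.
git merge -q --no-edit B

# A merge commit lists two "parent" lines in its commit object.
parent_count=$(git cat-file -p A | grep -c '^parent ')

# Deleting B does not lose its commits: they stay reachable from A.
git branch -q -d B
if git merge-base --is-ancestor "$b_tip" A; then kept=yes; else kept=no; fi
echo "parents=$parent_count B commits kept=$kept"
```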

What happens during a rebase?

A rebase differs from a merge in the way it integrates the modifications.

The following drawings show what happens during a rebase and how it is shown in a graphical git software.

A: the branch that you are rebasing

B: the branch from which you get the new commits

  • git checkout A
  • git rebase B

Rebase (1)

Rebase (graphical git software) (1)

When you rebase A on B, git creates a temporary branch that is a copy of branch B, and tries to apply the new commits of A on it one by one.

For each commit to apply, if there are conflicts, they will be resolved inside of the commit.

After a rebase, the new commits from A (in blue) are not exactly the same as they were:

  • If there were conflicts, the conflict resolutions are integrated into each commit
  • They have a new hash

But they keep their original date, which might be confusing: in the final branch, the commits in blue appear after the last two commits in purple even though they were created before them.
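The two properties described above (new hash, original date) can be checked with a short script. This is a sketch, assuming git is installed; the fixed author date and the file names are arbitrary values picked for the demonstration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "shared commit"
git branch -M B

# A gets one commit with a fixed author date (2018-01-01 12:00 UTC).
git checkout -q -b A
echo "a" > a.txt
git add a.txt
GIT_AUTHOR_DATE="1514808000 +0000" git commit -q -m "A1"
old_hash=$(git rev-parse A)
old_date=$(git log -1 --format=%at A)

# B moves on in the meantime.
git checkout -q B
echo "b" > b.txt
git add b.txt
git commit -q -m "B1"

# Replay the commit of A on top of B.
git checkout -q A
git rebase -q B

new_hash=$(git rev-parse A)
new_date=$(git log -1 --format=%at A)
echo "hash: $old_hash -> $new_hash"
echo "author date: $old_date -> $new_date"
```

The hash changes because the rebased commit has a different parent, while the author timestamp printed by %at stays the same.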

What is the best solution to integrate a new feature into a shared branch and keep your git tree clean?

Let's say that you have a new feature made of three new commits on a branch named `feature`. You want to merge this branch into a shared branch, for example `master`, that has received two new commits since you started from it.

You have two main solutions:

First solution:

  • git checkout feature
  • git rebase master
  • git checkout master
  • git merge feature

Rebase and merge (master, feature) (1)

Note: Be careful, git merge feature should do a fast-forward merge, but some hosting services for version control do a 3-way merge anyway. To prevent this, you can use git merge feature --ff-only
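The first solution can be tried end to end in a scratch repository. The sketch below assumes git is installed and uses placeholder file names and commit messages; it also shows that --ff-only refuses to merge before the rebase, when the two branches have diverged.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "initial"
git branch -M master

# Three commits on feature.
git checkout -q -b feature
for n in 1 2 3; do
  echo "feature $n" > "f$n.txt"
  git add "f$n.txt"
  git commit -q -m "feature $n"
done

# Two new commits on master in the meantime.
git checkout -q master
for n in 1 2; do
  echo "master $n" > "m$n.txt"
  git add "m$n.txt"
  git commit -q -m "master $n"
done

# Before the rebase the branches have diverged: --ff-only refuses to merge.
if git merge -q --ff-only feature 2>/dev/null; then
  ff_before=accepted
else
  ff_before=refused
fi

# First solution: rebase feature on master, then fast-forward master.
git checkout -q feature
git rebase -q master
git checkout -q master
git merge -q --ff-only feature

total=$(git rev-list --count master)
merges=$(git rev-list --merges --count master)
echo "ff before rebase: $ff_before, commits: $total, merge commits: $merges"
```

The resulting history is linear: the initial commit, the two master commits, then the three rebased feature commits, with no merge commit.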

Second solution:

  • git checkout master
  • git merge feature

3-way Merge (master, feature) (1)

As you can see, the final tree is simpler with the first solution. You simply have a linear git history. By contrast, the second solution creates a new “merge commit” and a loop to show that a merge happened.

In this situation, the git tree is still readable, so the advantage of the first solution is not so obvious. The complexity emerges when you have several developers in your team, and several feature branches developed at the same time. If everyone uses the second solution, your git tree ends up complex with several loops, and it can even be difficult to see which commits belong to which branch!

Unfortunately, the first solution has a few drawbacks:

History rewriting

When you use a rebase, like in the first solution, you “rewrite history” because you replace past commits on your branch with new ones. This can be problematic if several developers work on the same branch: when you rewrite history, you have to use git push --force in order to erase the old branch on the remote repository and put your new branch (with the new history) in its place.

This can potentially erase changes another developer made, or force them to resolve conflicts.

To avoid this problem, you should only rebase branches on which you are the only one working. In our case, that means rebasing the feature branch only if you are the only one working on it.

However, you might sometimes have to rewrite the history of shared branches. In this case, make sure that the other developers working on the branch are aware of it, and are available to help you if you have conflicts to resolve.
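To see why the forced push is needed, here is a sketch that uses a local bare repository as a stand-in for the remote. The directory layout, the "origin" name and the commit contents are assumptions made for the demonstration; it assumes git is installed.

```shell
set -eu

work=$(mktemp -d)
# A local bare repository plays the role of the remote.
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "base" > base.txt
git add base.txt
git commit -q -m "initial"
git branch -M master
git push -q origin master

# Publish feature with one commit.
git checkout -q -b feature
echo "f" > f.txt
git add f.txt
git commit -q -m "F1"
git push -q origin feature

# master moves on, then feature is rebased: F1 is replaced by a new commit.
git checkout -q master
echo "m" > m.txt
git add m.txt
git commit -q -m "M1"
git checkout -q feature
git rebase -q master

# A plain push is rejected: the remote feature is not an ancestor of ours.
if git push -q origin feature 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# The forced push overwrites the remote branch with the rewritten history.
git push -q --force origin feature

local_tip=$(git rev-parse feature)
remote_tip=$(git --git-dir="$work/remote.git" rev-parse feature)
echo "plain push: $plain_push, local=$local_tip remote=$remote_tip"
```

In this scenario nobody else pushed to feature, so nothing is lost. If a colleague had pushed a commit there in the meantime, the forced push would have discarded it from the remote branch.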

The obvious advantage of the 3-way merge here is that you don’t rewrite history at all.

Conflict resolution

When you merge or rebase, you might have to resolve conflicts.

What I like about the rebase is that the conflicts introduced by one commit are resolved in that same commit. By contrast, the 3-way merge resolves all the conflicts in the new “merge commit”, mixing together the conflicts introduced by the different commits of your feature branch.

The only problem with the rebase is that you may have to resolve more conflicts, due to the fact that the rebase applies the commits of your branch one by one.
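The per-commit resolution can be observed with a deliberately conflicting change. This is a sketch, assuming git is installed; the file contents are made up, and GIT_EDITOR=true is only there so that git rebase --continue does not open an editor in a script.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "initial"
git branch -M master

# feature: F1 edits file.txt, F2 adds an unrelated file.
git checkout -q -b feature
echo "feature" > file.txt
git commit -q -am "F1"
echo "other" > other.txt
git add other.txt
git commit -q -m "F2"

# master edits the same line, which will conflict with F1.
git checkout -q master
echo "master" > file.txt
git commit -q -am "M1"

# The rebase stops while replaying F1.
git checkout -q feature
if git rebase master >/dev/null 2>&1; then
  stopped=no
else
  stopped=yes
fi

# Resolve the conflict by hand, then let the rebase replay F2.
echo "master+feature" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue >/dev/null

# The resolution lives in the rebased F1 (feature~1), not in a merge commit.
resolved=$(git show feature~1:file.txt)
merges=$(git rev-list --merges --count feature)
echo "stopped=$stopped resolved='$resolved' merge commits=$merges"
```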

Conclusion

To conclude, I hope I have convinced you that rebasing your branch before merging it can clean up your git history a lot! Here is a recap of the advantages and disadvantages of the rebase and merge method versus the 3-way merge method:

[Image: recap table of the advantages and disadvantages of each method]

 



  • corvec

    Any of your readers should also familiarize themselves with the opposing argument (that rebasing in this way should be done sparingly and not as a policy): http://paul.stadig.name/2010/12/thou-shalt-not-lie-git-rebase-ammend.html

  • Jérémie Marniquet Fabre

    Thank you, your article points out some weaknesses of the rebase that I didn’t know about… It’s good to see several points of view!

  • Matthieu Brucher

    There is still a big issue with history rewrite: you never tested the independent commits.
    With a traditional merge, you have your build system that analyzed and tested each of your commits.
    If you rebase, you lose this knowledge, and the individual steps are actually not that worthwhile anymore.
    Another approach used in sklearn is to squash the merge, so you do a rebase, but as you may have lots of add/delete and as your commits are no longer relevant (and may not even build), except the final merge one, you may only require one, squashing all of them together.
    So either standard merge (which is actually very readable) or squash, but rebase is just a half-baked solution IMHO.