Git Tutorial: Branching

From HPC Wiki



Tutorial
Title: Git Tutorials
Provider: HPC.NRW

Contact: tutorials@hpc.nrw
Type: Online
Topic Area: Revision control
License: CC-BY-SA
Syllabus

1. Basic Git overview
2. Creating and Changing Repositories
3. Branching


Figure 1: Example of a basic workflow with branching. The figure can represent either a local or a remote repository; commit and push operations are not depicted to keep it simple (they happen somewhere along the dashed lines). Initially, two branches are created from the main branch. Two independent developments then take place on these new branches, visualized by the colors green and orange. These branches can belong to the same developer or to two different developers. Once the green segment is finished, its contribution is merged into the main branch and work on the green branch stops. Meanwhile, the orange segment is developed in parallel. When its development is finished, a new branch is created from it. Instead of the orange branch being merged directly into the main branch, development continues on the new branch with some light purple features, while in parallel some brown features are added to orange (e.g., bug fixes for the orange code). Later, a merge operation combines the orange/brown and orange/purple branches, and subsequently those orange/brown/purple changes are merged into the main branch as well. At this point the main branch contains contributions from all branches. Unlike the green branch, the orange/brown/purple branch continues to exist, because its purple features are still being expanded and will later be merged into the main branch again.

Branching is an important feature of Git that enables simultaneous work on different parts of a code base with minimal interference between the parties involved. As the name implies, branching refers to the creation of a separate line of development that branches off from your "main" branch (itself a branch, but best understood as the trunk). When developing new features you do not necessarily want to commit your changes directly to "main", because they might interfere with features your colleagues contribute on a different branch. Even when the changes do not directly interfere, they can still make merging both branches into "main" more cumbersome once all changes are complete. In larger projects, merging into the "main" branch is often blocked for most users. Instead, a so-called pull request must be made, so that a responsible person can pull the changes from your branch and merge them into "main". To view all existing branches use the git branch --list command:

1user@HPC.NRW:~$ git branch --list
2* main

We can see that currently the project contains only one branch called "main" (line 2). The "*" next to it is used to mark our current branch. Before we continue with the tutorial, a look at Figure 1 might be helpful. It visualizes the basic idea behind branching within a repository. Moreover, we can see the similarities between the relation of other branches to the main branch and the relation of a local repository to the remote repository.

Handling Branches

Creating a new branch is as easy as entering git branch <name of branch>:

1user@HPC.NRW:~$ git branch branchingTut
2user@HPC.NRW:~$ git branch --list
3  branchingTut
4* main
5user@HPC.NRW:~$ git branch -a
6  branchingTut
7* main
8  remotes/origin/main

In the example above we can see that creating a new branch does not imply switching to it (the "*" is still next to "main"). Furthermore, displaying all local and remote branches with git branch -a shows that, unlike "main" (line 8), the new branch exists only locally and not in the remote repository. The new branch starts at the last commit of the current branch, in our case "main", so its contents are initially identical.
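The behaviour described above can be checked in a throwaway repository. The following is a minimal sketch, assuming a temporary directory created with mktemp and a placeholder user identity (both are illustrative and not part of the tutorial's repository):

```shell
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"

git branch branchingTut                    # create the branch without switching
git branch --list                          # the "*" still marks main

current=$(git symbolic-ref --short HEAD)   # branch we are currently on
main_sha=$(git rev-parse main)
new_sha=$(git rev-parse branchingTut)
echo "current branch: $current"
echo "main:           $main_sha"
echo "branchingTut:   $new_sha"            # same commit hash as main
```

Both branch names resolve to the same commit hash, which is why the new branch initially has the same contents as "main".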

To switch to a different branch we use git checkout <name of branch>:

1user@HPC.NRW:~$ git checkout branchingTut
2M	someFile
3Switched to branch 'branchingTut'
4user@HPC.NRW:~$ git branch -a
5* branchingTut
6  main
7  remotes/origin/main

Line 2 informs us about a modified (M) file named "someFile". This happens if we change files in our repository (e.g., by adding or modifying them) but do not commit the changes before switching branches. Git carries such uncommitted changes over to the new branch, provided they do not clash with differences between the two branches. Keep in mind that git checkout <name of branch> is not like a git clone operation and therefore does not make a full copy of a different code branch. Instead, it updates the files in our working directory to match the latest commit of the chosen branch. Any modifications we commit now will be added to that branch.
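To see how an uncommitted modification travels along during a checkout, a small experiment can help. This is only a sketch in a temporary repository; the directory, the identity and the file contents are assumptions made for illustration:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"
git branch branchingTut

echo "uncommitted edit" >> someFile        # modify the file, but do not commit
git checkout branchingTut                  # reports the file with an "M"
# With Git 2.23 or newer, "git switch branchingTut" does the same.

after=$(git symbolic-ref --short HEAD)     # branch we are on now
pending=$(git status --porcelain)          # the modification is still pending

git commit -q -a -m "Edit someFile"        # this commit lands on branchingTut only
main_sha=$(git rev-parse main)
branch_sha=$(git rev-parse branchingTut)
```

After the final commit the two branches point to different commits, because the change was committed on "branchingTut" and "main" stayed where it was.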

To delete a branch we type git branch -d <name of branch>. However, Git refuses this if the branch contains commits that have not been merged yet (into its upstream branch or, if it has none, into the current branch), and it never deletes the branch we are currently on. We either have to merge the branch first or use the option -D instead of -d to force the deletion. If we need to restore a deleted branch we can use git checkout -b <new/old name of branch> <SHA of its last commit>. The SHA value was touched upon previously.
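Deleting and restoring a branch could look roughly like the sketch below. It assumes a temporary repository with a placeholder identity, and it saves the branch's SHA in a shell variable before deleting it:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"

git checkout -q -b branchingTut            # create the branch and switch to it
echo "branch work" >> someFile
git commit -q -a -m "Work on branchingTut"
git checkout -q main

sha=$(git rev-parse branchingTut)          # remember the last commit of the branch
if ! git branch -d branchingTut; then      # refused: the commit is not merged into main
    echo "-d was refused, forcing the deletion with -D"
    git branch -D branchingTut
fi
git branch --list                          # only main is left

git checkout -q -b branchingTut "$sha"     # recreate the branch from the saved SHA
restored=$(git rev-parse branchingTut)
```

If the SHA was not written down beforehand, git reflog usually still lists it for some time.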

Sometimes we want to switch to a different branch, but we have uncommitted changes in the current branch. If the switch would overwrite those changes, Git will not allow it. If we do not want to perform a commit, we can use the git stash functionality. With it we can safely put away all changes made to tracked files since the last commit. Afterwards, we can restore what has been put into the stash with git stash pop. Here is an example:

otherUser@HPC.NRW:~$ git checkout main
error: Your local changes to the following files would be overwritten by checkout:
	Branching
Please commit your changes or stash them before you switch branches.
Aborting
otherUser@HPC.NRW:~$ git stash
Saved working directory and index state WIP on branchingTut: 802006b Add branch deletion to conflictless merge
otherUser@HPC.NRW:~$ git checkout main
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 15 commits.
  (use "git push" to publish your local commits)
otherUser@HPC.NRW:~$ git checkout branchingTut
Switched to branch 'branchingTut'
otherUser@HPC.NRW:~$ git stash pop
On branch branchingTut
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   Branching

no changes added to commit (use "git add" and/or "git commit -a")
Dropped refs/stash@{0} (fe7bb4e2c792a6dcecd4e576a22a6640410cb078)

It is possible to have multiple stashes at once, to apply them to other branches, and much more. If you are interested, we suggest Git's easy-to-understand section on working with stashes.
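As a small illustration of applying a stash to another branch, the sketch below stashes a change on "main" and pops it on "branchingTut". The temporary repository, the identity and the file contents are assumptions chosen for this example:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"
git branch branchingTut

echo "work in progress" >> someFile        # uncommitted change on main
git stash                                  # put the change away
clean=$(git status --porcelain)            # empty: the working directory is clean
git stash list                             # shows one entry, stash@{0}

git checkout -q branchingTut               # switch while the change is stashed
git stash pop                              # restore the change on the other branch
restored=$(git status --porcelain)
remaining=$(git stash list)                # empty again after the pop
```

The pop works without a conflict here because "branchingTut" still points to the same commit as "main".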

Remote Branches

If others are supposed to have access to our branch (or we want to have access from different locations), we need to make sure that it also exists in the remote repository by entering git push <name of remote repository> <name of branch>:

 1user@HPC.NRW:~$ git push origin branchingTut
 2Username for 'https://github.com': HPC.NRW-User
 3Password for 'https://HPC.NRW-User@github.com': ********
 4Enumerating objects: 15, done.
 5Counting objects: 100% (15/15), done.
 6Delta compression using up to 12 threads
 7Compressing objects: 100% (13/13), done.
 8Writing objects: 100% (13/13), 42.00 KiB | 10.50 MiB/s, done.
 9Total 13 (delta 6), reused 0 (delta 0)
10remote: Resolving deltas: 100% (6/6), completed with 1 local object.
11remote: 
12remote: Create a pull request for 'branchingTut' on GitHub by visiting:
13remote:      https://github.com/HPC.NRW-User/Wikipages/pull/new/branchingTut
14remote: 
15To https://github.com/HPC.NRW-User/Wikipages.git
16 * [new branch]      branchingTut -> branchingTut
17 
18user@HPC.NRW:~$ git branch -a
19* branchingTut
20  main
21  remotes/origin/branchingTut
22  remotes/origin/main

This is very similar to the previous push operations with the only difference being that instead of the "main" branch in the remote repository, we will be using the "branchingTut" branch. As this branch did not exist yet, it will be created for us. In lines 18 to 22 we can see all existing branches including the newly created remote branch.
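Pushing a new branch can also be tried without a GitHub account. The sketch below uses a local bare repository as a stand-in for the remote; the directory names and the identity are hypothetical:

```shell
work=$(mktemp -d)
remote="$work/Wikipages.git"               # local stand-in for the remote (assumption)
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"
git remote add origin "$remote"
git push -q origin main                    # publish main first

git branch branchingTut
git push origin branchingTut               # creates the branch in the remote
git branch -a                              # now also lists remotes/origin/branchingTut

local_sha=$(git rev-parse branchingTut)
remote_sha=$(git -C "$remote" rev-parse refs/heads/branchingTut)
```

The branch in the bare repository points to the same commit as the local branch, which is what the "[new branch]" line of the push output announces.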


A remote checkout works the same way a local checkout does. The main difference is that your local repository might not be up to date with remote branch information, which is why we might miss updates on existing branches (as well as respective git log data). Below, you can see Git output for a user who cloned our repository after the first tutorial, but before branching was introduced:

otherUser@HPC.NRW:~$ git branch -a
* main
  remotes/origin/HEAD -> origin/main
  remotes/origin/main

In order to see the newest changes in the remote repository, they need to use the fetch command:

 1otherUser@HPC.NRW:~$ git fetch
 2Username for 'https://github.com': HPC.NRW-otherUser
 3Password for 'https://HPC.NRW-otherUser@github.com': 
 4remote: Enumerating objects: 15, done.
 5remote: Counting objects: 100% (15/15), done.
 6remote: Compressing objects: 100% (7/7), done.
 7remote: Total 13 (delta 6), reused 13 (delta 6), pack-reused 0
 8Unpacking objects: 100% (13/13), done.
 9From https://github.com/HPC.NRW-User/Wikipages
10 * [new branch]      branchingTut -> origin/branchingTut
11
12otherUser@HPC.NRW:~$ git branch -a
13* main
14  remotes/origin/HEAD -> origin/main
15  remotes/origin/branchingTut
16  remotes/origin/main

Line 10 informs them that a new branch exists in the remote repository. Lines 12 to 16 show that they can now see the "branchingTut" branch; however, they still do not have a local version of it. Using git checkout branchingTut they can then create a local branch based on the remote one. As the branch already exists in the remote repository, the option -b is not required for branch creation using checkout.
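The whole sequence of cloning early, fetching later and checking out the remote branch can be replayed locally. This is a sketch under the assumption that a bare repository in a temporary directory plays the role of the remote, with two hypothetical working copies named "user" and "otherUser":

```shell
work=$(mktemp -d)
remote="$work/Wikipages.git"               # local stand-in for the remote (assumption)
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# First user publishes main.
git init -q "$work/user"
cd "$work/user"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"
git remote add origin "$remote"
git push -q origin main

# Second user clones before the new branch exists.
git clone -q "$remote" "$work/otherUser"

# First user creates and publishes the branch.
git branch branchingTut
git push -q origin branchingTut

# Second user: the branch is unknown until a fetch is performed.
cd "$work/otherUser"
before=$(git branch -r)                    # no branchingTut yet
git fetch -q
git checkout branchingTut                  # creates a local branch from origin/branchingTut
upstream=$(git rev-parse --abbrev-ref 'branchingTut@{upstream}')
echo "branchingTut tracks $upstream"
```

The checkout without -b works because exactly one remote offers a branch with that name, so Git sets up the local branch to track it.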


Merging Branches

Once work on a branch has progressed sufficiently, the changes can be merged into "main", i.e., "main" will then contain all changes and additions from the other branch. If the changes on the branch concern a separate file or a new section in existing files, the merge can be performed without any additional work. However, sometimes merge conflicts can occur, for example when two branches modify the same code sections, or when "main" has changed in the same places after the branch was created from it. We will cover both scenarios, beginning with the easy one, where everything works fine.

Generally, git merge <name of branch to merge> combines the given branch into our currently active branch, but first we need some content to merge. For a conflict free merge we will do the following:

  1. Switch to "main"
  2. Make sure it is locally up to date
  3. Create a new branch "tempBranch" out of "main"
  4. Create a new file "tempFile" and add it to "tempBranch"
  5. Merge "tempBranch" into "main", so that "tempFile" can also be found in "main"
  6. Write something into the first line of "tempFile" on "main"
  7. Write the same thing into the first line of "tempFile" on "tempBranch"
  8. Write something into the second line of "tempFile" on "tempBranch"
  9. Merge "tempBranch" into "main" again
  10. Remove "tempBranch"

Above, we have purposefully not mentioned any commit operations. Try it yourself or check the solution below.
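One possible way through steps 1 to 5 and step 10, including the commit operations, is sketched here. It runs in a temporary repository without a remote, so step 2 appears only as a comment; steps 6 to 9 are left out of this sketch. The directory and the identity are assumptions:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"

git checkout -q main                       # step 1: switch to main
# step 2: with a remote configured, "git pull" would bring main up to date here
git checkout -q -b tempBranch              # step 3: create tempBranch and switch to it
touch tempFile                             # step 4: new (empty) file on tempBranch
git add tempFile
git commit -q -m "Add tempFile"
temp_sha=$(git rev-parse tempBranch)

git checkout -q main                       # step 5: merge tempBranch into main
git merge tempBranch                       # main has not changed, so this fast-forwards
main_sha=$(git rev-parse main)

git branch -d tempBranch                   # step 10: -d suffices, the branch is merged
```

Because "main" did not change after "tempBranch" was created, the merge only moves "main" forward to the commit of "tempBranch" and cannot produce a conflict.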

Next, we will create a merge with a conflict. As mentioned before, a conflict can occur when two users try to modify the same parts of a code, because it becomes unclear which changes have to be incorporated into the merge and which not. To reproduce such a situation, we can mostly repeat the steps from the previous merge without conflicts. Here are some instructions with a solution and remarks below:

  1. Create two branches "tempBranch" and "toMergeBranch" (it does not matter from which branch)
  2. Create a file "tempFile" on both branches
  3. Write something in the first line of "tempFile" on "tempBranch"
  4. Write something in the second line of "tempFile" on "toMergeBranch"
  5. Merge "toMergeBranch" into "tempBranch" providing a merge message
  6. Resolve the conflict (see conflict solving basics below)
  7. Delete branches "tempBranch" and "toMergeBranch"

Below we provide an example for a file containing a merge conflict. In this case, it is the file "tempFile" from the exercise above.

<<<<<<< HEAD
write in first line
=======

something in second line
>>>>>>> toMergeBranch

The <<<<<<< HEAD marker begins the conflicting section as it exists on our current branch (the branch we wanted to merge into); ======= separates this content from the content of the branch we wanted to merge. Therefore, everything between ======= and >>>>>>> toMergeBranch is the corresponding content on branch "toMergeBranch". Anything outside of those markers was merged without conflict. Further, those markers can appear multiple times in one file, should several conflicts exist. An easy way to check which files contain conflicts is to run git status, which lists them as unmerged paths, or to look into ./.git/MERGE_MSG. This file is only present while a merge is still unfinished.
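The conflict exercise can be reproduced with the following sketch. It assumes a temporary repository and a placeholder identity, and it resolves the conflict by keeping both lines, which is only one of several reasonable resolutions:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity (assumption)
git config user.email "user@example.org"
echo "hello" > someFile
git add someFile
git commit -q -m "Initial commit"
git branch tempBranch
git branch toMergeBranch

git checkout -q tempBranch
echo "write in first line" > tempFile
git add tempFile
git commit -q -m "Add tempFile on tempBranch"

git checkout -q toMergeBranch
printf '\nsomething in second line\n' > tempFile
git add tempFile
git commit -q -m "Add tempFile on toMergeBranch"

git checkout -q tempBranch
if ! git merge -m "Merge toMergeBranch into tempBranch" toMergeBranch; then
    conflicted=$(git diff --name-only --diff-filter=U)   # files with unresolved conflicts
    echo "conflicts in: $conflicted"
    cat tempFile                                         # shows the conflict markers
fi
markers=$(grep -c '^<<<<<<< ' tempFile)

# Resolve by hand: keep both lines and remove the markers.
printf 'write in first line\nsomething in second line\n' > tempFile
git add tempFile                           # marks the conflict as resolved
git commit -q -m "Merge toMergeBranch into tempBranch"   # concludes the merge

git checkout -q main
git branch -D tempBranch toMergeBranch     # -D, because neither branch is merged into main
```

The merge fails here because both branches added "tempFile" with different contents, so Git cannot decide which version to keep.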


Best Practices

  • Always start a new branch when working on a new feature and merge your changes and additions into the main branch, once the work is finished.
  • Common branches are: main, develop, feature, release, hotfix. Naturally, those branches can have branches as well, e.g., when multiple features are being developed at the same time. However, depending on your environment, completely different naming conventions can be sensible.
  • Several strategies exist on the structured creation of branches and branching conventions. We cover them here.

Glossary

  • branch: Used for branch operations (e.g., creation or deletion).
  • switch: Similar to checkout, but reserved for switching branches. New feature, so functionality might change.
  • fetch: Used to update information on remote repositories (and branches).
  • merge: Used to combine one branch into another.
  • stash: Used to temporarily store away uncommitted changes. Can be retrieved with a git stash pop.


Useful Links