Git Tutorial: Creating and Changing Repositories
Topic Area: Revision control
Most users simply want some form of revision control for their own code. We therefore start with the basics, which let you put your code under revision control in a few simple steps and record further changes to it. All of this happens locally on your hard drive; using GitHub follows later. First, you need to create a repository for your code. To fully grasp what a repository is, you can read the corresponding Wikipedia page, but for now it is sufficient to think of it as a box into which you put everything that you want to have under revision control. In addition, the box contains a list that tracks everything put into the box, everything taken out of it, and any changes made to the objects inside.
This tutorial consists of two blocks. The first one focuses on your local repository and covers the following topics (we will stick with the box analogy for now):
- Initializing your repository: You create the box
- Adding files to the repository: You note on the list what goes into the box and which changes to apply
- Committing your changes to the repository: You put the objects into the box and apply the changes as noted on the list
- Excluding files from your repository: You decide what should never be noted for the box
The second block teaches you how to get ready for collaboration by using a remote repository, a box used by everyone. The concepts are similar to those for the local repository, and the topics covered are:
- Pushing your latest commit to the remote repository: Changes applied to your local box should be applied to the remote box, too
- Pulling the latest version from the remote repository: Changes from the remote box are applied to your box
Keep in mind that pull usually requires additional merge work, which will be covered in a later tutorial. However, if you stick with this tutorial, you should be fine.
Many of Git's command-line interactions can be easily reproduced through GitHub's GUI. In addition to working through this tutorial, we strongly recommend that you take a look at the suggested interactive GitHub tutorials in the Useful Links section. They were chosen to closely match this Wiki tutorial.
Creating a new Repository
Before we start, please make sure that you fulfill the following conditions:
- You have some kind of Linux distribution installed
- You have installed Git
- You have a folder (we will call it MyFolder) that you want to put under revision control
- You have set your name and email address through the following two commands:
user@HPC.NRW:~$ git config --global user.name "Your Name Comes Here"
user@HPC.NRW:~$ git config --global user.email firstname.lastname@example.org
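As a quick check that both settings were stored, the values can be read back with git config --get. The following is only a sketch: it redirects HOME to a throwaway directory (an assumption made here so that your real ~/.gitconfig stays untouched), and the name and email are the placeholders used above.

```shell
# Sketch: set the Git identity and read it back.
# Assumption: HOME is redirected to a scratch directory so that the real
# ~/.gitconfig is not modified by this experiment.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name "Your Name Comes Here"
git config --global user.email "firstname.lastname@example.org"

# Read the stored values back
name=$(git config --global --get user.name)
email=$(git config --global --get user.email)
echo "user.name:  $name"
echo "user.email: $email"
```

On your own machine you would skip the first two commands and run only the git config lines.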
Once those conditions are met, go into MyFolder and initialize the repository by entering
user@HPC.NRW:~$ cd MyFolder
user@HPC.NRW:~MyFolder/$ git init
Initialized empty Git repository in /home/MyFolder/.git/
The last line of the output confirms that your repository has been initialized. Your box has been created, but it is still empty. You need to note which content you want to put into the box/repository by using the git add command. You can note the whole directory with
user@HPC.NRW:~MyFolder/$ git add .
If you want to note the addition of one or more particular files you can do this through
user@HPC.NRW:~MyFolder/$ git add someFiles* andMore
which in this case notes that the file andMore and any files whose names begin with someFiles (e.g., someFiles2 or someFilesWithMoreTextAndNumb3rsAtTheEnd) are to be added to the repository.
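To see what such a pattern actually notes on the list, git status --short can be run after git add. The sketch below works in a scratch directory created with mktemp; the file notListed is a hypothetical extra file, added only to show that it is left out.

```shell
# Sketch: stage files by name and by shell wildcard, then inspect the result.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q

touch someFiles2 someFilesWithMoreTextAndNumb3rsAtTheEnd andMore notListed

# The shell expands someFiles* before Git sees the arguments
git add someFiles* andMore

# Staged files are marked "A ", files not noted for the box are marked "??"
git status --short

staged=$(git diff --cached --name-only | wc -l | tr -d ' ')
echo "staged files: $staged"
```

The file notListed stays untracked until it is named explicitly or covered by git add .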
Once you are sure what goes in, it is time to put it in the box and apply the changes. This is done through the git commit command. It has to be accompanied by a message that describes the changes you have made to the repository. Using the imperative in commit messages is a widely suggested convention. The message can be written directly with the commit through the -m option:
user@HPC.NRW:~MyFolder/$ git commit -m "Create repository/box and fill it with some initial files"
This command will be followed by information regarding your commit, such as the number of files and changes it contains. At this point, it will also inform you that you have committed to your master branch. Branches will be covered later, so you do not need to worry about them.
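The whole cycle of initializing, adding, and committing can be tried out safely in a scratch directory. The following is a minimal sketch under a few assumptions: the directory comes from mktemp, the identity is set only for this repository, and the branch name that gets printed depends on your Git configuration (master in the output above, possibly another name on newer setups).

```shell
# Sketch: create a repository, stage one file and make the first commit.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q

# Repository-local identity, so the sketch does not rely on global settings
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"

echo "hello" > andMore
git add .
git commit -m "Create repository/box and fill it with some initial files"

# Show which branch received the commit and what its message is
branch=$(git symbolic-ref --short HEAD)
subject=$(git log -1 --format=%s)
echo "branch:  $branch"
echo "message: $subject"
```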
Modifying your Repository
You already know everything you need to apply further changes to your repository/box. Let's assume you want to add a new file newFile to the repository and that you have also made some adjustments to the file andMore, which is already inside the box. As the box already exists, you only need to note what you want to add and change:
user@HPC.NRW:~MyFolder/$ git add andMore newFile
Of course, you can also simply use git add . to note all the changes that have been made so far. Afterwards, put everything in the box and apply the changes through git commit -m "My commit message". If you have not added any new files, but only made changes to existing ones, then you can type
user@HPC.NRW:~MyFolder/$ git commit -a -m "My commit message with no new files added"
This will automatically perform the git add command on all tracked files which have changed since your last commit (again: no new files will be added this way!).
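The difference between changed and new files can be made visible with a small experiment. The sketch below is an illustration in a scratch directory (the mktemp path and the repository-local identity are assumptions of the sketch): andMore is already tracked and gets modified, while newFile is created but never added.

```shell
# Sketch: "git commit -a" picks up changes to tracked files only.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"

echo "v1" > andMore
git add andMore
git commit -q -m "Add andMore"

echo "v2" > andMore        # change a file that is already in the box
touch newFile              # create a file that was never added

git commit -q -a -m "Update andMore"

# newFile is still untracked ("??"); the change to andMore is committed
git status --short
```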
Excluding files from the Repository
Usually you want to keep your repository tidy and free of unnecessary files. Bloat can creep in, for example, when you compile your repository's code and keep the binaries inside. Entering git add . would then add the binaries to the repository, which is usually not desirable as they increase the repository size.
The easiest way to control what should not go into your repository is the .gitignore file. Simply create the file in your repository and note inside what you want to exclude from revision control. If you do not want to track any .txt files and the logo.jpg file in particular, you could do the following:
user@HPC.NRW:~MyFolder/$ touch .gitignore
user@HPC.NRW:~MyFolder/$ echo -e "*.txt\nlogo.jpg" > .gitignore
Naturally, you can exclude anything, but below you find a .gitignore example which contains a comprehensive list of file types that can usually be ignored.
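To check that the two patterns from the commands above behave as intended, git check-ignore reports which rule excludes a given file. This is only a small sketch with hypothetical file names (notes.txt, main.c) in a scratch directory; it is not the comprehensive list mentioned above.

```shell
# Sketch: verify which files a .gitignore excludes.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q

# Same two patterns as in the tutorial, written with printf for portability
printf '%s\n' '*.txt' 'logo.jpg' > .gitignore

touch notes.txt logo.jpg main.c   # notes.txt and main.c are hypothetical
git add .

# Only .gitignore and main.c are staged; the ignored files are skipped
git status --short

# Show the .gitignore line responsible for each ignored file
git check-ignore -v notes.txt logo.jpg
```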
Going back to a previous Version
One advantage of revision control is the possibility of easily going back to an older version of your repository. Because every commit records the changes you noted with git add, Git can use these records to switch back to a previous state/version. To see all existing versions, use the git log command:
1 user@HPC.NRW:~MyFolder/$ git log
2 commit aab2d012c5a5965d14c440a6727191c19625e6e3 (HEAD -> master)
3 Author: user <email@example.com>
4 Date:   Thu Jul 5 14:15:09 2000 +0200
5
6     add .gitignore file
7
8 commit 30763897a2fc05b13f3ecb3197a9243b4a7941d8
9 Author: user <firstname.lastname@example.org>
10 Date:   Wed Jul 1 03:43:57 2000 +0200
11
12     Create repository/box and fill it with some initial files
This yields information about the authors who made changes to the repository, the dates, the related commit messages (lines 6 and 12), and the commit names/references (lines 2 and 8, the string after commit). You can use git show <commit reference> for detailed information on a particular commit. The long reference is not actually required; the shortened version you get with git log --oneline can be used instead. Remember to inspect the different options you can pass to git log for extended or filtered output.
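The interplay of git log --oneline and git show can be tried on a small two-commit history. The sketch below builds such a history in a scratch directory (mktemp path and repository-local identity are assumptions of the sketch) and then inspects the older commit through its shortened reference.

```shell
# Sketch: list commits in short form and inspect one of them.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"

echo "one" > andMore
git add andMore
git commit -q -m "Create repository/box and fill it with some initial files"

echo '*.txt' > .gitignore
git add .gitignore
git commit -q -m "add .gitignore file"

# One line per commit: shortened reference followed by the message
git log --oneline

# Shortened reference of the commit before the latest one
short=$(git rev-parse --short HEAD~1)

# Details of that commit, with a summary of the files it touched
git show --stat "$short"
```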
Once you know which version you want to go back to, use the git checkout <commit reference> . command (with . at the end):
user@HPC.NRW:~MyFolder/$ git checkout 3076389 .
Updated 1 path from 83ca202
Please keep in mind that you should attach an appropriate message to your next commit to keep everybody aware of the version revert caused by the checkout. Generally, such reverts should be done with caution, even more so when cooperating on a project. It is preferable to use branches in such cases, a topic touched upon later.
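What the checkout with the trailing . does can be observed on a single file with two versions. The following is a minimal sketch in a scratch directory (mktemp path and repository-local identity are assumptions of the sketch); it restores the older content and then records the revert in a new commit.

```shell
# Sketch: restore the files of an older commit and commit the revert.
set -e
work=$(mktemp -d)          # scratch directory, an assumption of this sketch
cd "$work"
git init -q
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"

echo "version 1" > andMore
git add andMore
git commit -q -m "Add andMore"
first=$(git rev-parse --short HEAD)   # reference of the version to go back to

echo "version 2" > andMore
git commit -q -a -m "Update andMore"

# Bring the files back to their state in the first commit;
# the branch itself still points to the latest commit
git checkout "$first" .
cat andMore

# The restored content is already staged, so it can be committed directly
git status --short
git commit -q -m "Revert andMore to the state of $first"
```

Note that files which were added after the chosen commit are left in place by this form of checkout; only paths that exist in the older version are overwritten.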
Using a remote Repository
Although Git is useful for keeping track of your own projects, it becomes most valuable when collaborating with others on the same code. In such cases, it is useful to set up a remote repository to which everyone has access and can contribute their changes. This tutorial focuses on how to set up your remote repository with the help of GitHub. At the end, we will also show you how to clone a remote repository from GitHub and start contributing to an existing project. Please note that the Git remote repository can be located anywhere, e.g., in your facility's own network, but for simplicity's sake we stick with GitHub only.
Before we start, please make sure that you have
- registered a GitHub account
- an existing Git repository to upload to GitHub (you can use your local repository from the previous section)
Now, log into your account and make the following preparations:
- Start creating a new GitHub repository
- Give the repository an appropriate name; we chose "Wikipages"
- Set yourself as the owner
- Add some short description
- Set the repository to private
- Do not add any README file, as you are going to use your existing local repository, which should at least have an
- Finish creating the repository
Afterwards, associate your local repository with the remote one and push your latest commit to GitHub's repository:
1 user@HPC.NRW:~MyFolder/$ git remote add origin https://github.com/HPC.NRW-User/Wikipages.git
2 user@HPC.NRW:~MyFolder/$ git branch -M main
3 user@HPC.NRW:~MyFolder/$ git push -u origin main
4 Username for 'https://github.com': HPC.NRW-User
5 Password for 'https://HPC.NRW-User@github.com': ********
6 Enumerating objects: 5, done.
7 Counting objects: 100% (5/5), done.
8 Delta compression using up to 12 threads
9 Compressing objects: 100% (3/3), done.
10 Writing objects: 100% (3/3), 341 bytes | 341.00 KiB/s, done.
11 Total 3 (delta 2), reused 0 (delta 0)
12 remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
13 To https://github.com/HPC.NRW-User/Wikipages.git
14    75895d2..b4e0116  main -> main
origin in line 1 is an arbitrary, though conventional, name for your remote repository. In theory, you could have several remote repositories and always choose which one to use. For now, it should be sufficient to stick with conventions and use only one remote repository named origin. Line 2 renames your current branch (master, if you followed this tutorial) to main. Branches will be covered in the next tutorial, but it is beneficial to start using them early on, in particular when collaborating on a project. Line 3 is the actual push, meaning that afterwards the same files and changes as in your local repository will be put into the remote one. Note that GitHub no longer accepts your account password at the password prompt for HTTPS access; a personal access token is entered there instead. Do not forget to commit your changes before pushing them to make sure that everything is up-to-date. However, if you were following this tutorial step by step, then no commit was necessary here, because no changes have been performed before the push.
GitHub provides an interactive tutorial for moving your local repository to GitHub. You can check it out in the Useful Links section.
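The three commands for connecting and pushing can also be rehearsed without a GitHub account. The sketch below makes one clearly non-GitHub assumption: a local bare repository (created with git init --bare in a scratch directory) stands in for the remote, so no username or password is requested. Everything else follows the steps described above.

```shell
# Sketch: connect a local repository to a remote and push to it.
# Assumption: a local bare repository stands in for the GitHub repository.
set -e
work=$(mktemp -d)
git init -q --bare "$work/Wikipages.git"

git init -q "$work/MyFolder"
cd "$work/MyFolder"
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"
echo "hello" > andMore
git add andMore
git commit -q -m "Add andMore"

git remote add origin "$work/Wikipages.git"   # name the remote "origin"
git branch -M main                            # rename the current branch to main
git push -q -u origin main                    # push and remember origin/main as upstream

git remote -v      # where "origin" points to
git branch -vv     # main now tracks origin/main
```

With a real GitHub repository, only the path passed to git remote add changes to the HTTPS address shown on the repository page.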
The next step is to learn how you can join an existing remote repository. Usually, you have to make sure that others (or you yourself) are allowed to contribute to a remote repository and to download from it. In our case, you are the only user and therefore no further settings are required. For practicing purposes, we will continue using the current remote repository. The first thing you often want to do is to clone an existing repository, which basically means that you create an empty box and put into it exactly what is in the remote repository box:
user@HPC.NRW:~MyFolder/$ mkdir ~/myClonedFolder
user@HPC.NRW:~MyFolder/$ cd ~/myClonedFolder
user@HPC.NRW:~myClonedFolder/$ git clone https://github.com/HPC.NRW-User/Wikipages.git .
Cloning into '.'...
Username for 'https://github.com': HPC.NRW-User
Password for 'https://HPC.NRW-User@github.com': ********
remote: Enumerating objects: 44, done.
remote: Counting objects: 100% (44/44), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 44 (delta 23), reused 42 (delta 21), pack-reused 0
Unpacking objects: 100% (44/44), done.
That's it. You have successfully cloned a remote repository and can start working on it.
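Cloning can be practiced offline as well. In the sketch below, a local bare repository again stands in for GitHub (an assumption of this sketch, as are the scratch directory and the seed repository used to fill the stand-in remote). git clone sets up the new repository by itself, so the sketch does not call git init in the target folder.

```shell
# Sketch: clone a remote repository into an empty folder.
# Assumption: a local bare repository stands in for the GitHub repository.
set -e
work=$(mktemp -d)

# Prepare a stand-in remote that already contains one commit
git init -q "$work/seed"
cd "$work/seed"
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"
echo "hello" > andMore
git add andMore
git commit -q -m "Add andMore"
git clone -q --bare "$work/seed" "$work/Wikipages.git"

# The actual cloning step: empty folder, then clone into it (note the ".")
mkdir "$work/myClonedFolder"
cd "$work/myClonedFolder"
git clone -q "$work/Wikipages.git" .

git remote -v        # the clone already knows its remote as "origin"
git log --oneline    # and contains the full history
```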
The last thing you will learn in this tutorial is how to get pushed changes from a remote repository. The prerequisite is that you already have a local version of a remote repository (through either git clone or git remote add). If you have been following the tutorial up to this point, this should be a given.
What we will do now is:
- Go to the cloned repository in ~/myClonedFolder
- Change a file in your cloned repository in ~/myClonedFolder and add a new file clonedRepoFile to it
- add (or note) the changes you want to make
- commit the changes
- push the commit to the remote repository
- Go back to the initial local repository in ~/MyFolder
- Get the recently pushed changes from the remote repository into the local repository in ~/MyFolder
The above points correspond to the following lines:
user@HPC.NRW:~MyFolder/$ cd ~/myClonedFolder
user@HPC.NRW:~myClonedFolder/$ touch clonedRepoFile
user@HPC.NRW:~myClonedFolder/$ git add clonedRepoFile someChangedFile
user@HPC.NRW:~myClonedFolder/$ git commit -m "add file clonedRepoFile and make some small changes"
user@HPC.NRW:~myClonedFolder/$ git push origin main
user@HPC.NRW:~myClonedFolder/$ cd ~/MyFolder
user@HPC.NRW:~MyFolder/$ git pull origin main
Afterwards, the contents of your folders MyFolder and myClonedFolder should be identical.
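The complete round trip between the two folders can be sketched as follows. As before, a local bare repository stands in for GitHub and all folders live in a scratch directory; both are assumptions of this sketch, and the file contents "v1" and "v2" are placeholders.

```shell
# Sketch: push from the cloned folder, then pull into the original folder.
# Assumption: a local bare repository stands in for the GitHub repository.
set -e
work=$(mktemp -d)

# Original repository with one commit on main
git init -q "$work/MyFolder"
cd "$work/MyFolder"
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"
echo "v1" > someChangedFile
git add someChangedFile
git commit -q -m "Add someChangedFile"
git branch -M main

# Stand-in remote, registered as origin, and a second working copy
git clone -q --bare "$work/MyFolder" "$work/Wikipages.git"
git remote add origin "$work/Wikipages.git"
git clone -q "$work/Wikipages.git" "$work/myClonedFolder"

# Work in the cloned folder and push the result
cd "$work/myClonedFolder"
git config user.name "Your Name Comes Here"
git config user.email "firstname.lastname@example.org"
echo "v2" > someChangedFile
touch clonedRepoFile
git add clonedRepoFile someChangedFile
git commit -q -m "add file clonedRepoFile and make some small changes"
git push -q origin main

# Back in the original folder: fetch and integrate the pushed changes
cd "$work/MyFolder"
git pull -q origin main
ls
```

Because nothing was changed in MyFolder in the meantime, the pull only moves the branch forward and no merge work is needed.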
pull is most commonly used to stay up-to-date with changes other contributors have pushed, not to go back to an older version. Going back this way will more often than not lead to conflicts with other versions. These conflicts need to be painstakingly resolved, and they will propagate to your colleagues' future versions once you have pushed your changes after pulling an older version.
- This cannot be overstated: WORK THROUGH THE NEXT TUTORIAL ON BRANCHES
- Do not send every commit to the remote repository. Commit regularly to your local repository, but only push your commits once a significant amount of changes has been made.
- checkout and pull are not meant for jumping back and forth between different versions of code. checkout should be used sparingly, while pull is intended to bring your local repository up-to-date with others' contributions to the remote repository.
- commit messages should be written in the imperative form ("add" instead of "added" or "Write new method for..." instead of "Wrote new method for..."). However, this is just a convention.
- commit messages should convey the functional idea of the commit. Therefore, a detailed description of new methods should not be part of the message, because that is what the code itself is for. In general, the messages should be short (less than 50 characters).
- add: Keeps track of what should be adjusted in the local repository with your next commit.
- commit: Files marked for tracking (aka staging) with add will be adjusted in the current branch of your repository.
- push: Take your most recent commit and ask for your changes to be integrated into the most recent remote repository version of the desired branch. Usually, when collaborating on a project, this does not mean that the remote version will be identical to yours afterwards, because others have been pushing their changes, too.
- checkout: Adjust files in your current directory to a different version (or a different branch).
- pull: Similar to checkout, but you access a version from a remote repository. Usually requires some merge adjustments (which have not been covered yet).