Git Tutorial: Creating and Changing Repositories
Tutorial | |
---|---|
Title: | Git Tutorials |
Provider: | HPC.NRW |
Contact: | tutorials@hpc.nrw |
Type: | Online |
Topic Area: | Revision control |
License: | CC-BY-SA |
Syllabus
| |
1. Basic Git overview | |
2. Creating and Changing Repositories | |
3. Branching |
Most users are only interested in having some kind of revision control for their own code. Therefore, we will start with the basics, which will allow you to put your code under revision control in a few simple steps. You will also learn how to add changes to the code (this will be done locally on your hard drive; using GitHub follows later). First you need to create a repository for your code. To fully grasp what a repository is, you can read the corresponding Wikipedia page, but for now it is sufficient to think of it as a box into which you put everything that you want to have under revision control. Additionally, the box contains a list, which tracks everything put into the box, everything taken out of it and any changes made to the objects inside.
This tutorial will consist of the following phases (we will stick with the box analogy for now):
- init (initializing your repository): You create the box
- add (adding files to the repository): You note on the list what goes in the box and what changes to apply
- commit (committing your changes to the repository): You put the objects in the box and apply the changes as noted on the list
Creating a new Repository
Before we start, please make sure that you fulfill the following conditions:
- You have some kind of Linux distribution installed
- You have installed Git
- You have a folder (we will call it MyFolder), that you want to put under revision control
- You have set your identity and email through the two commands
user@HPC.NRW:~$ git config --global user.name "Your Name Comes Here"
user@HPC.NRW:~$ git config --global user.email you@yourdomain.example.com
Once those conditions are met, go into MyFolder and initialize the repository by entering
1 user@HPC.NRW:~$ cd MyFolder
2 user@HPC.NRW:~MyFolder/$
3 user@HPC.NRW:~MyFolder/$ git init
4 Initialized empty Git repository in /home/MyFolder/.git/
Your repository has been initialized (line 4). Your box has been created, but currently it is still empty. You need to note which content you want to fill the box/repository with by using the git add command. You can simply note to add the whole directory with
user@HPC.NRW:~MyFolder/$ git add .
If you want to note the addition of one or more particular files you can do this through
user@HPC.NRW:~MyFolder/$ git add someFiles* andMore
which in this case would note to add the file andMore and any files beginning with someFiles into the repository (e.g., someFiles2 or someFilesWithMoreTextAndNumb3rsAtTheEnd).
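To see what has actually been noted on the list, git status --short can be used. The following is a minimal sketch that works in a throwaway directory created with mktemp; the file names are the ones from the text plus a hypothetical unrelatedFile that is deliberately not added.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"          # throwaway directory, nothing real is touched
cd "$workdir"
git init -q MyFolder
cd MyFolder
touch someFiles1 someFiles2 andMore unrelatedFile

# The shell expands someFiles* to the matching file names before git runs
git add someFiles* andMore

# Noted (staged) files are marked "A ", files git does not track yet "??"
status="$(git status --short)"
echo "$status"
```

Files that still show up with ?? have not been noted and will not be part of the next commit.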
Once you are sure what goes in, it is time to put it in the box and apply the changes. The git commit command is used to put something in and apply the changes. This command has to be accompanied by a message, which describes the changes you have made to the repository. The fastest way to do this is by adding the message directly to the commit through the option -m:
user@HPC.NRW:~MyFolder/$ git commit -m "Repository/Box has been created and filled with some initial files"
This command will be followed by information regarding your commit, such as the number of files and changes it contained. At this point, it will also inform you that you have committed to your master branch (or main, depending on your Git configuration). Branches will be covered later, so you do not need to worry about them.
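If you want to double-check the result of a first commit, the sketch below may help. It runs in a temporary directory and sets the identity only for that repository, using the placeholder values from above; the commit message is just an example.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q MyFolder
cd MyFolder
# Placeholder identity, valid for this repository only
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "first draft" > andMore
git add .
git commit -q -m "Create repository and add initial files"

branch="$(git symbolic-ref --short HEAD)"   # master or main, depending on configuration
commits="$(git rev-list --count HEAD)"      # number of commits reachable from HEAD
pending="$(git status --short)"             # empty when everything is committed
echo "branch=$branch commits=$commits pending='$pending'"
```

An empty pending value means that the box and your working directory are in the same state.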
Modifying your Repository
You actually already know everything you need to apply further changes to your repository/box. Let's assume you want to add a new file newFile to the repository and that you have also made some adjustments to the file andMore, which is already inside the box. As the box is already created, you only need to note what you want to add and change:
user@HPC.NRW:~MyFolder/$ git add andMore newFile
Of course you can also simply use git add . to note all the changes that have been made so far. Afterwards, put everything in the box and apply the changes through git commit -m "My commit message". If you have not added any new files, but only made changes to existing ones, then you can type
user@HPC.NRW:~MyFolder/$ git commit -a -m "My commit message with no new files added"
This will automatically perform the git add command on files which have changed since your last commit (again: no new files will be added this way!).
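The difference between changed and new files can be checked with a small experiment. This is only a sketch in a temporary directory with a placeholder identity; andMore and newFile are the file names used above.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q MyFolder
cd MyFolder
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "v1" > andMore
git add andMore
git commit -q -m "Add andMore"

echo "v2" >> andMore     # change to a file that is already in the box
touch newFile            # brand-new file that git does not know yet
git commit -q -a -m "Update andMore"

# Only newFile is left over, still untracked
status="$(git status --short)"
echo "$status"           # prints: ?? newFile
```

To include newFile as well, it first has to be noted once with git add newFile.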
Deciding what should never go into the Repository
Usually you want to keep your repository tidy without unnecessary file bloat. Such bloat can occur, for example, when you compile your repository's code and keep the binaries inside. Entering git add . would add the binaries to the repository, and git commit -a would afterwards record every rebuilt binary again, which is usually not desirable as it leads to a larger repository size.
The easiest way to control what should not go in your repository is the .gitignore file. Simply create the file in your repository and note inside what you do not want to have revision control for. If you do not want to track any .txt files and the logo.jpg file in particular, you could do the following:
user@HPC.NRW:~MyFolder/$ touch .gitignore
user@HPC.NRW:~MyFolder/$ echo -e "*.txt\nlogo.jpg" > .gitignore
Naturally, you can add anything, but below you will find a .gitignore example which contains a good list of file types that can usually be ignored.
Example of a comprehensive .gitignore file
# SLURM output #
################
*.out
*.err
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
# Packages #
############
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
*.tar
*.zip
# Logs and databases #
######################
*.log
*.sql
*.sqlite
# OS generated files #
######################
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
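Whether a rule in .gitignore really matches a given file can be tested with git check-ignore. The sketch below uses the two rules from the short example (*.txt and logo.jpg) in a temporary directory; notes.txt and main.c are hypothetical file names chosen for illustration.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q MyFolder
cd MyFolder
printf '%s\n' '*.txt' 'logo.jpg' > .gitignore
touch notes.txt logo.jpg main.c

# Prints only those of the given paths that are ignored
ignored="$(git check-ignore notes.txt logo.jpg main.c)"
echo "$ignored"

# git add . now skips the ignored files
git add .
staged="$(git status --short)"
echo "$staged"
```

Note that .gitignore only affects files that are not tracked yet; a file that has already been committed stays under revision control even if a rule matches it later.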
Going back to a previous Version
One advantage of revision control is the possibility of easily going back to an older version of your repository. Since every git commit has recorded the changes you noted with git add, Git can use these records to switch back to a previous state/version. In order to see all existing versions, use the git log command:
1 user@HPC.NRW:~MyFolder/$ git log
2 commit aab2d012c5a5965d14c440a6727191c19625e6e3 (HEAD -> master)
3 Author: user <tutorials@hpc.nrw>
4 Date: Thu Jul 5 14:15:09 2000 +0200
5
6     add .gitignore file
7
8 commit 30763897a2fc05b13f3ecb3197a9243b4a7941d8
9 Author: user <tutorials@hpc.nrw>
10 Date: Wed Jul 1 03:43:57 2000 +0200
11
12     Create repository/box and fill it with some initial files
This yields information about the authors who made changes to the repository, dates, related commit messages (lines 6 and 12) and commit names/references (lines 2 and 8, the string after commit). You can use git show <commit reference> for detailed information on a particular commit. The long reference is actually not required and the shortened version you get with git log --oneline can be used instead. Remember to inspect the different options you can pass to git log for extended or filtered output.
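A few of these variants can be tried out safely in a scratch repository. The sketch below recreates the two commits from the listing above in a temporary directory (with a placeholder identity, so the hashes will differ from the ones shown) and then inspects them; HEAD~1 is Git's notation for the commit before the current one.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q MyFolder
cd MyFolder
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "initial" > andMore
git add .
git commit -q -m "Create repository/box and fill it with some initial files"
printf '%s\n' '*.txt' 'logo.jpg' > .gitignore
git add .gitignore
git commit -q -m "add .gitignore file"

git log --oneline                          # one line per commit, newest first
short="$(git rev-parse --short HEAD~1)"    # shortened reference of the first commit
git show --stat "$short"                   # details and touched files of that commit
git log --oneline -- .gitignore            # only commits that changed .gitignore
```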
Once you know which version you want to go back to, use the git checkout <commit reference> . command (with . at the end):
user@HPC.NRW:~MyFolder/$ git checkout 3076389 .
Updated 1 path from 83ca202
Please keep in mind that you should attach an appropriate message to your next commit to keep everybody aware of the version revert due to a checkout. Generally, such reverts should not be done without caution, even more so when cooperating on a project. It is preferable to use branches in such cases, a topic covered later.
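The whole cycle of going back and recording the revert as a new commit might look as follows. This is a sketch under the same assumptions as before (temporary directory, placeholder identity); the file contents and commit messages are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q MyFolder
cd MyFolder
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "version 1" > andMore
git add andMore
git commit -q -m "Add andMore"
old="$(git rev-parse --short HEAD)"     # reference of the version to return to

echo "version 2" > andMore
git commit -q -a -m "Update andMore"

git checkout "$old" .                   # restore the files as they were in $old
restored="$(cat andMore)"
echo "$restored"                        # prints: version 1
git status --short                      # the restored content is already noted for the next commit

# Record the revert as a new commit so the history stays complete
git commit -q -m "Restore andMore to its state in $old"
```

The older commits are not deleted by this; the revert simply becomes the newest entry in git log.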
Using a remote Repository
Although Git is useful for keeping track of your own projects, it becomes most valuable when collaborating with others on the same code. In such cases, it is useful to set up a remote repository to which everyone has access and can contribute their changes. This tutorial will focus on how to set up your remote repository with the help of GitHub. At the end we will also show you how to clone a remote repository from GitHub and start contributing to an existing project.
Before we start, please make sure that you have
- registered a GitHub account
- an existing Git repository to upload to GitHub (you can use your local repository from the previous section)
Now, log into your account and make the following preparations:
- Start creating a new GitHub repository
- Give the repository an appropriate name; we chose "Wikipages"
- Set yourself as the owner
- Add a short description
- Set the repository to private
- Do not add any .gitignore or README file, as you are going to use your existing local repository, which should at least have a .gitignore file
- Finish creating the repository
Afterwards you associate your local repository with your remote one and push your commit to GitHub's repository:
1 user@HPC.NRW:~MyFolder/$ git remote add origin https://github.com/HPC.NRW-User/Wikipages.git
2 user@HPC.NRW:~MyFolder/$ git branch -M main
3 user@HPC.NRW:~MyFolder/$ git push -u origin main
4 Username for 'https://github.com': HPC.NRW-user
5 Password for 'https://HPC.NRW-user@github.com': ********
6 Enumerating objects: 5, done.
7 Counting objects: 100% (5/5), done.
8 Delta compression using up to 12 threads
9 Compressing objects: 100% (3/3), done.
10 Writing objects: 100% (3/3), 341 bytes | 341.00 KiB/s, done.
11 Total 3 (delta 2), reused 0 (delta 0)
12 remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
13 To https://github.com/HPC.NRW-user/Wikipages.git
14    75895d2..b4e0116  main -> main
The name origin in line 1 is arbitrary, although it is the conventional name for your remote repository. In theory you could have several remote repositories and always choose which one to use. For now it should be sufficient to stick with conventions and use only one remote repository named origin. Line 2 renames your current branch to main, so you will be on main afterwards. Branches will be covered in the next tutorial, but it is beneficial to start using them early on, in particular when collaborating on a project. Please note that you can set the Git remote repository to be anywhere, e.g., a location in your facility's own network, but for simplicity's sake we stick with GitHub only. Line 3 is the actual push, meaning that afterwards the same files and changes as in your local repository will be put into the remote one. GitHub no longer accepts your account password for Git operations over HTTPS, so enter a personal access token at the password prompt (line 5). Do not forget to commit your changes before pushing them to make sure that everything is up-to-date. However, if you were following this tutorial step by step, then no commit was necessary here, because no changes have been performed before the push.
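Since a remote repository can live anywhere, the same three commands can be practiced without a GitHub account. The sketch below assumes a local bare repository (created with git init --bare) as a stand-in for the GitHub repository; the directory name remote.git and the identity are placeholders.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q --bare remote.git           # stand-in for the repository on GitHub

git init -q MyFolder
cd MyFolder
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com
echo "content" > andMore
git add .
git commit -q -m "Add initial files"

git remote add origin "$workdir/remote.git"   # register the remote under the name origin
git branch -M main                            # rename the current branch to main
git push -q -u origin main                    # upload and remember origin/main as upstream

git remote -v
upstream="$(git rev-parse --abbrev-ref 'main@{upstream}')"
echo "$upstream"                              # prints: origin/main
```

Because of the -u option, later calls can be shortened to a plain git push while you are on main.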
GitHub provides an interactive tutorial for moving your local repository to GitHub. You can check it out in the Useful Links section.
The next step is to learn how you can join an existing remote repository. Usually, you have to make sure that others (or you yourself) are allowed to contribute to a remote repository and to download from it. In our case, you are the only user and therefore no further settings are required. For practicing purposes, we will continue using the current remote repository.
The first thing you often want to do is to clone an existing repository, which basically means you are going to create an empty box and put into it exactly what is in the remote repository box. A separate git init is not needed, because git clone initializes the new repository itself. The . at the end tells git clone to place the files directly in the current (empty) directory instead of a new subdirectory:
user@HPC.NRW:~MyFolder/$ mkdir ~/myClonedFolder
user@HPC.NRW:~MyFolder/$ cd ~/myClonedFolder
user@HPC.NRW:~myClonedFolder/$ git clone https://github.com/HPC.NRW-User/Wikipages.git .
Cloning into '.'...
Username for 'https://github.com': HPC.NRW-User
Password for 'https://HPC.NRW-user@github.com': ********
remote: Enumerating objects: 44, done.
remote: Counting objects: 100% (44/44), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 44 (delta 23), reused 42 (delta 21), pack-reused 0
Unpacking objects: 100% (44/44), done.
That's it. You have successfully cloned a remote repository and can start working on it.
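What git clone sets up for you can be inspected in a small offline experiment. The sketch below again assumes a local bare repository as a stand-in for GitHub (named Wikipages.git here) and a helper repository called seed that only exists to fill it; both names are placeholders.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Build a small stand-in for the remote repository
git init -q seed
cd seed
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com
echo "content" > andMore
git add .
git commit -q -m "Add initial files"
git branch -M main
cd "$workdir"
git clone -q --bare seed Wikipages.git

mkdir myClonedFolder
cd myClonedFolder
# Without a target argument, clone creates a directory named after the repository
git clone -q "$workdir/Wikipages.git"
cd Wikipages
remote_url="$(git remote get-url origin)"   # origin was registered automatically
branch="$(git symbolic-ref --short HEAD)"
echo "origin=$remote_url branch=$branch"
ls
```

Passing a directory as an additional last argument, for example ., makes git clone use that directory instead of creating one named after the repository.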
The last thing you will learn in this tutorial is how to get pushed changes from a remote repository. The prerequisite here is that you already have a local version of a remote repository (through either git clone or git remote add). If you have been following the tutorial up to this point, this should be a given.
What we will do now is:
- Go to the ~/myClonedFolder directory
- Change a file in your cloned repository in ~/myClonedFolder and add a new file clonedRepoFile to it
- add (or note) the changes you want to do
- commit the changes
- push the last commit to the remote repository
- Go back to the initial local repository in ~/MyFolder
- Get the recently pushed changes from the remote repository into the local repository in ~/MyFolder
The points above correspond to the following commands:
user@HPC.NRW:~MyFolder/$ cd ~/myClonedFolder
user@HPC.NRW:~myClonedFolder/$ touch clonedRepoFile
user@HPC.NRW:~myClonedFolder/$ git add clonedRepoFile someChangedFile
user@HPC.NRW:~myClonedFolder/$ git commit -m "add file clonedRepoFile and make some small changes"
user@HPC.NRW:~myClonedFolder/$ git push origin main
user@HPC.NRW:~myClonedFolder/$ cd ~/MyFolder
user@HPC.NRW:~MyFolder/$ git pull origin main
Afterwards, the contents of your folders MyFolder and myClonedFolder should be identical. pull is most commonly used to stay up-to-date with changes other contributors have pushed, not for going back to an older version. The latter will more often than not lead to conflicts with other versions, which need to be painstakingly resolved and which will propagate to future versions for your colleagues once you have committed your changes after pulling an older version.
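The complete round trip can also be rehearsed offline. The following sketch assumes a local bare repository (remote.git) in place of GitHub, a helper repository seed to fill it, and two clones that play the roles of MyFolder and myClonedFolder; all paths live in a temporary directory and the identity is a placeholder.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Seed a tiny project and publish it to the stand-in remote
git init -q seed
cd seed
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com
echo "content" > someChangedFile
git add .
git commit -q -m "Add initial files"
git branch -M main
cd "$workdir"
git clone -q --bare seed remote.git

# Two independent working copies of the same remote
git clone -q "$workdir/remote.git" MyFolder
git clone -q "$workdir/remote.git" myClonedFolder

cd "$workdir/myClonedFolder"
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com
echo "small change" >> someChangedFile
touch clonedRepoFile
git add clonedRepoFile someChangedFile
git commit -q -m "add file clonedRepoFile and make some small changes"
git push -q origin main

cd "$workdir/MyFolder"
git pull -q origin main     # fetch the pushed commit and update the local files
ls
```

After the pull, both working copies point to the same commit and contain clonedRepoFile.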
Best Practices
- This cannot be overstated: WORK THROUGH THE NEXT TUTORIAL ON BRANCHES
- Do not push every commit to the remote repository. Do commits regularly to your local repository, but only push them once a significant amount of changes has been made.
- checkout and pull are not meant for jumping back and forth between different versions of code. checkout should be used sparingly, while pull is intended to get your local repository up-to-date with others' contributions to the remote repository.
- commit messages should be written in imperative form ("add" instead of "added" or "Write new method for..." instead of "Wrote new method for..."). However, this is just a convention.
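When you commit often but push rarely, it helps to see which commits are still local only. One possible way is the range notation origin/main..main, sketched below with a local bare repository as a stand-in remote, a placeholder identity and made-up file contents.

```shell
#!/usr/bin/env bash
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init -q --bare remote.git
git init -q MyFolder
cd MyFolder
git config user.name "Your Name Comes Here"
git config user.email you@yourdomain.example.com

echo "step 0" > andMore
git add .
git commit -q -m "Add initial files"
git remote add origin "$workdir/remote.git"
git branch -M main
git push -q -u origin main

# Two further local commits that are not pushed yet
echo "step 1" >> andMore
git commit -q -a -m "Extend andMore"
echo "step 2" >> andMore
git commit -q -a -m "Extend andMore further"

git log --oneline origin/main..main                    # commits the remote does not have yet
before="$(git rev-list --count origin/main..main)"
git push -q
after="$(git rev-list --count origin/main..main)"
echo "unpushed before=$before after=$after"            # prints: unpushed before=2 after=0
```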
Glossary
add: Keeps track of what should be adjusted in the local repository with your next commit.
commit: Files marked for tracking (aka staging) with add will be adjusted in the current branch of your repository (other branches can be specified).
push: Take your most recent commits and ask for your changes to be integrated into the most recent remote repository version of the desired branch. Usually, when collaborating on a project, this does not mean that the remote version will be identical to yours afterwards, because others have been pushing their changes, too.
checkout: Adjust files in your current directory to an older version (or a different branch).
pull: Similar to checkout, but you access a version from a remote repository. Usually requires some merge adjustments (which have not been covered yet).
Useful Links
- Interactive GitHub course: uploading your project to GitHub