Git

Introduction

Git is a version control system that helps developers track changes to their code over time. It allows multiple people to work on the same codebase simultaneously and provides features for branching, merging, and resolving conflicts. Git is widely used in software development, and Git repositories are commonly hosted on platforms like GitHub and GitLab.

Anyone who will work in a large collaboration is strongly encouraged to use git.

Basic git commands

Listing the files that have changed compared to the last commit.

git status

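Showing the actual changes that have not been staged yet. As long as nothing has been staged, this is the difference compared to the last commit; git diff HEAD always compares with the last commit.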

git diff

Showing the difference in a specific file.

git diff <file>
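
As a minimal sketch of how these commands behave, the following script builds a throwaway repository in a temporary directory, edits a tracked file, and inspects the change. The file name notes.txt and the identity values are placeholders chosen for illustration only.

```shell
# Create a throwaway repository (all names here are illustrative).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity, stored in this repo only
git config user.email "user@example.com"

# Commit a first version of a file.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Modify the tracked file without staging it.
echo "second line" >> notes.txt

# List changed files in short form; an unstaged modification is marked with " M".
status_out="$(git status --short)"
echo "$status_out"

# Show the line-level changes for that one file.
diff_out="$(git diff notes.txt)"
echo "$diff_out"
```

The --short option is only used to keep the output compact; plain git status reports the same information in a longer form.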

Listing branches of the git repository

# This shows local branches only
git branch
# This shows remote branches as well
git branch -a

Creating a new branch

git checkout -b <new branch name>

Switching to another existing branch

git checkout <branch>
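
The branch commands above can be tried out safely in a scratch repository. The sketch below assumes nothing beyond git itself; the branch name feature, the file README.md, and the identity values are hypothetical.

```shell
# Scratch repository with one commit (a branch needs a commit to point to).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "first commit"
git branch -M main                         # give the initial branch a known name

# Create a new branch and switch to it in one step.
git checkout -q -b feature
current="$(git rev-parse --abbrev-ref HEAD)"
echo "now on: $current"

# Switch back to the existing branch and list the local branches.
# The current branch is marked with "*".
git checkout -q main
branches="$(git branch)"
echo "$branches"
```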

Showing the commit history of the git repo.

git log
# show the history as a graph (visualizes how branches diverge and merge)
git log --graph

Getting back to a specific commit in the past

git checkout <commit hash>
# getting old commit as a new branch
git checkout -b <branch name> <commit hash>

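Here <commit hash> is the hash that identifies a commit, as listed by git log. Checking out a hash directly leaves the repository in a "detached HEAD" state, so use the second form if you intend to make new commits starting from that point.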
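
To make this concrete, here is a small sketch that creates two commits in a scratch repository and then checks out the older one as a new branch. The names data.txt and old-state are made up for the example, and HEAD~1 (the parent of the latest commit) is used only to avoid copying a hash by hand.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Two commits, so that there is an older state to go back to.
echo "version 1" > data.txt
git add data.txt
git commit -q -m "version 1"
echo "version 2" > data.txt
git add -u
git commit -q -m "version 2"

# One line per commit, newest first, drawn as a graph.
git --no-pager log --graph --oneline

# Look up the hash of the first commit (the parent of the latest one).
old_hash="$(git rev-parse HEAD~1)"

# Check out that commit as a new branch instead of a detached HEAD.
git checkout -q -b old-state "$old_hash"
cat data.txt
```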

Updating the local git repository

Updating the git repository works in two steps. First, we put the file changes into the staging area (also called the index), where the changes that we want to record are prepared; this operation is called add. Second, we save the staged changes to the actual git repository; this is called commit. The typical workflow is:

  1. Check which files have changed with git status.
  2. Add the files you want to save to the git repository.
  3. Once you have finalized the files to add, commit the changes to the local git repository.

The commands to add files are

# add one specific file
git add <file>
# add all files that are already tracked by git and have changed since the last commit (the most useful form for daily development)
git add -u

The command to commit the staged changes to the repository is

git commit -m 'your commit message'

You can discard the uncommitted changes in a file and get it back to its last committed state. This only works before you commit, and if you have already staged the file with git add, the staged version is what gets restored.

git checkout -- <filename>

Note that the above commands act on the branch you are currently on. To commit to a different branch, first switch to it with git checkout <branch> and then use the above commands.
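
The whole cycle can be sketched in a scratch repository as follows. The file names tracked.txt and untracked.txt and the identity values are placeholders; the script is only meant to illustrate the order of the steps.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "a" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

# 1. Change a tracked file, create an untracked one, and check the state.
echo "b" >> tracked.txt
echo "scratch" > untracked.txt
git status --short

# 2. Stage only tracked files that changed; untracked.txt is left alone.
git add -u
staged="$(git diff --cached --name-only)"
echo "staged: $staged"

# 3. Commit what is staged.
git commit -q -m "update tracked.txt"

# Make an unwanted edit, then discard it before committing.
echo "oops" >> tracked.txt
git checkout -- tracked.txt
cat tracked.txt
```

Here git diff --cached --name-only is used to list what is in the staging area, which is a convenient check before committing.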

The actual git repository can be found in a hidden directory named .git. You can see the content with the following command, but it is strongly advised not to modify any content in it by hand.

ls .git

Synchronizing with remote repo

The commands in the last section only update the local git repository. To share your work, you still need to upload the local changes to the remote repository hosted on e.g. GitHub; this is called push in git terminology.

git push origin <branch name>

Checking out a remote branch as a new local branch (run git fetch first if the branch was only recently created on the remote).

git checkout -b <branch> origin/<branch>

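Conversely, pushing a new local branch to the remote. The -u option records the remote branch as the upstream, so that later git push and git pull work without arguments on this branch.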

git push -u origin <branch>

Deleting a local branch that is no longer needed.

git branch -d <branch>
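
The following sketch runs through these commands without needing a GitHub account: a bare repository in a temporary directory stands in for the remote. This substitution is an assumption of the example, as are the path names, the branch name feature, and the identity values; with a real remote, only the URL given to git remote add would differ.

```shell
work="$(mktemp -d)"

# A bare repository plays the role of the remote (instead of GitHub).
git init -q --bare "$work/remote.git"

# Local repository with one commit on main, connected to the "remote".
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$work/remote.git"
echo "hello" > README.md
git add README.md
git commit -q -m "first commit"
git branch -M main
git push -q -u origin main

# Create a new local branch and push it; -u sets origin/feature as its upstream.
git checkout -q -b feature
echo "work" > feature.txt
git add feature.txt
git commit -q -m "add feature.txt"
git push -q -u origin feature

# Delete the local branch; its commits are still safe on the remote.
git checkout -q main
git branch -d feature

# Get the remote branch back as a new local branch.
git fetch -q origin
git checkout -q -b feature origin/feature
ls
```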

gitignore

In a git-managed directory, git looks at the differences in all files under the directory. Usually, however, some files should be tracked by git and others should not: for example, files specific to one’s local environment, large data files, .DS_Store files automatically generated by macOS, __pycache__ directories, and so on. These files should be ignored; otherwise the git status command shows a long list of files that should not be tracked, which makes it harder to find the files that actually changed.

This can be solved by setting up a so-called .gitignore file, which lists the files, directories, and name patterns that git should ignore. If your repository includes Python code, you can for example use the gitignore template for Python from GitHub. You can also set up a global gitignore file, which applies to all repositories on your computer (its location is set with the core.excludesFile configuration option).
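
As a small illustration, the script below writes a three-rule .gitignore into a scratch repository and checks its effect. The file names (analysis.py, big_data.bin, and so on) and the *.bin convention for large data files are assumptions made for the example, not a recommended template.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Files that should not be tracked.
mkdir __pycache__
touch __pycache__/module.pyc .DS_Store big_data.bin
# A file that should be tracked.
echo "print('hello')" > analysis.py

# Ignore rules: one pattern per line, "#" starts a comment.
cat > .gitignore <<'EOF'
# Python bytecode cache
__pycache__/
# macOS Finder metadata
.DS_Store
# large binary data (hypothetical naming convention)
*.bin
EOF

# Only .gitignore itself and analysis.py are reported as untracked.
untracked="$(git status --short)"
echo "$untracked"

# Ask git which rule causes a given path to be ignored.
git check-ignore -v big_data.bin
```

The .gitignore file itself is normally committed, so that everyone working on the repository shares the same rules.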

git config

Setting configuration in Git is important because it helps to personalize and optimize the behavior of Git for each user’s specific needs. This includes setting the user’s name and email for authoring commits, configuring the default text editor for commit messages, specifying aliases for frequently used commands, and more.

By setting configuration options, developers can streamline their workflow, reduce errors, and ensure that their commits are attributed correctly. It also helps to ensure consistency across different machines and environments, which is especially important when working collaboratively on a codebase.
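
A minimal sketch of the settings mentioned above follows. To keep the example from touching your real configuration, it writes to a scratch repository only; in everyday use you would typically add --global to each git config call so that the setting applies to all your repositories. The name, email, editor, and alias are placeholder values.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Identity recorded in every commit made from this repository.
# Without --global the value is stored in this repository's .git/config only.
git config user.name "Example User"
git config user.email "user@example.com"

# Editor opened for commit messages (vim is just an example choice).
git config core.editor "vim"

# Alias: "git st" now runs "git status".
git config alias.st "status"

# Read a single value back, then list everything set at repository level.
git config user.name
git config --local --list

# Use the alias.
git st
```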

Starting a new repo with GitHub

Humans easily forget things that they do not do on a daily basis. Here are the instructions that GitHub shows when we initialize a remote repo.

…or create a new repository on the command line

echo "# analysis" >> README.md
git init
git add README.md
git commit -m "first commit"
git branch -M main
git remote add origin https://github.com/HSCM31microlensing/analysis.git
git push -u origin main

…or push an existing repository from the command line

git remote add origin https://github.com/HSCM31microlensing/analysis.git
git branch -M main
git push -u origin main