GitHub Workflow

This semester we are using GitHub to distribute and collect your assignments. This guide walks you through obtaining the skeleton code, keeping track of your progress, submitting your assignment, and obtaining solutions.

SSH Keys for GitHub

To clone files from GitHub through SSH, you need to add an SSH key to your GitHub account. You can use the same key that you use to SSH into your VM, provided that you have set up SSH forwarding as detailed in the VM SSH guide.

Alternatively, you can generate a new key on your VM by following GitHub’s guide to generating a new key.

In either case, make sure to add the key to your account by following GitHub’s SSH guide.

Obtaining Skeleton Files

To obtain the skeleton files that we have provided for you, you need to clone your private repository to your local machine. Your GitHub Classroom repository page should have a link titled “clone or download” – copy the link and git clone from there.

You may then move into the homework directory and begin your work.

Keeping Track of Your Work

We encourage you to commit your changes incrementally so that you can keep track of your progress. If you have not used Git before, you should learn it quickly. Jae’s 3157 tutorial should get you started.

After you commit, you can push your changes to the master branch of your repository with the following:

$ git push origin master

With group assignments, we recommend that you push work to branches first, and then merge back into master once your group members have reviewed the code. For example, if you are working on one part of the assignment, you can create a branch separate from master by doing the following:

$ git checkout master
$ git checkout -b <branch-name>

You can then commit your changes, and push to the branch by doing the following:

$ git push origin <branch-name>

This will allow multiple members of the team to work on separate features in parallel. When the feature you are working on is complete, you may then create a pull request to allow your team members to review the code, and finally merge the changes back into master. You can read more about using branches and pull requests from GitHub’s own documentation.

Submission

To hand in your assignment, you will create and push one or more Git tags. Tags point to specific commits in a Git repo’s commit history. Usually they are used to mark release points in a project; in our case, we are using them to mark the completion of (a part of) an assignment. We will specify what you should name your tags in each assignment. For example, HW3 might ask you to push a tag named hw3handin. To create a tag, you should do the following:

$ git tag -a -m "Completed <homework/part>." <tag-name>
$ git push origin master
$ git push origin <tag-name>

You should verify that you can see your final commit and your <tag-name> tag on your GitHub repository page for this assignment.
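
The submission steps can also be checked from the command line. The following is a minimal sketch, not the course's grading procedure: it assumes a throwaway local bare repository standing in for your GitHub repo, and the file name hw3.c, the tag hw3handin, and the identity values are purely illustrative.

```shell
set -eu
# Illustrative identity so commits and tags work in a clean environment.
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the GitHub repo
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git remote add origin "$demo/remote.git"

echo "int main(void) { return 0; }" > hw3.c
git add hw3.c
git commit -q -m "Finish hw3"

# The three submission steps from this section.
git tag -a -m "Completed hw3." hw3handin
git push -q origin master
git push -q origin hw3handin

# Ask the remote which commit the tag resolves to and compare it with HEAD.
git ls-remote --tags origin
local_commit=$(git rev-parse HEAD)
remote_commit=$(git ls-remote --tags origin | grep -F 'refs/tags/hw3handin^{}' | cut -f1)
echo "local:  $local_commit"
echo "remote: $remote_commit"
```

If the two hashes match, the tag on the remote points at your final commit.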

To view your tags in the current repo, you can simply run

$ git tag

To view a specific tag, run

$ git show <tag-name>

If you made a mistake and wish to resubmit your assignment, you can do the following:

$ git push --delete origin <tag-name>
$ git tag --delete <tag-name>

You may then repeat the submission process. You are encouraged to resubmit as often as necessary to perfect your submission. As always, your submission should not contain any binary files.
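
As a rough illustration of a full resubmission cycle, the sketch below tags a first attempt, fixes it, deletes the old tag, and tags again. It assumes a temporary local bare repository in place of GitHub; answer.txt, hw3handin, and the identity values are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the GitHub repo
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$demo/remote.git"

# First submission.
echo "first attempt" > answer.txt
git add answer.txt
git commit -q -m "First attempt"
git tag -a -m "Completed hw3." hw3handin
git push -q origin master
git push -q origin hw3handin
first=$(git rev-parse HEAD)

# Fix the mistake in a new commit.
echo "second attempt" > answer.txt
git commit -q -a -m "Fix mistake"

# Delete the old tag on the remote and locally, then submit again.
git push -q --delete origin hw3handin
git tag --delete hw3handin
git tag -a -m "Completed hw3." hw3handin
git push -q origin master
git push -q origin hw3handin

resubmitted=$(git ls-remote --tags origin | grep -F 'refs/tags/hw3handin^{}' | cut -f1)
echo "tag now points at: $resubmitted"
```

The tag has to be deleted in both places because the local tag and the remote tag are separate references.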

Viewing Solutions

We will also be distributing solutions via GitHub. You will be added to each homework’s skeleton repo as a read-only collaborator, and some time after the homework deadline has passed, we will add a solutions branch to that skeleton repo. You may checkout that branch to view the solution.

The URL for the skeleton repo will simply be (with <num> substituted for the homework number):

https://github.com/columbia-os-hw/hw<num>

You can clone this repo, but it can be huge, especially for the later kernel assignments, and cloning it takes longer than necessary. Instead, you can add it to your own local repo as a remote. First, navigate to your own local repo, and run:

$ git remote add skel git@github.com:columbia-os-hw/hw<num>

This will add that URL as a remote named skel.

Remotes are named URLs from which you can fetch and push commits, tags, and branches. This is usually useful for workflows where you pull from one remote (for example, an upstream open source project), and push to another (for example, your own fork on GitHub, from which you can make pull requests). To list your remotes, run:

$ git remote -v

Now that skel is added as a remote, we can fetch the latest commits, and see what branches are on that remote with the following:

$ git fetch skel
$ git branch -a

The -a flag tells the branch command to list all branches, even those on remotes like skel. Once the solutions are released, you should see one called remotes/skel/solutions.

You can check out those solutions to view them with the following command:

$ git checkout remotes/skel/solutions

Now you can look around, modify, and build the solution code.
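
To see the remote workflow end to end without network access, here is a small sketch that uses two temporary local repositories. The directory skel stands in for the skeleton repo and mine for your own; hw.c and its contents are made up for the example.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)

# Stand-in skeleton repo with a master branch and a solutions branch.
git init -q "$demo/skel"
cd "$demo/skel"
git symbolic-ref HEAD refs/heads/master
echo "TODO" > hw.c
git add hw.c
git commit -q -m "Skeleton code"
git checkout -q -b solutions
echo "solved" > hw.c
git commit -q -a -m "Solutions"
git checkout -q master

# Stand-in for your own repo.
git init -q "$demo/mine"
cd "$demo/mine"
git symbolic-ref HEAD refs/heads/master
echo "my work" > hw.c
git add hw.c
git commit -q -m "My work"

# Add the skeleton repo as a remote, fetch it, and look at the solutions.
git remote add skel "$demo/skel"
git fetch -q skel
git branch -a
git checkout -q remotes/skel/solutions
cat hw.c    # prints "solved"
```

Checking out a remote-tracking branch this way leaves you in a detached HEAD state. That is fine for reading and building the code; if you want to commit on top of it, create a local branch first with git checkout -b.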

Enhancing Your Git Workflow

This section builds on the information presented in Keeping Track of Your Work and presents more advanced Git features and workflows.

.gitconfig

A .gitconfig file sets configuration values for Git. For example, in the .gitconfig file in your home directory, you may see the following:

[user]
	name = <your-user-name>
	email = <your-email>

Here, “name” and “email” are the variables, and “user” is the section they fall under. In addition to your name and email, your .gitconfig can also define other settings, such as merge tools or color themes.

We can also include Git aliases in our .gitconfig, which are shortcuts for commonly used Git commands.

Here are a few common aliases you may find helpful:

[alias]
	co = checkout
	br = branch
	ci = commit
	st = status
	unstage = reset HEAD --
	undo = reset HEAD~1 --mixed
	dc = diff --cached

The shorter variables on the left can now replace the longer commands on the right. For example, to check out a branch, you can just run:

$ git co <branch-name>

If you want these aliases to apply across repositories, you would need to put these aliases in the global .gitconfig file in your home directory. If you want them to be repository-specific, then you would edit the .git/config file found in the root of your repository.

To edit your .gitconfig, you can open the config file with your favorite text editor, or you can run the following command:

$ git config <mode> <section>.<variable> <variable-value>

The mode here can either be --global or --local.

For example, running the following command defines a global alias for git checkout:

$ git config --global alias.co checkout
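
If you would rather experiment without touching your global configuration, a sketch like the following defines aliases with --local in a throwaway repository and then reads them back. The temporary directory and the file notes.txt are assumptions made for the example.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .

# --local writes to .git/config, so these aliases exist only in this repo.
git config --local alias.st status
git config --local alias.unstage "reset HEAD --"

# Read the values back.
git config --local --get alias.st
git config --local --get-regexp '^alias\.'

# Use the alias like any other Git command.
touch notes.txt
git st --short

# The same settings, as they appear in the config file.
cat .git/config
```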

.gitignore

A .gitignore file specifies untracked files that Git should ignore. Specifically, it contains a list of patterns. For example, a .gitignore with *.o would indicate that Git should ignore all files ending in .o. Note that the patterns have a well-defined format with many features. You can read more about this format in Git’s documentation.

You should be very precise about what you put in your .gitignore file: Only list files that you are sure you do not want to track and commit. A file listed in .gitignore will not appear when you run git status and will not be added to the staging area when you run git add .. To stage an ignored file, you would need to use the -f flag to force its staging.
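
The behavior described above can be tried in a scratch repository. This is only a sketch; main.c and main.o are placeholder file names.

```shell
set -eu
cd "$(mktemp -d)"
git init -q .

printf '*.o\n' > .gitignore
touch main.c main.o

# main.o is missing from the status output because it matches *.o.
git status --short

# check-ignore reports which pattern in which file matched.
git check-ignore -v main.o

# "git add ." skips the ignored file.
git add .
before_force=$(git ls-files)
echo "$before_force"

# -f forces the ignored file into the staging area.
git add -f main.o
git ls-files
```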

.gitignore files can also be local or global. Local .gitignores can be located anywhere within a Git repo. The files specified in a local .gitignore will only be ignored when running Git within the repository. If a local .gitignore is put in a subdirectory of a Git repo, its rules will apply only to files in that subdirectory.

A global .gitignore specifies patterns to be ignored across all repositories. You can configure a global .gitignore file by running the following command:

$ git config --global core.excludesfile <path-to-global-gitignore-file>

Note that the files in a local .gitignore should be project-specific. In contrast, the files in a global .gitignore should be user-specific. In your global .gitignore, you may include files that your OS or text editor generates, so you wouldn’t have to append them to every repository’s .gitignore.

For example, the Linux kernel’s .gitignore ignores *.o files and other compilation byproducts, but doesn’t include text editor-generated files, such as Vim swap files, since the text editor used is user-specific.

This separation of concerns allows for much cleaner .gitignore files – otherwise, every repo’s .gitignore would need to contain entries corresponding to every OS or text editor.

You can learn more about .gitignore files by following GitHub’s guide.

Git branches

Our assignments have many parts, with each part implementing features on top of the previous parts. If you want to implement a new feature while leaving the old version intact, the best way to do so is to use branches.

If you want to create a new branch, first check out the existing branch you want to base your new branch off of.

$ git checkout <base-branch-name>

Then, to create and check out a new branch, run:

$ git checkout -b <new-branch-name>

Now you are working off this new branch. Any commits you make will be based on this branch, and will not be reflected in other branches.

To switch between branches, use git checkout:

$ git checkout <branch-name>

After you create branches for different parts of an assignment, you may need to merge branches if you made changes to an older branch and would like to propagate these changes to a newer branch. You may first check out the newer branch that you want to merge the changes into, and then merge with the older branch.

For example, if you made changes to your hw1part1 branch and would like to propagate these changes to your hw1part2 branch, you should first check out the newer branch (hw1part2) and then merge the older branch (hw1part1) into the newer branch:

$ git checkout hw1part2
$ git merge hw1part1

Note that when you attempt to merge, Git may report a merge conflict if your branches have conflicting changes.

In this case, open each file with merge conflicts, resolve the conflicts, and then stage and commit the files. Check out this tutorial on how to fix merge conflicts.
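
Here is one way a conflicting merge might play out, sketched in a temporary repository. The branch names follow the example above; config.txt and its contents are invented so that both branches change the same line.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/hw1part1
echo "timeout = 10" > config.txt
git add config.txt
git commit -q -m "Part 1"

git checkout -q -b hw1part2
echo "timeout = 20" > config.txt
git commit -q -a -m "Part 2 changes the timeout"

git checkout -q hw1part1
echo "timeout = 30" > config.txt
git commit -q -a -m "Part 1 fix changes the timeout too"

# Both branches changed the same line, so the merge stops with a conflict.
git checkout -q hw1part2
git merge hw1part1 || echo "merge reported a conflict"
git status --short

# Resolve by writing the content you want, then stage and commit.
echo "timeout = 30" > config.txt
git add config.txt
git commit -q -m "Merge hw1part1 into hw1part2"
git log --oneline --graph
```

In a real conflict you would edit the file by hand and remove the conflict markers; overwriting the file here just keeps the example short.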

Pull requests

You can also propagate changes between branches via a GitHub pull request, which is a request to merge changes from one branch into another.

To open a pull request on GitHub, navigate to the target repository, click on the Pull Requests tab at the top, and click on New pull request. GitHub will then ask for a base branch and a compare branch. Here, the base branch is the branch you want to merge the changes into, and the compare branch is the branch in which you have made the changes.

After making the pull request, you and your collaborators can see the files changed and the commits made. You can also make comments on the changes. At this point, nothing has been merged.

When you and your collaborators are ready to merge the branches, you can scroll to the bottom of the pull request and choose Merge pull request.

git cherry-pick

When working with branches, one command you may find helpful is git cherry-pick. git cherry-pick allows you to take a commit from one branch and apply it to another.

For example, if you fixed an issue from part 1 while working in the hw1part2 branch, and would like to propagate it to the hw1part1 branch, you can do the following to apply the specific commit from hw1part2 to hw1part1 without merging all the changes:

First, make sure you are on the hw1part2 branch, which contains the commit you want:

$ git checkout hw1part2

Use git log to find the commit you want, and copy its commit id:

$ git log

Go to the branch you want to apply the commit to and cherry-pick the commit:

$ git checkout hw1part1
$ git cherry-pick <commit-id> 

After cherry-picking, you also may need to resolve merge conflicts in the branch you have applied the commit to.
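
The whole sequence can be rehearsed in a temporary repository. In this sketch the files part1.txt and part2.txt are hypothetical, and the commit id is captured with git rev-parse instead of being copied from git log by hand.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/hw1part1
echo "part 1" > part1.txt
git add part1.txt
git commit -q -m "Part 1"

# Part 2 adds new work and also fixes a part 1 bug in a separate commit.
git checkout -q -b hw1part2
echo "part 2" > part2.txt
git add part2.txt
git commit -q -m "Start part 2"
echo "part 1, fixed" > part1.txt
git commit -q -a -m "Fix part 1 bug"
fix=$(git rev-parse HEAD)

# Apply only the fix to hw1part1.
git checkout -q hw1part1
git cherry-pick "$fix"

cat part1.txt    # prints "part 1, fixed"
ls               # part2.txt is not here: only the fix was applied
```

Keeping the fix in its own commit is what makes this clean. A commit that mixed the fix with part 2 work would bring the part 2 changes along too.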

git cherry-pick is a very powerful tool. You can read its documentation to learn more about its functionality.

git stash

git stash saves your local changes and reverts your working directory to match your most recent commit. This is especially useful when you want to merge two branches or switch to another branch, since Git refuses to do either if it would overwrite your uncommitted changes. git stash is also useful if you want to temporarily set aside your local modifications and restore them later.

Suppose you have uncommitted changes in your current working directory, but you would like to switch branches without committing your code. You can do the following:

First stash your changes in your current working directory:

$ git stash

Now, running git status will show no uncommitted changes, so you can go ahead and switch branches. When you’re ready to switch back, you can see a list of your stashed changes by running:

$ git stash list

The stashes will be named something like stash@{<some-number>}. Pick the stash you want, and run:

$ git stash apply <stash-name>

To apply the most recent stash, you can just run:

$ git stash apply

Or you can run:

$ git stash pop

which applies the most recent stash and removes it from the stash list.
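
A short sketch of the stash round trip, run in a temporary repository; notes.txt and the branch name other are placeholders.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch other

# Make an uncommitted change, then stash it.
echo "work in progress" >> notes.txt
git stash
git status --short     # prints nothing: the working directory is clean

# Switch away and back again.
git checkout -q other
git checkout -q master

# Restore the change and drop it from the stash list.
git stash list
git stash pop
cat notes.txt
```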

git add --patch/-p

By default, git add stages all of the changes within the file(s) specified. With the --patch/-p option, Git lets you look at each change and decide whether you want to stage it. This allows you to make clean, focused commits, which is a very important aspect of working in a large codebase and working with collaborators.

This guide goes more in depth on how to use this option. In this guide, they use the git add -i command and select the option p, which is the equivalent of git add --patch/-p.

--patch offers a lot of options for what to do with each hunk of changes, including allowing you to split up or edit hunks of your changes before staging them. This may come with a bit of a learning curve, but once you remember a few of the options, you will find this command to be a very handy tool.

git revert

The git revert command undoes a commit in the repository’s Git history, and creates a new commit that records the undone changes. After you run git revert on a specific commit, the reverted commit will still exist in the Git history, but the state of your files will no longer reflect changes made from the reverted commit.

This guide goes into more depth on this command.
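
As a small illustration, the following sketch commits a bad change in a temporary repository and then reverts it. The file app.txt and the commit messages are hypothetical; --no-edit accepts the default revert message so that no editor opens.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
echo "good" > app.txt
git add app.txt
git commit -q -m "Add app"

echo "bad change" >> app.txt
git commit -q -a -m "Introduce a bug"

# Undo the most recent commit by adding a new commit on top of it.
git revert --no-edit HEAD

git log --oneline    # three commits: the original, the bug, and the revert
cat app.txt          # prints "good"
```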

git rebase

Rebasing is an advanced Git technique that allows you to copy a sequence of commits from one “base” to another. It can be used to artificially create a linear Git history or to clean up a messy commit history after the fact. Because it rewrites the commit history, it is also frequently used as an alternative to merging branches.

git rebase is a very powerful operation and can be potentially destructive. As such, you should only use it once you understand how it works. As a starting point, we recommend reading this guide and this guide from the Git documentation.
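
A throwaway repository is a safe place to watch a rebase happen before trying it on real work. This sketch assumes a branch named feature that diverged from master; all file names are invented.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q master
echo "more" > more.txt
git add more.txt
git commit -q -m "More work on master"

# Replay the feature commit on top of the current master.
git checkout -q feature
before=$(git rev-parse HEAD)
git rebase -q master
after=$(git rev-parse HEAD)

git log --oneline --graph    # a straight line, with no merge commit
echo "feature commit before rebase: $before"
echo "feature commit after rebase:  $after"
```

The commit id changes because the rebased commit is a new copy with a different parent. This is why rebasing commits that others have already pulled can cause trouble.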

Backing Up Your Repos

Some time after the end of each semester, we will remove all of your repos (we will send an announcement telling you when this will take place). Before this happens, you may want to back up your repo, so that you have a copy of your hard work!

Backup using GitHub

Since all of the assignment repos we have created for you are private, you should not be able to fork them as you would with a public GitHub repo. Instead, you may import the repo into your own personal GitHub account, using GitHub’s Importer tool:

You should just use your team repo’s URL as the clone URL. For example, if you are importing HW7 and your team is team1-bulbasaur, you should specify: https://github.com/cs4118-hw/s25-hw7-team1-bulbasaur.
Make sure you import your work as a Private repo. It is your responsibility to ensure that your work is not publicly accessible, and to maintain the integrity of this course and its course materials. Please refer to our academic honesty policy for more details.
When it asks for your old project’s credentials, just put in your GitHub login details.
Importing each repo usually takes a while, but will continue to make progress even after you leave GitHub/close the web page. GitHub will send you an email once it has finished.
If you’re importing multiple repos, you should be able to launch multiple import jobs in parallel.

N.B.: As of January 2019, GitHub has announced that they will be offering unlimited private repos for even their free tier, so you shouldn’t run into issues with having only a limited number of private repos.

GitHub’s Importer Tool can be a bit sensitive, so if you run into any problems using it, you can also imitate the import process by manually creating a private, empty repository on your GitHub account and pushing to it:

Clone the repository and cd into the resulting folder. An existing clone works fine too, just make sure you’ve pulled all your changes. We’ll call this one the “source” repository.
Create a new private, empty repository on your GitHub account. Initialize as private, and without any README.md. We’ll call this one the “destination” repository.

In your clone of the source repository, add the destination repository as a new remote:

$ git remote add destination git@github.com:<your github username>/<destination repository name>.git
$ git remote -v # should now display 4 lines for 2 remotes: 2 for the existing "origin" and 2 for a new "destination"

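Then push your work to the destination repository. The command below pushes the main branch; substitute your branch name (for example, master) if it differs.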
```
$ git push destination main
```
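
The manual backup can be rehearsed locally first. The sketch below uses a temporary bare repository as a stand-in for the empty destination repo on GitHub; the branch name main, the tag hw7handin, and the README file are assumptions. It also pushes tags, on the assumption that you want to keep your handin tags, since pushing a branch does not send tags by default.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo Student" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo Student" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/destination.git"   # stand-in for the new empty repo

# Stand-in for your existing clone of the source repository.
git init -q "$demo/source"
cd "$demo/source"
git symbolic-ref HEAD refs/heads/main
echo "hw7" > README
git add README
git commit -q -m "HW7 work"
git tag -a -m "Completed hw7." hw7handin

# Add the destination as a remote and push the branch, then the tags.
git remote add destination "$demo/destination.git"
git remote -v
git push -q destination main
git push -q destination --tags

# List what the destination now holds.
git ls-remote destination
```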