This semester we are using GitHub to distribute and collect your assignments. This guide walks you through obtaining the skeleton code, keeping track of your progress, submitting your assignment, and obtaining solutions.
To clone files from GitHub over SSH, you need to add an SSH key to your GitHub account. You can use the same key that you use to SSH into your VM, provided that you have set up SSH forwarding as detailed in the VM SSH guide.
You can also generate a new key on your VM by following GitHub’s guide to generating a new key.
In either case, make sure to add the key to your account following GitHub’s SSH guide.
To obtain the skeleton files that we have provided for you, you need to clone your private repository to your local machine. Your repository page should have a link titled “clone or download” – copy the link from there.
For an individual assignment, you may do the following:
$ git clone git@github.com:columbia-os-hw/hw<num>-<github-handle>.git
For a group assignment, you may do the following:
$ git clone git@github.com:columbia-os-hw/hw<num>-<team-num>-<team-name>.git
You may then move into the homework directory and begin your work.
We encourage you to commit your changes incrementally to keep track of your progress. If you have not used Git before, you should learn it quickly; Jae’s 3157 tutorial should get you started.
After you commit, you can push your changes to the master branch of your repository with the following:
$ git push origin master
With group assignments, we recommend that you push work to branches first, and then merge back into master once your group members have reviewed the code. For example, suppose that you are working on one part of the assignment. You can create a branch separate from master by doing the following:
$ git checkout master
$ git checkout -b <branch-name>
You can then commit your changes, and push to the branch by doing the following:
$ git push origin <branch-name>
This will allow multiple members of the team to work on separate features in parallel. When the feature you are working on is complete, you may then create a pull request to allow your team members to review the code, and finally merge the changes back into master. You can read more about using branches and pull requests from GitHub’s own documentation.
To hand in your assignment, you will create and push one or more Git tags. Tags point to specific commits in a Git repo’s commit history. Usually they are used to mark release points in a project; in our case, we are using them to mark the completion of (a part of) an assignment. We will specify what you should name your tags in each assignment. For example, HW3 might ask you to push a tag named hw3handin. To create a tag, you should do the following:
$ git tag -a -m "Completed <homework/part>." <tag-name>
$ git push origin master
$ git push origin <tag-name>
You should verify that you can see your final commit and your <tag-name> tag on your GitHub repository page for this assignment.
To view your tags in the current repo, you can simply run
$ git tag
To view a specific tag, run
$ git show <tag-name>
If you made a mistake and wish to resubmit your assignment, you can do the following:
$ git push --delete origin <tag-name>
$ git tag --delete <tag-name>
You may then repeat the submission process. You are encouraged to resubmit as often as necessary to perfect your submission. As always, your submission should not contain any binary files.
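As a rough illustration of submitting and then resubmitting, the sketch below runs the same tag commands against a local bare repository used as a stand-in for GitHub. The tag name hw3handin comes from the example above; the file name, commit messages, and "Demo User" identity are placeholders.

```shell
set -e
demo=$(mktemp -d)                        # throwaway sandbox
git init -q --bare "$demo/origin.git"    # local stand-in for the GitHub repo
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$demo/origin.git"

echo "part 1" > answer.txt
git add answer.txt
git commit -q -m "Finish part 1"

# Submit: tag the current commit, then push the branch and the tag
git tag -a -m "Completed hw3." hw3handin
git push -q origin master
git push -q origin hw3handin

# Resubmit: commit the fix, delete the old tag remotely and locally, re-tag
echo "part 1, corrected" > answer.txt
git commit -q -a -m "Fix part 1"
git push -q --delete origin hw3handin
git tag --delete hw3handin
git tag -a -m "Completed hw3." hw3handin
git push -q origin master
git push -q origin hw3handin

git ls-remote --tags origin              # the tag now points at the new commit
```

Note that the tag is recreated only after the corrected commit exists, since a tag marks whichever commit is checked out when it is created.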
We will also be distributing solutions via GitHub. You will be added to each homework’s skeleton repo as a read-only collaborator, and some time after the homework deadline has passed, we will add a solutions branch to that skeleton repo. You may check out that branch to view the solution.
The URL for the skeleton repo will simply be (with <num> replaced by the homework number):
https://github.com/columbia-os-hw/hw<num>
You can clone this repo, but the repos for the later kernel assignments in particular can be huge, so cloning may take longer than necessary. Instead, you can add it to your own local repo as a remote. First, navigate to your own local repo, and run:
$ git remote add skel git@github.com:columbia-os-hw/hw<num>
This will add that URL as a remote named skel.
Remotes are named URLs from which you can fetch and push commits, tags, and branches. This is usually useful for workflows where you pull from one remote (for example, an upstream open source project), and push to another (for example, your own fork on GitHub, from which you can make pull requests). To list your remotes, run:
$ git remote -v
Now that skel is added as a remote, we can fetch the latest commits, and see what branches are on that remote with the following:
$ git fetch skel
$ git branch -a
The -a flag tells the branch command to list all branches, even those on remotes like skel. Once the solutions are released, you should see one called remotes/skel/solutions.
You can check out those solutions to view them with the following command:
$ git checkout remotes/skel/solutions
Now you can look around, modify, and build the solution code.
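To see how the skel remote behaves without touching GitHub, here is a small local sketch. It assumes a made-up skeleton repo (skel-src) that already has a solutions branch; the file contents and the "Demo Staff" identity are placeholders, and your real repo would of course have its own commits.

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox

# Stand-in for the skeleton repo, with a solutions branch already added
git init -q "$demo/skel-src"
cd "$demo/skel-src"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Staff"          # placeholder identity
git config user.email "staff@example.com"
echo "skeleton" > main.c
git add main.c
git commit -q -m "Skeleton"
git checkout -q -b solutions
echo "solution" > main.c
git commit -q -a -m "Solution"
git checkout -q master

# Stand-in for your own local repo: add the skeleton as a remote and fetch
git init -q "$demo/work"
cd "$demo/work"
git remote add skel "$demo/skel-src"
git remote -v
git fetch -q skel
git branch -a                              # includes remotes/skel/solutions
git checkout -q remotes/skel/solutions     # detached HEAD at the solution
cat main.c
```

Checking out a remote-tracking branch this way leaves you on a detached HEAD, which is fine for reading and building the solution.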
This section builds on the information presented in Keeping Track of Your Work and presents more advanced Git features and workflows.
A .gitconfig file sets configuration values for Git. For example, in the .gitconfig file in your home directory, you may see the following:
[user]
name = <your-user-name>
email = <your-email>
Here, “name” and “email” are the variables, and “user” is the subsection they fall under. In addition to your name and email, your .gitconfig can also define other settings, such as merge tools or color themes.
We can also include Git aliases in our .gitconfig, which are shortcuts for commonly used Git commands.
Here are a few common aliases you may find helpful:
[alias]
co = checkout
br = branch
ci = commit
st = status
unstage = reset HEAD --
undo = reset HEAD~1 --mixed
dc = diff --cached
The shorter variables on the left can now replace the longer commands on the right. For example, to check out a branch, you can just run:
$ git co <branch-name>
If you want these aliases to apply across repositories, you would need to put them in the global .gitconfig file in your home directory. If you want them to be repository-specific, then you would edit the .git/config file found in the root of your repository.
To edit your .gitconfig, you can open the config file with your favorite text editor, or you can run the following command:
$ git config <mode> <subsection>.<variable> <variable-value>
The mode here can be either --global or --local.
For example, running the following command defines a global alias for git checkout:
$ git config --global alias.co checkout
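The same command works with --local if you would rather experiment without changing your global settings. A minimal sketch, assuming a scratch repository in a temporary directory:

```shell
set -e
demo=$(mktemp -d)                      # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"

# --local writes to this repository's .git/config only
git config --local alias.st status
git config --local alias.br branch

git config --local --get alias.st      # prints: status
git st --short                         # same as: git status --short
cat .git/config                        # an [alias] section now appears here
```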
A .gitignore file specifies untracked files that Git should ignore. Specifically, it contains a list of patterns. For example, a .gitignore with *.o would indicate that Git should ignore all files ending in .o. Note that the patterns have a well-defined format with many features. You can read more about this format in Git’s documentation.
You should be very precise about what you put in your .gitignore file: only list files that you are sure you do not want to track and commit. A file listed in .gitignore will not appear when you run git status and will not be added to the staging area when you run git add . (with a trailing dot). To stage an ignored file, you would need to use the -f flag to force its staging.
.gitignore files can also be local or global. Local .gitignore files can be located anywhere within a Git repo. The patterns in a local .gitignore only take effect within that repository. If a local .gitignore is put in a subdirectory of a Git repo, its rules will apply only to files in that subdirectory.
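The behavior described above can be checked in a scratch repository. This sketch assumes two placeholder files, main.c and main.o, and uses git check-ignore to show which pattern matched:

```shell
set -e
demo=$(mktemp -d)              # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"

echo '*.o' > .gitignore
touch main.c main.o

git status --short             # lists .gitignore and main.c, but not main.o
git check-ignore -v main.o     # reports the .gitignore line that matched

git add .                      # stages .gitignore and main.c only
git add -f main.o              # -f forces the ignored file into the index
git status --short             # all three files are now staged
```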
A global .gitignore specifies patterns to be ignored across all repositories. You can configure a global .gitignore file by running the following command:
$ git config --global core.excludesfile <path-to-global-gitignore-file>
Note that the patterns in a local .gitignore should be project-specific. In contrast, the patterns in a global .gitignore should be user-specific. In your global .gitignore, you may include files that your OS or text editor generates, so that you don’t have to add them to every repository’s .gitignore.
For example, the Linux kernel’s .gitignore ignores *.o files and other compilation byproducts, but doesn’t include text editor-generated files, such as Vim swap files, since the text editor used is user-specific.
This separation of concerns allows for much cleaner .gitignore files; otherwise, every repo’s .gitignore would need to contain entries corresponding to every OS or text editor.
You can learn more about .gitignore files by following GitHub’s guide.
Our assignments have many parts, with each part implementing features on top of the previous parts. If you want to implement a new feature while leaving the old version intact, the best way to do so is to use branches.
If you want to create a new branch, first check out the existing branch you want to base your new branch off of.
$ git checkout <base-branch-name>
Then, to create and check out a new branch, run:
$ git checkout -b <new-branch-name>
Now you are working off this new branch. Any commits you make will be based on this branch, and will not be reflected in other branches.
To switch between branches, use git checkout:
$ git checkout <branch-name>
After you create branches for different parts of an assignment, you may need to merge branches if you made changes to an older branch and would like to propagate these changes to a newer branch. You may first check out the newer branch that you want to merge the changes into, and then merge with the older branch.
For example, if you made changes to your hw1part1 branch and would like to propagate these changes to your hw1part2 branch, you should first check out the newer branch (hw1part2) and then merge the older branch (hw1part1) into the newer branch:
$ git checkout hw1part2
$ git merge hw1part1
Note that when you attempt to merge, Git may alert you of a merge conflict if your branches have conflicting changes.
In this case, open the file with the merge conflicts, fix said conflicts, and stage and commit the file again. Check out this tutorial on how to fix merge conflicts.
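For a conflict-free case, the merge described above can be reproduced end to end in a scratch repository. This is only a sketch: the branch names follow the example in the text, while the file names, commit messages, and "Demo User" identity are placeholders.

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/hw1part1  # start on the older branch
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "part 1" > part1.txt
git add part1.txt
git commit -q -m "Part 1"

# Newer branch built on top of part 1
git checkout -q -b hw1part2
echo "part 2" > part2.txt
git add part2.txt
git commit -q -m "Part 2"

# Fix something on the older branch...
git checkout -q hw1part1
echo "part 1, fixed" > part1.txt
git commit -q -a -m "Fix part 1"

# ...then propagate the fix into the newer branch
git checkout -q hw1part2
git merge -q --no-edit hw1part1
cat part1.txt                              # shows the fixed contents
```

Because the two branches changed different files, Git can merge them without asking you to resolve anything.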
You can also propagate changes between branches via a GitHub pull request, which is a request to merge changes from one branch into another.
To open a pull request on GitHub, you would navigate to the target repository, click on the Pull Requests tab on top, and click on New pull request. GitHub will then ask for a base branch and a compare branch. Here, the base branch is the branch you want to merge the changes into, and the compare branch is the branch in which you have made the changes.
After making the pull request, you and your collaborators can see the files changed and the commits made. You can also make comments on the changes. At this point, nothing has been merged.
When you and your collaborators are ready to merge the branches, you can scroll to the bottom of the pull request and choose Merge pull request.
When working with branches, one command you may find helpful is git cherry-pick, which allows you to take a commit from one branch and apply it to another.
For example, if you fixed an issue from part 1 while working in the hw1part2 branch, and would like to propagate it to the hw1part1 branch, you can do the following to apply the specific commit from hw1part2 to hw1part1 without merging all the changes:
First, make sure you are on the hw1part2 branch, which contains the commit:
$ git checkout hw1part2
Use git log to find the commit you want, and copy its commit id:
$ git log
Go to the branch you want to apply the commit to and cherry-pick the commit:
$ git checkout hw1part1
$ git cherry-pick <commit-id>
After cherry-picking, you also may need to resolve merge conflicts in the branch you have applied the commit to.
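The steps above can be sketched in a scratch repository as follows. The branch names match the example; the files, commit messages, and "Demo User" identity are placeholders, and the commit id is captured with git rev-parse here rather than copied by hand from git log.

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/hw1part1
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "part 1" > part1.txt
git add part1.txt
git commit -q -m "Part 1"

git checkout -q -b hw1part2
echo "part 2" > part2.txt
git add part2.txt
git commit -q -m "Part 2"

# A part 1 bug fix that was committed on the part 2 branch
echo "part 1, fixed" > part1.txt
git commit -q -a -m "Fix part 1 bug"
fix=$(git rev-parse HEAD)                  # normally copied from git log

# Apply only that commit to hw1part1
git checkout -q hw1part1
git cherry-pick "$fix"
git log --oneline                          # Part 1, then the cherry-picked fix
ls                                         # part2.txt is not brought along
```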
git cherry-pick is a very powerful tool. You can read its documentation to learn more about its functionality.
git stash saves your local changes and reverts your working directory to your most recent commit. This can be especially useful when you want to merge two branches or switch to another branch, as Git may refuse either operation if it would overwrite your uncommitted changes. git stash is also useful if you want to temporarily get rid of your local modifications, but save them for later.
Suppose you have uncommitted changes in your current working directory, but you would like to switch branches without committing your code. You can do the following:
First stash your changes in your current working directory:
$ git stash
Now, running git status will show no uncommitted changes, so you can go ahead and switch branches. When you’re ready to switch back, you can see a list of your stashed changes by running:
$ git stash list
The stashes will be named something like stash@{<some-number>}. Pick the stash you want, and run:
$ git stash apply <stash-name>
To apply the most recent stash, you can just run:
$ git stash apply
Or you can run:
$ git stash pop
which applies the most recent stash and removes it from the stash list.
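Here is a small end-to-end sketch of that stash round trip in a scratch repository. The branch name other, the file notes.txt, and the "Demo User" identity are placeholders.

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git checkout -q -b other                   # hypothetical second branch
git checkout -q master

echo "work in progress" >> notes.txt       # uncommitted change
git stash                                  # working directory is clean again
git status --short                         # prints nothing

git checkout -q other                      # switch away and back freely
git checkout -q master

git stash list                             # one entry, stash@{0}
git stash pop                              # reapply it and drop it from the list
git status --short                         # notes.txt shows as modified again
```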
When using git add, you stage all of the changes within the file(s) specified. With the --patch (or -p) option, Git allows you to look at each change and decide whether you want to stage it. This allows you to make clean, focused commits, which is a very important aspect of working in a large codebase and working with collaborators.
This guide goes into more depth on how to use this option. It uses the git add -i command and selects the option p, which is equivalent to git add --patch (or -p).
--patch offers a lot of options for what to do with each hunk of changes, including allowing you to split up or edit hunks of your changes before staging them. This may come with a bit of a learning curve, but once you remember a few of the options, you will find this command to be a very handy tool.
The git revert command undoes a commit in the repository’s Git history, and creates a new commit that records the undone changes. After you run git revert on a specific commit, the reverted commit will still exist in the Git history, but the state of your files will no longer reflect changes made from the reverted commit.
This guide goes into more depth on this command.
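A minimal sketch of a revert in a scratch repository, assuming a placeholder file and commit messages, shows that the history grows rather than shrinks:

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "good" > file.txt
git add file.txt
git commit -q -m "Good change"

echo "bad" >> file.txt
git commit -q -a -m "Bad change"

# Undo the most recent commit by adding a new commit that reverses it
git revert --no-edit HEAD

git log --oneline                          # three commits, including the revert
cat file.txt                               # back to the contents before "Bad change"
```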
Rebasing is an advanced Git technique that allows you to copy a sequence of commits from one “base” to another. It can be used to artificially create a linear Git history or to clean up a messy commit history after the fact, and it is frequently used as an alternative to merging branches, due to its ability to rewrite the commit history.
git rebase is a very powerful operation and can be potentially destructive. As such, you should only use it once you understand how it works. As a starting point, we recommend reading this guide and this guide from the Git documentation.
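As a safe way to experiment, the sketch below rebases a branch in a scratch repository, where nothing of value can be lost. It covers only the simplest case (moving one commit onto an updated master); the branch name feature, the files, and the "Demo User" identity are placeholders.

```shell
set -e
demo=$(mktemp -d)                          # throwaway sandbox
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b feature                 # hypothetical branch name
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q master
echo "more" > master.txt
git add master.txt
git commit -q -m "More master work"

# Replay the feature commit on top of the updated master
git checkout -q feature
git rebase -q master

git log --oneline                          # linear history, no merge commit
```

The rebased "Feature work" commit has a new commit id, which is why rebasing commits that others have already pulled can cause trouble.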
Some time after the end of each semester, we will remove all of your repos (we will tell you when this will take place via the listserv). Before this happens, you may want to back up your repo, so that you have a copy of your hard work!
Since all of the assignment repos we have created for you are private, you should not be able to fork them as you would with a public GitHub repo. Instead, you may import the repo into your own personal GitHub account, using GitHub’s Importer tool:
You should just use your team repo’s URL as the clone URL. For example, if you are importing HW7 and your team is 001-bulbasaur, you should specify: https://github.com/cs4118-hw/hw7-001-bulbasaur
Make sure you import your work as a Private repo. It is your responsibility to ensure that your work is not publicly accessible, and to maintain the integrity of this course and its course materials. Please refer to our academic honesty policy for more details.
When it asks for your old project’s credentials, just put in your GitHub login details.
Importing each repo usually takes a while, but will continue to make progress even after you leave GitHub/close the web page. GitHub will send you an email once it has finished.
If you’re importing multiple repos, you should be able to launch multiple import jobs in parallel.
N.B.: As of January 2019, GitHub has announced that they will be offering unlimited private repos for even their free tier, so you shouldn’t run into issues with having only a limited number of private repos.