Git basics (command line)
Essentially, all version control does is store snapshots of your work in time, and keep track of the parent-child relationships between them.
You can think of your current set of working files as simply the child of the last node in this chain (that is, your files are the children of the most recent set of files known to the version control system).
git
provides a large number of tools for manipulating this history.
We’ll only touch on a few, but the number that you need to know for
day-to-day use is actually quite small.
We’re going to need some terminology:
- Repository: “Repo” for short; this is a copy of the history of your project. It is stored in a hidden directory within your project. People will talk about “cloning a repo” or “adding things to a repo”; these all manipulate the history in some way.
- Working directory: This is your copy of a project. It’s just a
directory with files and other directories in it. The repository is
contained within a
.git
directory at the root directory of your project.
Creating a repository
If you want to create a repository from the command line, use the command
git init
which will print a message saying that an empty Git repository was initialised, along with the path to its .git directory.
I have deleted the .git
directory in the vc
project, and
re-initialised an empty git repository there.
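As a minimal sketch of this step, the commands below create a repository in a scratch directory and then list its contents, so that the hidden .git directory is visible. The scratch location made with mktemp is an assumption for illustration only; in practice you would run git init inside your own project directory.

```shell
# Hypothetical scratch location, so that no real project is touched.
repo="$(mktemp -d)/vc"
mkdir -p "$repo"
cd "$repo"

# Create the repository; git reports where it put the .git directory.
git init

# The history lives in the hidden .git directory at the project root.
ls -a
```

Deleting the .git directory removes the whole history and leaves only the working files, which is why it could be deleted and re-initialised as described above.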
The add-commit cycle, revisited
We will use a few commands.
The first is git status
. This tells you the status of all the files
in your project that are not up to date. At the moment, its output is
essentially the same information that RStudio shows.
The command git add
does (essentially) the same thing as clicking
the “Staged” checkbox in RStudio. After staging script.R with git add, git status
lists all of the things that we are going to commit
(script.R
) and the files that git does not know about (.gitignore
and vc.Rproj
). The command git commit
does the actual addition.
The -m
option passes in a message for the commit.
Running git commit this way prints a short summary of the new commit,
which is essentially the same information that RStudio showed after committing.
We can add the other files (.gitignore and vc.Rproj) in the same way: stage them with git add, then commit them with git commit.
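The whole add-commit cycle can be sketched in a scratch repository as follows. The file contents, the commit messages, and the name and email passed to git config are placeholders made up for illustration, and vc.Rproj is left out to keep the sketch short.

```shell
# Scratch repository with a placeholder identity (local to this repo only).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Placeholder file contents, invented for this sketch.
echo 'x <- 1' > script.R
echo '.Rhistory' > .gitignore

git status --short            # both files are reported as untracked
git add script.R              # stage script.R only
git status --short            # script.R is staged, .gitignore still untracked
git commit -q -m "Added script.R."

git add .gitignore            # stage and commit the remaining file
git commit -q -m "Added .gitignore."
git status --short            # prints nothing: everything is up to date
```

The --short flag gives a compact one-line-per-file view; plain git status reports the same information in a longer form.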
To see the history, run git log, which prints the hash, author, date and message of each commit, most recent first.
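Here is a small sketch of viewing the history, again in a scratch repository with placeholder files, messages and identity (none of these come from the vc project itself). It shows the full log and the more compact --oneline form.

```shell
# Scratch repository with two placeholder commits.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'x <- 1' > script.R
git add script.R
git commit -q -m "Added script.R."

echo '.Rhistory' > .gitignore
git add .gitignore
git commit -q -m "Added .gitignore."

# Full history, newest commit first: hash, author, date and message.
git --no-pager log

# Compact view: abbreviated hash and the first line of each message.
git --no-pager log --oneline
```

The hashes you see will differ from anyone else's, because they depend on the author, the date and the content of each commit.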
What is going on with those crazy strings of numbers?
You may have noticed the long strings of letters and numbers that git log prints for each commit.
These are called “hashes”; think of them as a fingerprint of a file, or of a commit. Git uses them everywhere, so these get used where you would otherwise use “version1”, or “final”, etc.
The nice thing about them is that they depend on the entire history of
a project, so you know that your history is secure. For example, I
deleted the full stop at the end of the first commit message
(don’t ask me how) and reran
git log
.
You might expect that the hash for the first commit would change, but notice that it has changed a lot for just a one-character difference. Also notice that the second commit has a new hash too; this is because one of the “things” in the second commit is a pointer back to the first commit indicating who its parent is.
Confused? Don’t worry. All you need to know is that the hash identifies your entire project including its history, and that if anything changes in the project, the hashes will change. This is great because it allows us to use the big ugly strings of letters and numbers as a shortcut for a very precise set of information.
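If you want to see this for yourself, the sketch below changes one character of a commit message and compares the hashes, then inspects a child commit to show the pointer to its parent. Using git commit --amend to edit the message is an assumption chosen for simplicity, not necessarily how the edit above was made, and the files, messages and identity are placeholders. Here the message is amended before the second commit exists, so only the parent pointer is shown rather than a child hash changing.

```shell
# Scratch repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'x <- 1' > script.R
git add script.R
git commit -q -m "Added script.R."
before="$(git rev-parse HEAD)"

# Rewrite only the message (drop the full stop); the files are untouched.
git commit -q --amend -m "Added script.R"
after="$(git rev-parse HEAD)"

echo "before: $before"
echo "after:  $after"

# Add a child commit and print its raw contents: the "parent" line
# holds the hash of the commit it was built on.
echo '.Rhistory' > .gitignore
git add .gitignore
git commit -q -m "Added .gitignore."
git cat-file -p HEAD
parent="$(git rev-parse HEAD~1)"
```

Because the child stores its parent's hash, any change to the parent forces a new hash for the child as well, and so on down the chain.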
What changed?
The other thing that we could do in RStudio is see the lines of code that changed. This is incredibly useful, and once you start thinking with version control you’ll constantly look to see what has changed. The confidence that you can always go back is what makes version control empowering.
Suppose we change the script.R
file again. Then git status
reports that the file has changed, listing it as modified.
The command git diff
shows the difference between the contents of the
working directory and the changes that are staged to be committed. So with
nothing staged, this is the difference between the files in the
directory and the last revision. Running git diff
reports the changed lines, with removed lines prefixed by - and added lines prefixed by +.
If we “stage” the file with git add script.R
and rerun git diff
, there is no output. The command git status
now lists the file under the changes to be committed, indicating that script.R
will be added when we do git commit
. You
can review what would be committed line-by-line by running
git diff --cached
, which compares the contents of the staged changes with the previous version.
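The difference between the two forms of git diff can be sketched in a scratch repository like this. The file contents, commit message and identity are placeholders invented for illustration.

```shell
# Scratch repository with one placeholder commit.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'x <- 1' > script.R
git add script.R
git commit -q -m "Added script.R."

# Change the file without staging it.
echo 'y <- 2' >> script.R
git status --short             # script.R is reported as modified
git --no-pager diff            # working directory vs. staged: shows the new line

# Stage the change.
git add script.R
git --no-pager diff            # no output: nothing is left unstaged
git --no-pager diff --cached   # staged vs. last commit: shows the new line
```

A useful habit is to run git diff --cached just before git commit, so that you review exactly what is about to go into the history.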