I recently decided that I should probably write more. GitHub Pages is a good place to store what I write.
What Is GitHub Pages
GitHub Pages is a service provided by GitHub to host… pages. It is
a great way to add a website to a project hosted on GitHub, and also
quite simple to do by using git. The idea is that you create a
dedicated branch in your project repository, called gh-pages, and
put the website there. GitHub is then smart enough to take the contents
from that branch and expose them on the Internet at the right address.
If your nickname on GitHub is mynick, and the project is called
myproject, then:
- the project will be at https://github.com/mynick/myproject
- its pages will be at http://mynick.github.io/myproject
For example, my repository for the Potrace Perl bindings has:
- project repository at address https://github.com/polettix/Graphics-Potrace
- its associated page(s) at address http://polettix.github.io/Graphics-Potrace
Project or User/Organization?
What is written above is fine for projects hosted on GitHub. As a matter of fact, there is also a standard way to have similar pages for a user or an organization.
There is a slight inconsistency in how the thing is handled though:
- it still relies on a GitHub project - good
- the GitHub project MUST have a specific name, e.g. mynick.github.io - still good
- the pages are hosted in the master branch instead of gh-pages - this is a bummer!
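The two naming rules seen so far (project pages and user/organization pages) can be summed up in a small shell sketch. The function name pages_url is made up for illustration, and it assumes the default github.io addresses described above, with no custom domain involved.

```shell
# Hypothetical helper (not part of git or GitHub): print the address where
# GitHub Pages serves a repository, following the rules described above.
pages_url() {
    nick=$1
    project=$2
    if [ "$project" = "$nick.github.io" ]; then
        # user/organization site: served at the root, from the master branch
        printf 'http://%s.github.io/\n' "$nick"
    else
        # project site: served under the project name, from the gh-pages branch
        printf 'http://%s.github.io/%s\n' "$nick" "$project"
    fi
}

pages_url mynick myproject
pages_url mynick mynick.github.io
```

The first call covers the project case, the second one the specially named repository for a user or an organization.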
Blog?
With the tools above - especially considering the pages for a user or an organization - it is easy to think about hosting a blog on GitHub. We will assume that it is hosted as a project by itself, not the one for the user/organization above (although you can easily tweak the instructions below to make it happen).
The basic idea is that keeping a blog’s pages is too cumbersome to be done manually. You will probably want to provide a consistent look, with all headers, navigation, sidebars, footers… all the bells and whistles.
One of the best approaches to take is to use some blog generation system - we’ll use Jekyll here - so that we can concentrate on writing the stuff, and let the system do the heavy lifting to generate the final pages. Hence, it makes sense to consider the blog from two points of view:
- the generating system, where you put your articles
- the final generated site
This fits perfectly with GitHub: you can keep the generating system as the project, and its associated GitHub Pages as the real blog that is served on the Internet.
Let’s Start!
I set up my blog infrastructure using Jekyll. After installing it, create your new blog like this:
jekyll new myblog
cd myblog
git init
git add .
git commit -m 'initial import'
Now you have your local repository for the blog. At this point, you
are ready to create a new repository on GitHub (let’s call it myblog,
under user mynick) and tie the two together:
git remote add origin git@github.com:mynick/myblog.git
git push -u origin master
It’s time to start generating pages at this point. Depending on how
you installed Jekyll, you might have to run it through bundle, which
is what we will assume here:
bundle exec jekyll build
Now the generated stuff will live inside the _site subdirectory. This
should already be listed in the .gitignore file that Jekyll generates
automatically; in case it is not, this is a good moment to add it.
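As a small sketch of that check, the following function adds _site to .gitignore only when it is missing. The function name ignore_site is invented here, and the sketch assumes it is run from the root directory of the blog repository; it only looks for a line that is exactly _site, so a variant such as _site/ would not be recognized.

```shell
# Hypothetical helper: make sure .gitignore contains a line "_site".
ignore_site() {
    touch .gitignore
    # Nothing to do if a line is exactly "_site" already
    if grep -qxF '_site' .gitignore; then
        return 0
    fi
    # If the file does not end with a newline, add one before appending
    if [ -s .gitignore ] && [ -n "$(tail -c 1 .gitignore)" ]; then
        echo >> .gitignore
    fi
    echo '_site' >> .gitignore
}

ignore_site    # run from the root directory of the blog repository
```

Running it more than once is harmless, because the second run finds the line and leaves the file alone.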
The suggestion is to keep the master and gh-pages branches really
separated from one another. Again, there might be many ways to do this,
I’m just providing you one here:
git checkout master
git checkout --orphan gh-pages
git rm -rf .
At this point you should still have the _site directory lying around,
and this is where the real contents of your site actually are. A basic
strategy can be to just copy the contents of that directory inside the
root directory of the repository:
tar cf - -C _site . | tar xvf -
git add .
git commit -m 'gh-pages initial import'
git push origin gh-pages:gh-pages
There you go, your blog is online!
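One detail that may be worth double-checking: git rm -rf . removes every tracked file, and that includes .gitignore. Without it, the following git add . also picks up the _site directory itself, next to the copied files. The sketch below replays the steps in a scratch repository with one extra, optional step that brings .gitignore back from master. The scratch directory, the file contents and the author details are all made up, and a plain mkdir stands in for the Jekyll build.

```shell
# Scratch repository standing in for the blog (all names are placeholders)
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is master
git config user.name 'Example Author'
git config user.email 'author@example.com'

printf '_site\n' > .gitignore
echo 'source page' > index.md
git add .
git commit -q -m 'initial import'

# Stand-in for "bundle exec jekyll build"
mkdir _site
echo '<p>generated page</p>' > _site/index.html

git checkout -q --orphan gh-pages
git rm -rfq .                              # this also deletes .gitignore
git checkout master -- .gitignore          # extra step: bring it back from master
tar cf - -C _site . | tar xf -
git add .
git commit -q -m 'gh-pages initial import'

git ls-files                               # lists .gitignore and index.html
```

With the extra step, the gh-pages branch ends up with the generated page and .gitignore only, while _site stays ignored.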
Routine Workflow
What’s the workflow from now on? You will normally work in the master
branch - we set all this up for this reason, actually - and will switch
to the gh-pages branch only when needed.
Adding posts or pages in Jekyll is quite easy and there is plenty of
documentation. When you’re done, make sure you are in the master
branch and that changes are committed, otherwise you will not be able
to switch to the gh-pages branch later on. It’s OK to have files that
are not yet tracked by git though, git will not complain about them.
At this point, you have to follow these steps:
bundle exec jekyll build
git checkout gh-pages
tar cf - -C _site . | tar xvf -
git add .
git commit -m "$(date '+blog status at %Y%m%d-%H%M%S')"
git push origin gh-pages:gh-pages
git checkout master
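The commit message in these steps is produced by date through command substitution. As a small sketch of how the shell splits such output into words depending on quoting (count_args is a made-up helper, not a git feature):

```shell
# Hypothetical helper: print how many separate arguments it received.
count_args() { echo "$#"; }

# Unquoted: the output is split at spaces, giving four arguments
count_args $(date '+blog status at %Y%m%d-%H%M%S')

# Quoted: the whole output stays together as one argument
count_args "$(date '+blog status at %Y%m%d-%H%M%S')"
```

With git commit -m, an unquoted substitution would hand over only the first word as the message and the remaining words as paths to commit, so the double quotes around $(...) are what keeps the message in one piece.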
The copy using tar is effective although not completely correct. In
particular, it will not take into consideration things that you delete,
because all items are added on top of what is already saved and committed.
In general this should not be a problem though, because you will mostly
be adding things, will you not?
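To see the limitation in isolation, here is a minimal sketch that uses two throwaway directories instead of git branches. The directory and file names are invented for the demonstration; the tar pipeline is the same one used in the steps above, with an explicit target directory.

```shell
work=$(mktemp -d)
mkdir "$work/_site" "$work/published"

# First build: two pages get copied
echo one > "$work/_site/first.html"
echo two > "$work/_site/second.html"
tar cf - -C "$work/_site" . | tar xf - -C "$work/published"

# Second build: second.html is gone from the generated site
rm "$work/_site/second.html"
tar cf - -C "$work/_site" . | tar xf - -C "$work/published"

ls "$work/published"    # second.html is still listed
```

The copy only ever adds or overwrites files, so a page removed from the generated site stays in the published copy until it is deleted by other means.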
A better strategy is to use git ls-files to list all files and remove
most of them before doing the copy with tar. We should not get rid of
all of them though, because some might be important for the generic
management of the pages (e.g. the .gitignore file). We will assume
that there are no file names containing spaces, so this will work:
bundle exec jekyll build
git checkout gh-pages
rm $(git ls-files | grep -v '^\.gitignore$')
tar cf - -C _site . | tar xvf -
git add .
git commit -m "$(date '+blog status at %Y%m%d-%H%M%S')"
git push origin gh-pages:gh-pages
git checkout master
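If file names with spaces cannot be ruled out, one possible variant of the removal step is sketched below. It relies on the -z option of git ls-files and the -0 option of xargs to pass NUL-separated names, and on an exclude pathspec to spare the top-level .gitignore. The function name clean_tracked is invented, and the sketch assumes it is run from the root of the repository while on the gh-pages branch.

```shell
# Hypothetical helper: remove every tracked file except the top-level
# .gitignore, passing names NUL-separated so that spaces are harmless.
clean_tracked() {
    git ls-files -z -- . ':(exclude).gitignore' | xargs -0 rm -f --
}
```

Under these assumptions, calling clean_tracked would take the place of the rm line in the steps above.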
I eventually put the commands above in a publish.sh file:
#!/bin/bash
# Location and names of this script (FULLME and BAREME are not used below)
MYDIR=$(dirname "$0")
FULLME=$(readlink -f "$0")
BAREME=$(basename "$0")

# Print a message on standard error and stop
die() {
    echo "$*" >&2
    exit 1
}

main() {
    # Move to the parent of the directory where this script lives
    cd "$MYDIR" || die "unable to go in $MYDIR"
    cd .. || die "unable to go in parent directory of $MYDIR"
    echo "in $PWD now"

    # Regenerate the site from master
    git checkout master || die 'unable to switch to master'
    bundle exec jekyll build || die "unable to update contents"

    # Copy the result into gh-pages and publish it
    git checkout gh-pages || die 'unable to switch to gh-pages'
    tar cf - -C _site . | tar xvf - \
        && git add . \
        && git commit -m "$(date '+update at %Y%m%d-%H%M%S')" \
        && git push origin gh-pages

    git checkout master || die 'unable to switch to master'
}

main
I’m not an expert on this, but it’s very probable that without resorting to
the trick of defining a function main and calling it, things might go
very wrong in the execution of the script, because the script lives
in the master branch but it might be unavailable in the gh-pages branch.
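A tiny sketch of the idea, assuming bash is available: the shell has to read a whole function definition before it can run it, so a function that is already running does not depend on the script file any more. The temporary script below is made up for the demonstration, and emptying the file is only a crude stand-in for what a branch switch might do to it.

```shell
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
main() {
    : > "$0"    # empty the script file while it is running
    echo "still running"
}
main "$@"
EOF

output=$(bash "$demo")
echo "$output"    # prints: still running
rm -f "$demo"
```

Everything that has to survive the change must be inside main, and the call to main should be the last thing in the file.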