Search engine optimization with git web interfaces

I recently became frustrated with gitweb’s funky query strings and decided to give cgit a try. Although there are some patches that make gitweb friendlier to users (and search engines), cgit is a much better web interface for git, both in terms of the code and the actual user experience. However, it still left some room for search engine optimization.

I went through the HTML suggestions from Google Webmaster Tools and Google’s own SEO Starter Guide. I’ve pushed the search-engine-optimized cgit to my seo branch on GitHub. You can see it in action at my git repositories. I’m testing all of this using an Apache ScriptAlias directive; I’m hoping it will still work with whatever other URL-processing schemes cgit supports. A short summary of the new SEO features so far:

  • Use HTML h1 and h2 heading tags instead of custom-styled divs
  • Much better title tags; commits have the commit subject, and the repo name has been added in a lot of places to avoid duplicate titles
  • The breadcrumb has been integrated into the heading
  • A configurable option to set nofollow relationships on links to non-HEAD commits, to avoid duplicate content being indexed
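
As a rough illustration of the title and nofollow items above, here is a minimal Python sketch. It is not cgit's code (cgit is written in C); the function names, the " - " separator and the nofollow_non_head flag are all made up for the example.

```python
from html import escape


def page_title(repo, subject=None):
    """Build a <title> that leads with the commit subject, then the repo name.

    Including the repo name keeps titles distinct across repositories.
    """
    parts = [subject, repo] if subject else [repo]
    return "<title>%s</title>" % escape(" - ".join(parts))


def commit_link(href, text, commit_id, head_id, nofollow_non_head=True):
    """Render a link, adding rel="nofollow" when the target is not HEAD.

    nofollow_non_head stands in for the configurable option (hypothetical name).
    """
    non_head = commit_id != head_id
    rel = ' rel="nofollow"' if nofollow_non_head and non_head else ""
    return '<a href="%s"%s>%s</a>' % (escape(href, quote=True), rel, escape(text))
```

With the flag switched off, every link is rendered without a rel attribute, which matches the "configurable" part of the last bullet.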

Of course, you could take the popular option of just using GitHub instead of self-hosting your own git web interfaces… but even they don’t do quite as good a job IMO: they use the SHA1 in the web page titles, eww!

iTerm+blur: updated; thanks git

Update: The patch has been merged into upstream CVS. Yay!

I’ve merged the current CVS HEAD (sub-minor version 0821 apparently) of iTerm with my +blur branch. You can download the new binary.

In doing so, I found that there’s a new maintainer, James Bunton, responsible for most of the changes. He’s got a Mercurial tree, but it’s not up to date with SourceForge CVS. Welcome to distributed VC hell, where CVS is being used as a central repository between patchset-based DVCSs. There are some importers, but if the canonical repository is CVS, importing from the Mercurial tree is strictly more work (since I’d still need to import the CVS updates, and merge again).

cvsps and git-cvsimport do a good job of making it reasonably painless to work with CVS repositories from git. But I had some issues.

Perhaps it was just me failing, but I couldn’t coax git-cvsimport to import into my remotes (yes, using -r). It would also be nice if git pull could figure out that a remote was CVS and run git-cvsimport for me, using cached parameters. As it is, I have one repository for running git-cvsimport, and my working repository that pulls from that.

Git seems remarkably unhelpful when it comes time to do manual merging. An “ours”-type strategy for hunks consisting of just CVS $Id$ keywords would be nice. Why isn’t there a simple command to run a 3-way merge with an arbitrary merge(1)-compatible invocation? I discovered smartmerge too late, but surely manual merges are common enough that getting 3 temporary files together could be done in one base command?
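
For what it's worth, the "3 temporary files" step can be sketched in a few lines of Python. This is only an illustration under some assumptions: it relies on git's index stage syntax (`git show :1:path` for the base, `:2:` for ours, `:3:` for theirs), and the helper names are invented here.

```python
import os
import subprocess
import tempfile

# Index stages git records for a conflicted path.
STAGES = {"base": 1, "ours": 2, "theirs": 3}


def stage_specs(path):
    """Map each side of the conflict to its index spec, e.g. ':2:src/main.c'."""
    return {side: ":%d:%s" % (num, path) for side, num in STAGES.items()}


def merge_argv(files, tool="merge"):
    """Argument order used by merge(1): current file, base, other.

    merge(1) writes the result back into the first file.
    """
    return [tool, files["ours"], files["base"], files["theirs"]]


def extract_stages(path, workdir=None):
    """Write the three stages of `path` to temporary files via `git show`.

    Raises CalledProcessError if a stage is missing (for example, no base
    when both sides added the file).
    """
    files = {}
    for side, spec in stage_specs(path).items():
        blob = subprocess.run(
            ["git", "show", spec], cwd=workdir, check=True,
            stdout=subprocess.PIPE,
        ).stdout
        fd, name = tempfile.mkstemp(
            prefix=side + "-", suffix="-" + os.path.basename(path))
        with os.fdopen(fd, "wb") as handle:
            handle.write(blob)
        files[side] = name
    return files
```

Running `subprocess.call(merge_argv(extract_stages("some/file.c")))` inside a conflicted repository would then hand the three files to merge(1), or to any tool that accepts the same argument order via the `tool` parameter.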

There’s a persistent feeling when using git that I’m doing it wrong, or at least that there’s an easier way if only I could remember the command. I ran git status | grep unmerged because I couldn’t remember that I wanted git ls-files -u. I have to re-read git-rebase’s documentation every time I use it.
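
For scripting, git ls-files -u is the friendlier of the two, because each line has a fixed shape: mode, object name and stage number, then a tab, then the path. A small Python sketch of turning that into a list of conflicted files follows; the function name is made up, and it assumes plain output without -z, so paths containing unusual characters would show up in git's quoted form.

```python
def unmerged_paths(ls_files_output):
    """Return the unique paths listed by `git ls-files -u`, in first-seen order.

    Each input line looks like '<mode> <object> <stage>\\t<path>'; a conflicted
    file normally appears once per stage, so duplicates are dropped.
    """
    paths = []
    for line in ls_files_output.splitlines():
        if "\t" not in line:
            continue  # skip blank or unexpected lines
        _meta, path = line.split("\t", 1)
        if path not in paths:
            paths.append(path)
    return paths
```

The text passed in can be captured with `subprocess.run(["git", "ls-files", "-u"], stdout=subprocess.PIPE, text=True).stdout`.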

Update: the ident attribute deals neatly with CVS $Id$ keywords.
Update: the git mergetool command can be used instead of smartmerge.
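
To show why collapsing the keyword makes those $Id$ hunks go away: on check-in, the ident attribute replaces anything from "$Id:" up to the next "$" with a bare "$Id$". The Python sketch below mimics that idea for comparison purposes only; it is not git's implementation, and the function names are invented.

```python
import re

# An expanded keyword: "$Id:" followed by anything up to the closing "$".
ID_KEYWORD = re.compile(r"\$Id:[^$\n]*\$")


def collapse_id_keywords(text):
    """Rewrite every expanded '$Id: ... $' back to the bare '$Id$' form."""
    return ID_KEYWORD.sub("$Id$", text)


def differs_only_in_id(ours, theirs):
    """True when two versions differ, but only inside $Id$ keywords."""
    if ours == theirs:
        return False
    return collapse_id_keywords(ours) == collapse_id_keywords(theirs)
```

Two versions for which differs_only_in_id returns True are exactly the ones where keeping "ours" loses nothing.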