Conventional assessment of route difficulty for rock climbing is a subjective process. A small number of people (often just one) assign a grade to a particular route, and there isn’t really a process for refining grades once they’ve been assigned (it’s just one opinion vs another). Most grading systems use an ordinal scale, which means you can rank the grades in order, but the difference or ratio between two grades isn’t meaningful. Intentional biases are even part of climbing culture.
To address these shortcomings, I developed a statistical model for grading rock climbing routes. The difficulty of a climbing route and the performance of a climber on a particular day are described by numerical ratings. The difference in ratings between a climber and a route determines the probability the climber will ascend the route “successfully”. For modern sport climbing, success loosely means getting to the top without weighting a rope or other mechanical devices. The climbing model is based on a dynamic Bradley-Terry model, which is a common model for game and sports rating systems such as Elo and Glicko-2.
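As a rough illustration of how a rating difference maps to a success probability, here is a minimal sketch of the basic (static) Bradley-Terry formula. It assumes ratings on a natural-log scale, and the function name `ascent_probability` is hypothetical; the actual project may use a different scale or parameterization.

```python
import math


def ascent_probability(climber_rating: float, route_rating: float) -> float:
    """Bradley-Terry probability that the climber ascends the route.

    Assumption: both ratings are on a natural-log scale, so only their
    difference matters and the result is a logistic function of it.
    """
    return 1.0 / (1.0 + math.exp(route_rating - climber_rating))


if __name__ == "__main__":
    # An evenly matched climber and route give a 50% chance of success.
    print(ascent_probability(0.0, 0.0))
    # A climber rated one unit above the route succeeds more often than not.
    print(ascent_probability(1.0, 0.0))
```

Because only the difference in ratings enters the formula, adding the same constant to every climber and route rating leaves all predictions unchanged.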
While the statistical model provides a theory for predicting ascent outcomes from rating parameters, it’s not useful in practice without a process for estimating the parameters (individual ratings for climbers and routes) and hyperparameters (generalizations that are independent of individual climbers or routes, e.g. how hard the “average” route is, and how quickly climbers can improve). So I implemented an algorithm for estimating the parameters, based on the Whole-History Rating (WHR) algorithm. WHR is a fast algorithm that uses second-order (Newton-Raphson) optimization to find the ratings for climbers and routes that are most probable given the observed set of ascents and the prior (known as the maximum a posteriori estimates). I used machine learning methods to choose the hyperparameters. The implementation is available as a free, open-source software package at the Climbing Ratings project on GitHub.
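To make the Newton-Raphson idea concrete, here is a heavily simplified sketch that finds the maximum a posteriori rating for a single climber, holding the route ratings fixed. It is not the WHR algorithm and not the project's implementation: it ignores how ratings change over time, and the zero-mean Gaussian prior, the `prior_variance` parameter, and the function name `map_rating` are all assumptions made for illustration.

```python
import math


def map_rating(route_ratings, successes, prior_variance=1.0, iterations=20):
    """Newton-Raphson MAP estimate of one climber's rating (illustrative).

    Assumptions: route ratings are known and fixed, ratings are on a
    natural-log scale, and the climber's rating has a zero-mean Gaussian
    prior with the given variance.
    """
    r = 0.0  # start at the prior mean
    for _ in range(iterations):
        # First and second derivatives of the log-prior term.
        gradient = -r / prior_variance
        hessian = -1.0 / prior_variance
        for route, success in zip(route_ratings, successes):
            # Bradley-Terry probability of success on this route.
            p = 1.0 / (1.0 + math.exp(route - r))
            gradient += (1.0 if success else 0.0) - p
            hessian -= p * (1.0 - p)
        # Newton-Raphson update; the Hessian is always negative here.
        step = gradient / hessian
        r -= step
        if abs(step) < 1e-12:
            break
    return r


if __name__ == "__main__":
    # Two successes and one failure on routes of increasing difficulty.
    print(map_rating([-1.0, 0.0, 1.0], [True, True, False]))
```

The prior keeps the estimate finite even when a climber has only successes or only failures, which a pure maximum-likelihood fit would not.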