In my previous post, we found a clear difference in the fractional split paces of male and female runners.
The question I want to address now is whether the split times provide a good way to predict the gender of a runner.
A preliminary data analysis like the one presented before suggests that the splits at the 5, 10, 15, 20, 25, 30, 35 and 40K checkpoints, together with the total time, provide enough information to attempt a regression problem where the target variable is binary (1 for a male runner and 0 for a female runner), which is in effect a classification problem.
I started with the entire set of runners in the NYC 2011 marathon, randomly sampling a training set and a test set, and tried a few algorithms. Here I will present the results with k-NN.
k-NN:
Two heads are better than one and k heads are even better than one.
k-NN relies on the idea that, locally, the best decision is the one the majority of your friends agree on. Many assumptions are needed to get to this claim, but it seems very reasonable. One important thing to consider here is the meaning of "close friends": how do you determine who your closest friends are? This question clearly goes beyond the geographical sense of the word, and it requires a different way to measure distances, that is, a metric.
In the case at hand, the input data are the split times and the labels (1 for male and 0 for female).
Several metrics were implemented, among them Euclidean, Manhattan and dot product.
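As a rough illustration of the majority vote described above, here is a minimal sketch in plain Python. It is not the code used for the results in this post: the function names, the choice of k and the toy split times are all made up for the example, and only the Euclidean and Manhattan metrics are shown.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two vectors of split times."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences between two vectors of split times."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, metric=euclidean):
    """Return the majority label among the k training rows closest to query."""
    # Sort the training rows by distance to the query and keep the k closest.
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda row: metric(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up toy rows (NOT real marathon data): (5K split, 10K split) in minutes.
train_X = [(22.0, 44.5), (23.0, 46.0), (24.0, 48.5),
           (28.0, 57.0), (29.0, 58.5), (30.0, 61.0)]
train_y = [1, 1, 1, 0, 0, 0]  # 1 for male, 0 for female, as in the post

print(knn_predict(train_X, train_y, (22.5, 45.0), k=3))
print(knn_predict(train_X, train_y, (29.5, 60.0), k=3, metric=manhattan))
```

With two labels, an odd k avoids tied votes. Swapping the `metric` argument is all it takes to compare different notions of "close friends".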
Thursday, May 22, 2014
Wednesday, May 21, 2014
Keeping a constant pace
Some people might consider running "boring". But there is some strategy involved. And the pace is the key parameter to consider.
In an event like a marathon, keeping a constant pace is definitely a challenge and those who manage to maintain the same pace throughout the whole race seem to perform better.
In most cases the first half of the race is run faster than the second. This motivates a measure of asymmetry: the difference between the times of the first half and the second half, divided by the total time. For instance, if the pace is constant the asymmetry factor is zero. Two extreme cases can be considered: the entire race time is spent in the first half, or the other way around.
With that definition, most of the athletes will have positive values of the asymmetry factor while the elite runners will swarm around zero.
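To make the definition concrete, here is a small Python sketch. The sign convention (second half minus first half) is my assumption, chosen so that a slower second half gives a positive value, which matches the statement that most athletes have positive values. The function name and the example times are hypothetical.

```python
def asym_factor(half_time, total_time):
    """Asymmetry factor: (second half - first half) / total time.

    half_time is the time at the half-marathon mark and total_time the
    finish time, both in the same unit (minutes, for example).
    """
    if total_time <= 0 or not 0 <= half_time <= total_time:
        raise ValueError("need 0 <= half_time <= total_time and total_time > 0")
    second_half = total_time - half_time
    return (second_half - half_time) / total_time


# Constant pace: both halves take the same time.
print(asym_factor(100.0, 200.0))
# Slower second half (110 min versus 90 min): positive value.
print(asym_factor(90.0, 200.0))
```

Under this convention the two extreme cases give -1 (all the time spent in the first half) and +1 (all the time spent in the second half).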
In the figure above, the values of the asymmetry factor are shown for male and female athletes in the NYC Marathon 2011.
There is also a plateau in the AsymFactor near 5h, which suggests that beyond that finish time all the athletes have a bad second half regardless of the time.
The evolution of the pace over a race can be tracked using the checkpoints at 5, 10, 15, 20, 25, 30, 35 and 40 km. The ratio of the pace at any given checkpoint to the average pace, denoted Fractional Pace, can provide a lot of information about the performance.
Below is the distribution of the fractional pace for male and female athletes at several checkpoints.
It is interesting to note the two-peak structure in the distribution for 5km. The narrow peak, corresponding to paces closer to the mean value, is typical of elite athletes.
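The fractional pace can be sketched in Python as follows. This is one possible reading of the definition, not necessarily the exact computation behind the plots: I assume the pace "at a checkpoint" is the pace over the 5 km segment ending at that checkpoint, computed from cumulative split times. The function name and the example runner are hypothetical.

```python
CHECKPOINTS_KM = [5, 10, 15, 20, 25, 30, 35, 40]
MARATHON_KM = 42.195


def fractional_paces(split_times, total_time):
    """Segment pace at each checkpoint divided by the average race pace.

    split_times are cumulative times at the checkpoints and total_time is
    the finish time, all in the same unit (minutes, for example).
    """
    avg_pace = total_time / MARATHON_KM
    fractions = []
    prev_km, prev_time = 0.0, 0.0
    for km, time in zip(CHECKPOINTS_KM, split_times):
        # Pace over the segment that ends at this checkpoint.
        segment_pace = (time - prev_time) / (km - prev_km)
        fractions.append(segment_pace / avg_pace)
        prev_km, prev_time = km, time
    return fractions


# Made-up runner holding exactly 5 min/km for the whole race.
even_splits = [5.0 * km for km in CHECKPOINTS_KM]
print(fractional_paces(even_splits, 5.0 * MARATHON_KM))
```

A value above 1 means the segment was run slower than the runner's average pace, and a value below 1 means it was run faster.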
The distribution for 35km shows a displacement towards larger fractional paces.
The peaks of the distribution for men and women are shifted. In the case of women, at 35K the pace is somewhat faster than the mean value during the race.
In conclusion, as the race evolves, the ability to maintain a constant pace seems to be a determining factor in the performance of the runner. The best strategy is then to choose a pace that allows the runner to run evenly throughout the race.


