Monday, April 11, 2016

Show Series Stats - Math Alert!

Ok, so here's a stat post about the 6-show series that Penn and I completed!

First, some basic and straightforward information about the entire series. There were:

Here's a nifty Excel chart of all of the exact scores from all 6 shows. The number of occurrences of each score is on the y-axis, and the scores themselves are on the x-axis. I think it's slightly visually misleading/difficult to look at.

Purple - Mean
Red - 1 Standard Deviation
Green - 2 Standard Deviations
Blue - 3 Standard Deviations

I think we're looking more for the bell curve itself, if it exists (which I think it does- most horses should be satisfactory, some will be worse, and some will be better). All of the "single count" dots are kind of meaningless without an even bigger data set that gives them a count of more than 1 occurrence, otherwise you're just looking at the dot density.

The normal curve for those of you who are not mathaholics.
A hump in the middle with two tails that taper off on either side.
Each side will have 50% of the data, which will be distributed in the same way on both sides.

Enter the rounded chart, where every score got rounded to the nearest whole percent. Not the best method, but my sample set isn't big enough to go to one decimal place (because that's where we were starting anyway!).

Lines are the same as the above chart.

For those interested in the exact values of the standard deviation lines:

For those not familiar with standard deviation, here's a quote stolen from Wikipedia:
In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or s) is a measure that is used to quantify the amount of variation or dispersion of a set of data values. A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.
And for those not familiar with variance, here's a quote also stolen from Wikipedia:
In probability theory and statistics, variance measures how far a set of numbers are spread out. A variance of zero indicates that all the values are identical. Variance is always non-negative: a small variance indicates that the data points tend to be very close to the mean (expected value) and hence to each other, while a high variance indicates that the data points are very spread out around the mean and from each other.

In a normal distribution (bell curve), 68% of the data should fall within one standard deviation of the mean, 95% within two standard deviations, 99.7% within three. Out of the 479 tests, only 3 fell outside the three standard deviations, which means 99.37% of the data is within three standard deviations of the mean (our data: 71.61% within 1 SD, 95.62% within 2 SD, 99.37% within 3 SD).
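For anyone who wants to run the same 68/95/99.7 check on their own scores, here's a minimal Python sketch. It is not the spreadsheet behind the charts in this post: the function name `sd_coverage` and the example scores are made up for illustration, and it assumes the population standard deviation (swap in `statistics.stdev` if you prefer the sample version).

```python
from statistics import mean, pstdev


def sd_coverage(scores, max_k=3):
    """Return {k: percent of scores within k standard deviations of the mean}."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population SD, not sample SD
    coverage = {}
    for k in range(1, max_k + 1):
        inside = sum(1 for s in scores if abs(s - mu) <= k * sigma)
        coverage[k] = round(100 * inside / len(scores), 2)
    return coverage


# Made-up scores for illustration only, NOT the real show data.
example_scores = [48.0, 55.5, 58.2, 60.0, 60.9, 61.4, 62.3, 63.0, 64.1, 66.7, 72.5]
print(sd_coverage(example_scores))

# The series figure quoted above: 3 of the 479 tests fell outside 3 SD.
print(round(100 * (479 - 3) / 479, 2))  # 99.37
```

Compare the three percentages it prints against 68, 95, and 99.7 to see how far a data set strays from the textbook bell curve.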

In a normal distribution, the mean (average), median (middle data point), and mode (most common data point) should be the same. Using our rounded values, it looks like the mean is 60.3%, median is 61.0%, mode is 63.0%. Perhaps our show data isn't normally distributed (more on that below).
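The mean/median/mode comparison on rounded scores can be sketched the same way. The helper name `center_summary` and the sample numbers are hypothetical. One assumption to flag: it rounds halves up (60.5 becomes 61), because Python's built-in `round` would send 60.5 to 60.

```python
import math
from statistics import mean, median, multimode


def center_summary(scores):
    """Round each score to a whole percent, then report mean, median, and mode(s)."""
    # Assumption: halves round up; built-in round() uses banker's rounding instead.
    rounded = [math.floor(s + 0.5) for s in scores]
    return {
        "mean": round(mean(rounded), 1),
        "median": median(rounded),
        "modes": multimode(rounded),  # a list, since ties for "most common" are possible
    }


# Made-up scores for illustration only, NOT the real show data.
print(center_summary([59.6, 60.4, 61.2, 63.0, 62.9]))
```

If the three values land far apart, that's a hint the data is lopsided rather than symmetric.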

Variance was smallest at the first show- that isn't surprising since that show had 9 ties and one 3-way tie throughout the day.

Variance was biggest at the 3/6 show (by a lot). The range of scores that day almost matches the range for the entire series, just within a smaller data set. This show was also the one where there were a ton of scores in the 40s, many more than there should have been. While I think that judge was generous at times, I also think she was quite harsh. Throughout the series, there were 18 scores below 50. 12 of them happened at this show, and all were at the Intro/Training level. Only one horse in the series was consistently in the upper 40s and low 50s, and I watched him go on numerous occasions- he would fling himself across the arena, spin, not halt, not hold his gait, and being inverted was the least of his problems. I don't think Intro and Training levels should be scored that low unless they're doing what this horse was doing... especially at schooling shows.

Our data isn't perfect since it is a small data set, but I thought it would be neat to check out the standard deviation and variance. Since I went to the trouble of checking all of that, I decided to use a Normal Probability Plot to see how close to the normal curve (bell curve) our data is (the exact data, not rounded):

All of the test scores were "normalized" (the y-axis values), and each one was also converted to a ranked probability value between 0 and 1 for the x-axis, based on a nifty help sheet I found online. The red line shows a perfectly linear progression, aka the normal curve (bell curve) when changed to the same probability values.
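Here's a rough Python sketch of the ingredients for a normal probability plot. It uses one common construction, which may not match the help sheet used for the chart above: each sorted score gets a ranked probability of (rank - 0.5) / n, that probability is turned into the z-score a normal distribution would predict, and the score's own z-score is computed for comparison. The function name and the sample scores are made up.

```python
from statistics import NormalDist, mean, pstdev


def normal_probability_points(scores):
    """Return (ranked probability, expected z, observed z) for each sorted score."""
    n = len(scores)
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population SD; scores must not all be equal
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, score in enumerate(sorted(scores), start=1):
        p = (rank - 0.5) / n                # ranked probability, strictly between 0 and 1
        expected_z = std_normal.inv_cdf(p)  # where a normal distribution puts this rank
        observed_z = (score - mu) / sigma   # the "normalized" score
        points.append((p, expected_z, observed_z))
    return points


# Made-up scores for illustration only, NOT the real show data.
for p, expected_z, observed_z in normal_probability_points([52.0, 58.5, 60.0, 61.5, 63.0, 70.0]):
    print(f"{p:.3f}  expected z = {expected_z:+.2f}  observed z = {observed_z:+.2f}")
```

Plotting observed z against expected z gives a straight diagonal line when the data is close to normal, so any bend away from that line is the interesting part.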

Our data has quite the strong s-shape, but the x-intercept is at 0.4991. I was concerned I didn't have a big enough data set, well oops! There are plenty of dots that show the normal distribution is NOT a good fit for our data! This leads me to believe that dressage score data, with thousands of scores to look at, is probably not normally distributed.

Why? Well, if you're scoring 70%+, you either had a) a fantastic day or b) are probably going to move on to the next level. If you're scoring less than 50%, something terrible happened and you either a) had a rough day or b) are not ready for the level. We just don't see many scores at the mathematical extremes, so according to math, our data is very heavy in the middle. Our curve would be very tall around 60%, with short tails on both sides (meaning, there would be very few scores at any extreme which causes our "bell curve" to taper off to the x-axis very quickly).

Maybe if we cut off the tails, the normal distribution would be a good fit for the very middle of our data- maybe from 0.25 to 0.75 on the graph above since that section of the graph follows a fairly straight line.

Anyway, this firms up my belief that the judge at the 3/6 show was a little too happy to award low scores. "Terrible" happened to too many people without the other extreme happening (an 80 for every 40).

One last mathy concept for the series- I wanted to compare Penn's performance to all of the other scores at each show:

Quartile charts! At 4 of the 6 shows, all of his scores were in the top 25% of the awarded scores, with two scores being the high score of the day (11/29/2015 and 4/3/2016). The 11/1/2015 show was with only 3 months of consistent riding and training and very little show experience. And we all know that I majorly screwed us up at the 3/6/2016 show.
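To place a single score into a quartile for one show, something like the sketch below would work. It assumes the "inclusive" quartile method in Python's `statistics.quantiles`, which may or may not match the formula behind the charts above, and the function name and scores are hypothetical.

```python
from statistics import quantiles


def quartile_of(score, show_scores):
    """Return 1 (bottom 25%) through 4 (top 25%) for a score within one show."""
    # Three cut points split the day's scores into four groups.
    cuts = quantiles(show_scores, n=4, method="inclusive")
    # Count how many cut points the score clears.
    return 1 + sum(score > cut for cut in cuts)


# Made-up scores for one show, for illustration only.
show = [50.0, 55.0, 60.0, 65.0, 70.0]
print(quartile_of(68.0, show))  # 4, i.e. top 25% of the day
```

A score sitting exactly on a cut point falls into the lower group here. That is a design choice, not a rule.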

Anyway, that's enough math about the series! Let's look at Penn's math!

I made up a Centerline Scores equivalent in Excel that has Lifetime, Schooling Show, and Recognized Show score breakdowns. Schooling Shows tend to be more generous, so I wanted to be able to see all three breakdowns (I can later evaluate how different they really are :-p). Right now, we only have schooling data for Penn, so here we go, 7 schooling shows' worth of data:

Some good numbers there. He'll never ride Intro or Training Level again (with me anyway), so those stats will always look like that.

Average scores for each movement across all of Penn's training level tests.

Alright, discussion time.

  • The walk work is one of Penn's strong points. Hello 17-test free walk average of 6.91, I'm looking at you right now.
  • Trot is good for Penn, with the stretchy trot getting an average of 6.68 in 11 tests.
  • The canter to the right has been Penn's better side, as evidenced by the canter averages.
  • The left lead canter work was severely brought down by the Training 1 left lead canter work. That was always a horrible part of the test, and the canter down the long wall after the circle was always wavy and awful. Training 1's left lead canter circle (with transition) averaged a 5.5 and the working canter down the long wall after averaged a 5.83 in the 6 tests we did. Austen, I believe you asked why Training 1 seemed to be our worst test- I think this is part of the answer! Nervous Penn doesn't canter well, and this is the first canter of the day!
  • The trot one loops were my fault, not Penn's. The weird arena size made them very difficult to ride accurately, and then the judges seemed to want a change of bend in them as well. I couldn't deliver both at the same time!
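The per-movement averages discussed above are easy to reproduce outside a spreadsheet. Here's a small Python sketch. The layout (one dictionary of movement name to mark per test) and the marks themselves are assumptions for illustration, not Penn's actual score sheets.

```python
from collections import defaultdict
from statistics import mean


def movement_averages(tests):
    """Return {movement: (average mark, number of tests it appeared in)}."""
    marks = defaultdict(list)
    for test in tests:
        for movement, mark in test.items():
            marks[movement].append(mark)
    return {movement: (round(mean(values), 2), len(values))
            for movement, values in marks.items()}


# Made-up marks for illustration only, NOT real test sheets.
tests = [
    {"free walk": 7.0, "stretchy trot": 6.5, "left lead canter circle": 5.5},
    {"free walk": 6.5, "left lead canter circle": 6.0},
]
for movement, (average, count) in movement_averages(tests).items():
    print(f"{movement}: {average} over {count} test(s)")
```

Keeping the test count next to each average matters, since a movement that only shows up in a few tests can swing a lot on one bad ride.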

Overall, Penn is a consistent training level horse at this time. Hoping to make that first level soon, I just sent in an entry for 1-2 and 1-3!

I hope you all enjoyed my mathing. I sure did, and I've had enough for a while :-)


  1. I love all this analysis - but I particularly /love/ putting together an average for each movement. What a great thought to analyze things to practice before a show, or even to identify, as you have, strengths and weaknesses.

    1. The analysis really pointed out how weak our left lead is. I knew it was weak, I just didn't realize how weak! I'll be sure to work on it before our first level debut.

  2. Yay math! I'm almost inspired enough to do this myself. Almost, but not quite... ;)

    1. I find it oddly satisfying. It won't be the last time I do stats! You should do it too Hehe!

  3. This is so awesome! I want to show so I can have fun stats!!!

    1. Thanks! You'll get your boy there and then you can do all the stats you want :-)

  4. Replies
    1. Not sure what is wrong with my phone but I can't get the comment box thingy to work on your TBT post with the helmet cam, but just wanted to say I <3 it!!

  5. Replies
    1. Haha! I'm just making use of the fact that this series posts all of their results online so I can mess with it later. I found it a fun refresher of my stat minor!

  6. Whoa so much maths. I've contemplated doing something similar, but our scores for each movement vary wildly. Some things are consistent, like our medium trot always blows, but that's kind of easy to get. I love seeing it broken down here, though!

    1. I imagine as we move up the levels it'll get less consistent. I liked seeing it broken down too- it really pointed out how bad the left canter is!