Is Our Judging System Broken?

I am sure we all have an opinion about a dressage test we performed and a score we did not believe we deserved. That goes for good scores as well as poor ones. More now than ever, I have been scratching my head over the dressage scoring at horse trials.

At the risk of being called a poor sport, I began to research the judging system in the United States and came across some interesting findings.

First, I was delighted to find an article published in the Journal of Quantitative Analysis in Sports: “Scoring Variables and Judge Bias in United States Dressage Competitions.” The article, published in 2010, is a statistical look at over 45,000 dressage tests scored across the country over a period of several months in 2009. Its findings confirmed my sense of what I see happening in the dressage portion of our sport, a point I will return to below.

When we think about subjective judging, we must remember that we are riding a dressage test, which should be measured against a given standard. It is not the same as riding in a flat class, in which a judge uses their own subjective process to find the “best” horse for that class and rank the placings accordingly. In dressage, each movement is compared against an accepted standard for that movement and is assigned a score from zero (movement not executed) to ten (excellent) by the officiating judge or judging panel. While riders compete against one another for class rankings, they ultimately compete against a standard to earn a final score.

I believe we have a broken judging system that is beginning to drastically affect our sport. This belief is anchored in what I keep seeing at horse trials that use a judging panel: the scores from the two judges are drastically different. It is also supported by the statistical findings of the “Scoring Variables” article.

My belief is also based on the judging issue that occurred in the sport of ice skating about seven years ago. There, a broken judging system, rooted in how a performance was scored, gave way to a new scoring system. I do not believe we need a new scoring system, nor am I advocating for one. What I do believe needs to happen is a re-focus on the judging itself. We must understand the “standards” our judges are judging against and be able to re-train judges on them, so that the standards are understood and applied the same way by all judges.

The article I found looks at the subjectively judged sport of dressage with the purpose of identifying key factors that influence the final score. It explores the theory that some judges score statistically higher or lower than their peers, with special attention given to judges whose scoring shows bias. While the article did not address who is riding or which horse is in the ring, it does leave room for specific breeds being a source of bias. Based on my own experiences over the last two years, I am finding more and more that the basic method of judging against standards is not being applied by most of our judges.

In looking at the statistical findings on dressage scoring, the article determined that scores tend to remain in the 4 to 7 range, with the extremes falling on either side of that range. Here is where our judging standards are broken. The following is a breakdown of the standard scoring scale:

0 Not Executed
1 Very Bad
2 Bad
3 Fairly Bad
4 Insufficient
5 Marginal
6 Satisfactory
7 Fairly Good
8 Good
9 Very Good
10 Excellent

So what determines the “standards” to achieve the scores above and how are the judges and the industry defining these? Here is an example of what I found in the dictionary for the words listed above. Do we have a standard that is defined and understood by all of our judges? Do they ensure that the score given falls into the standard?

0 Not Executed: Not performed or demonstrated
1 Very Bad: Extremely naughty or disobedient
2 Bad: Naughty or disobedient
3 Fairly Bad: A demonstration of naughty or disobedient
4 Insufficient: Not able to fulfill a need or requirement
5 Marginal: Just barely adequate or within a lower limit
6 Satisfactory: Having desirable or positive especially with those items specified
7 Fairly Good: As deserved; justly
8 Good: Suitable or efficient for a purpose
9 Very Good: In a high degree; extremely
10 Excellent: Of the highest or finest quality; exceptionally good of its kind

My question to everyone reading this is: what is meant by standards, and how should the scoring be administered? Are we receiving the score that the standard prescribes for the way the movement was executed? Yes, individual subjectivity comes into play, but judges must stick with the standard, understand how the movement was executed, and score it accordingly.

In a Reiner Klimke clinic in the late 90’s, Mr. Klimke drove home the issue of judges not scoring correctly at the lower and higher ends of the scale. He stated that they compressed the scores into the middle range and did not give the mark that was deserved when a movement merited a 9 or 10, or a 1 to 3. This made a lasting impression on me, and to this day you never see a 10, and 9’s are rare. On the opposite side are the scores of 1 to 3. Are these scores being given correctly?
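For anyone who wants to check this on their own score sheets, here is a minimal sketch in Python. It is not the method used in the article; the marks below are invented purely for illustration, and the names judge_c, judge_e, middle_share and mean_gap are my own. It measures how many of a judge's marks land in the 4 to 7 range and how far apart two judges are, movement by movement.

```python
from statistics import mean

# Hypothetical marks (0-10) from two judges for the same ten movements.
# These numbers are invented for illustration only, not taken from any test.
judge_c = [6, 7, 6, 5, 7, 6, 8, 6, 5, 7]
judge_e = [5, 6, 4, 5, 6, 7, 6, 4, 5, 6]


def middle_share(marks, low=4, high=7):
    """Fraction of marks that fall in the low..high band (inclusive)."""
    return sum(low <= m <= high for m in marks) / len(marks)


def mean_gap(a, b):
    """Average absolute difference between two judges, movement by movement."""
    return mean(abs(x - y) for x, y in zip(a, b))


print(f"Judge C marks in 4-7: {middle_share(judge_c):.0%}")
print(f"Judge E marks in 4-7: {middle_share(judge_e):.0%}")
print(f"Average gap per movement: {mean_gap(judge_c, judge_e):.1f}")
```

Swap in the marks from your own tests to see how compressed the scoring is and how closely a panel agrees.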

These are the questions that must be answered, and I believe we need to study this further to understand how to fix our broken system. I will be writing more about this issue and encourage you to respond to this discussion. I am closing with a direct finding from the research in the “Scoring Variables” article, one I fully support:

The results presented in this paper show there are statistically significant differences in scores between groups of judges. The same horse and rider pair can potentially receive different scores for the same performance when ridden in front of different judges. Competitors have been known to choose specific shows featuring specific judges that favor their riding or their horse. In such cases, competitors are not obtaining truly accurate scores and horse/rider combinations are not being scored to the same standard. In order to address this variability, judge education programs should increase emphasis on how to evaluate exceptionally well-performed and exceptionally poorly-performed movements.

Diaz, Ana E.; Johnston, Mary S.; Lucitti, Jennifer; Neckameyer, Wendi S.; and Moran, Katy M. (2010) “Scoring Variables and Judge Bias in United States Dressage Competitions,” Journal of Quantitative Analysis in Sports: Vol. 6: Iss. 3, Article 13.