On 6-Point Grading Scale

During my 12-year journey through formal education I became very familiar with the 6-point grading scale. It’s one of those things you are exposed to so early in life that you accept it as a truth about the world. And for me, empirically, it made sense. I could always say if something was excellent, very good, good, acceptable, bad, or tragic.

My worldview shifted a little when I went to university; higher education institutions in my country use a 5-point grading scale. I remember a lecture in one of my B.Sc. classes during which the professor said: “People can grade with a scale up to 6 points of precision. Anything more is arbitrary.” This offhand comment stuck with me for over 5 years. Now I’ve decided to dig a little.

What is a grade?

Grades are a way for teachers to report student performance. Most simply, they indicate whether a student scored below the threshold (i.e. failed) or above it (i.e. passed). Most grading systems split the latter into further levels that report the achieved level of course mastery, such as borderline, passing, and passing (top).

Grading scales used in education vary greatly between countries. The figure below shows the distribution of grading scales used in schools below university level in various countries.¹

Distribution of grading scales in countries
Scale	Countries
12-pt	1
10-pt	13
7-pt	4
6-pt	23
5-pt	29
4-pt	6
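
To put a number on the chart above, here is a quick tally in Python. The counts are copied from the chart; the variable names are my own, and this is only a sketch of the arithmetic.

```python
# Number of countries per grading scale, copied from the chart above.
counts = {"12-pt": 1, "10-pt": 13, "7-pt": 4, "6-pt": 23, "5-pt": 29, "4-pt": 6}

total = sum(counts.values())

# Share of listed countries that use a 5-point or a 6-point scale.
share = (counts["5-pt"] + counts["6-pt"]) / total

print(f"{counts['5-pt'] + counts['6-pt']} of {total} countries ({share:.0%})")
```

Roughly two thirds of the listed countries fall into the 5-point or 6-point group.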

While most countries use either a 5-point or a 6-point scale, some use a 10-point or even a 12-point scale. A 10-point scale seems like a reasonable choice. We have 10 fingers on our hands, and the base 10 number system is the most popular in daily use.² Also, a 10-point scale expands trivially into a 100-point scale, which is a percentage. It makes a lot of sense to represent test scores as a percentage, because a percentage is data normalized to the 0–100 range. If a test is long, representing the grade as a score over the maximum score may not be clear. Say, $\frac{53}{71}$. Is it good? If so, how good is it? Now, let’s consider 75%. It says so much more, because we’re used to 100% being the maximum.
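
As a small illustration of that normalization, here is a minimal sketch that turns a raw score into a percentage. The function name `to_percentage` is mine, not part of any grading standard.

```python
def to_percentage(score: float, max_score: float) -> float:
    """Normalize a raw test score to the 0-100 range."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return 100 * score / max_score


# The raw score from the example above: 53 points out of 71.
print(f"{to_percentage(53, 71):.1f}%")  # 74.6%
```

So $\frac{53}{71}$ is in fact almost the same result as 75%, which is hard to tell from the fraction alone.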

What does 75% say about the level of achievement? In this case it’s easy; the student commands $\frac{3}{4}$ of the material very well. Not everything has been mastered (relative to, say, 95%) but, definitely, more than half of the learning has been done. Yet what’s the difference between 63% and 64%? Is the latter better? Probably. If so, how exactly is it better? A percentage scale is too wide for humans to interpret clearly. It introduces ambiguities which can be exploited: if the level of achievement behind a given percentage is vague, students might be tempted to discuss their grade with the teacher to eke out a few percentage points.³

Also, as noted in Guideline for choosing a grading scale, using a multi-level scale increases the risk that assessments will focus on things that are easy to measure fairly rather than on meaningful tasks. Teachers might be tempted to choose an assessment that spreads results across the whole of a wide grading scale, making the measurement of student achievement secondary.

Another thing to consider is failing grades. Many education systems⁴ use a multi-level scale for failing grades but usually draw no meaningful distinction between the levels. Let’s say we’re considering a 10-point grading scale in which grades below 5 are failing grades. What’s the difference between 1 and 4? In both cases it means that the student did not achieve the required level of course mastery. But did a student with 1 achieve even less than a student with 4? Both of them still failed, and will probably have to retake the examination, but the student with the lower grade is punished more strongly.

With multiple failing grades, a poor first result makes failing the class more likely, because it drags down the average. If the final grade is a grade point average (GPA), then scoring 1 rather than 4 impacts the final grade. Even if both students retake the exam and get 8, the student who initially scored 1 ends up with a GPA of 4.5 (let’s be generous and say 5), while the student who initially scored 4 gets 6.⁵ We don’t need multiple failing grades, as they only penalize students who started the course poorly. This is why many countries that used a 10-point grading scale with multiple failing grades in the past have discarded the lower grades, leaving only one failing grade in the scale.
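
The arithmetic is easy to check. The sketch below assumes that the final grade is a plain, unrounded average of all attempts and that anything below 5 fails, as in the 10-point example; the function names are mine.

```python
PASSING_THRESHOLD = 5  # assumption from the example: grades below 5 fail


def final_grade(attempts: list[int]) -> float:
    """Final grade as the plain arithmetic mean of all exam attempts."""
    return sum(attempts) / len(attempts)


def is_passing(attempts: list[int]) -> bool:
    """True when the unrounded average reaches the passing threshold."""
    return final_grade(attempts) >= PASSING_THRESHOLD


# First attempt followed by the retake.
for attempts in ([1, 8], [4, 8], [1, 7]):
    verdict = "pass" if is_passing(attempts) else "fail"
    print(attempts, final_grade(attempts), verdict)
```

Without the generous rounding, the student who started with 1 and then scored 8 still fails on average, while the student who started with 4 passes comfortably. The last pair is the case from footnote 5.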

Having a wide spectrum of passing grades also has a side effect. Under the stress of having to perform well in school, students start chasing better grades. When LLMs (especially ChatGPT) became popular, students started using these tools to cheat on exams, or to reduce their homework load by having LLMs generate ready-to-hand-in essays. Rather than thinking about how to prevent students from cheating, we should be asking why they are cheating in the first place. Chasing good grades is part of the reason.⁶

Conclusions

Grading scales vary greatly between countries and between education levels. Some countries that historically used a 10-point grading scale have since moved to 6-point or even 5-point grading scales by discarding the bottom failing grades. Wide grading scales (especially 10-point) make assessment design easier, but they introduce a risk of focusing more on the examination method itself than on the measurements it provides.

My major was not education, so take this with a grain of salt rather than as gospel truth. Still, after researching this topic a bit, my recommendation for someone not actively involved in education is to use a 10-point grading scale for automated processes, and a 6-point grading scale for anything you’d like to categorize with less mental overhead. If I ask you what you think about the last book you read, you can probably tell if it was excellent, very good, good, acceptable, bad, or tragic.
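
If you want to go from a fine-grained score to those six categories, one possible sketch is below. The equal-width bands are purely my assumption for illustration; no education system I looked at prescribes these cut-offs.

```python
# Six labels from worst to best, as used throughout this post.
LABELS = ["tragic", "bad", "acceptable", "good", "very good", "excellent"]


def label_for(percentage: float) -> str:
    """Map a 0-100 score onto six equal-width bands (hypothetical cut-offs)."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # 100% would land one past the last band, so clamp it to the top label.
    index = min(int(percentage * len(LABELS) / 100), len(LABELS) - 1)
    return LABELS[index]


for score in (63, 64, 75, 100):
    print(score, label_for(score))
```

With these bands 63% and 64% land in the same category, which is the point of a coarser scale: a one-point difference stops looking meaningful.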

PS

Today is September 1st and many students across the world are just starting another school year. Allow me to say to all of you: godspeed! Focus on nurturing yourself into a self-sufficient adult. Grades are signals that help you measure and iterate on your performance; they are not the primary goal of education. May this new school year be extraordinary, whatever that means for you.

Kamil

¹ Source for the chart is Grading systems by country on Wikipedia. The article does not list all countries, and several that are listed have inconsistencies in their grading systems, which makes it difficult to fold them into a single category.
² This sentence might sound like tomayto, tomahto. One of the hypotheses on why we use base 10 is that we have 10 fingers. However, it is unproven; it might be that the civilization using base 10 practiced bigger-army diplomacy and conquered the others.
³ This problem was considered by Uppsala University in Sweden when they were designing their Guideline for choosing a grading scale. See also Schinske, J. & Tanner, K. Teaching more by grading less (or differently). CBE - Life Sciences Education, 13(2), 159-166 on lifescied.org.
⁴ Notable examples are Brazil, Denmark, France, Germany, Italy, Latvia, Lithuania, Spain, and the Netherlands. See Grading systems by country.
⁵ Another thing to consider with multiple failing grades is that GPA does not always guarantee a passing grade after an exam retake. Say a person scored 1 and then 7. The GPA from these two is 4, which is still a failing grade.
⁶ Research done by Denise Pope, Ph.D., a co-founder of Challenge Success, and published in the February edition of The Hechinger Report.
