So… a lot of rating systems kinda suck.
Yes, I know it’s rich coming from me, the guy who has practically made his living through glorified tier lists. But here I’d argue that, just like putting multiple drivers in an IEM, the problem isn’t the concept but rather the execution. Skews, biases, and a general disregard for scaling all result in practically unusable distribution curves, and so in unreliable ratings.
Disagreements about individual reviews aside, I’m not here to bash differing opinions. This is an article purely on the execution of rating and ranking systems of various websites, though of course also critiquing the overly-positive vibe of many publications that may be leading our hobby to its downfall if not corrected.
For a full breakdown of the data used in this analysis, refer to this spreadsheet.
WhatHiFi is possibly the largest audio-focused review site out there today, and their influence cannot be overstated. While we enthusiasts may shun them and laugh at their articles more than take them seriously, it’s no secret that mainstream media basically treats their word as gospel.
But disregarding personal opinions on the site itself, there is one thing we can use to objectively judge their own judging system: math. WhatHiFi uses a fairly common 5-star rating system, though unlike most others using a star-based system WhatHiFi does not use half-stars. 1, 2, 3, 4, or 5, nothing in-between.
Many non-enthusiasts consider a 5-star rating from WhatHiFi (or even the 4-star) the ultimate endorsement, perhaps the ultimate proof that a product is truly worth every cent the retailer asks for. But how special is it exactly?
The data collected is based on WhatHiFi’s headphone and earphone reviews from October 3rd, 2019 till today, for a total of 90 data samples.
If your product was awarded at least a “coveted” 4-star, congratulations! It means basically nothing.
Even the 5-star award doesn’t mean much considering that it represents the top 35% of awards, which could mean anything between “best of the best” and “above average”. To put it all in context, even the 3-star award (which would be a 60/100 on a standard century scale) is a rarer award compared to the 4 or 5-star at only 24% of the total awardees, while the 4 and 5-star awardees combined make up a whopping 74.44%.
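To make the “top %” arithmetic concrete, here’s a minimal sketch. The star counts below are illustrative numbers chosen to be consistent with the percentages quoted above (out of 90 reviews), not the actual tallies; refer to the linked spreadsheet for the real data.

```python
# Illustrative star counts for 90 reviews; assumed numbers consistent
# with the percentages quoted in the text, NOT the actual dataset.
counts = {5: 32, 4: 35, 3: 22, 2: 1, 1: 0}
total = sum(counts.values())  # 90

# "Top %" = share of reviews at or above each rating, counted from the top.
cumulative = 0
for stars in sorted(counts, reverse=True):
    cumulative += counts[stars]
    print(f"{stars}-star: top {100 * cumulative / total:.2f}%")
# 5-star: top 35.56%
# 4-star: top 74.44%
# 3-star: top 98.89%
# ...
```

The takeaway from the cumulative column is exactly the point above: by the time you reach 4 stars you’ve already covered roughly three-quarters of everything reviewed.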
Basically, with WhatHiFi you have to really read between the lines. Statistically speaking, a 4-star WhatHiFi award is average. You’d have a better time ignoring everything they didn’t award 5 stars to, because 4-and-below essentially represents mediocrity, and I’m sure the average consumer wouldn’t want that.
So looking at WhatHiFi’s rating system from a more abstract point of view, we see the following:
- 2 out of 5 possible ratings effectively unused (1-star and 2-star)
- A heavily-positive skew in rating distribution
- 74.44% of awardees occupying the top two possible ratings (4-star and 5-star)
Mathematically speaking, this is a garbage performance scale. I wish my schools graded me like this.
Probably the runner-up for the title of “most popular audio-centric review site”, SoundGuys is another that holds massive sway over the mainstream audio market. They’re arguably better than WhatHiFi too; for one thing they make use of and publish their own frequency response measurements, which puts them above the pack of many audiophile-oriented review sites in my books.
That said, their measurements are pretty inconsistent, and while they long declined to disclose their rig, they’ve now disclosed that they use a B&K 5128 for measurements. All that’s a separate topic altogether, though.
Now in terms of their rating system, SoundGuys does away with the (frankly antiquated) 5-star system and instead goes for an out-of-10 system, rounded to one decimal point. This gives a lot more flexibility and leeway to properly set defined boundaries and “leagues” between performance levels, assuming that one uses the entire scale.
ASSUMING THAT ONE USES THE ENTIRE SCALE.
The data collected is based on SoundGuys’ most recent headphone and earphone reviews, up to 150 data samples.
Here we have a prime example of scale inefficiency. A full 10-point scale, yet a vast majority of entries distributed between 7 and 8.5. Absolute madness.
Quick question: out of 10, what would you reasonably expect “average” to sit around? A 5? Maybe a 6?
How about 7.5.
Not even that, a score of 7.2 would put a product at the top 75%. Two-thirds of what SoundGuys have reviewed recently have been awarded at least a 7.2, which under a reasonable performance scale would be fairly excellent. Hell, me in university would’ve killed for a 72% average.
This is also compounded by the fact that SoundGuys seem to be deathly afraid of awarding anything above a 9.0, which could give them the appearance of having higher standards, but that doesn’t do much when they’re also deathly afraid of awarding anything below a 7.0. Effectively, the SoundGuys rating system is so constrained that a single 0.1 difference in rating represents far too huge a shift in actual performance.
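One way to see how constrained that is: linearly re-map the effective band onto the full scale. The 7.0–9.0 band here is an assumed approximation of where their scores actually land, per the description above.

```python
# If nearly all scores land between 7.0 and 9.0, the scale's effective
# resolution is that 2-point band, not the full 10 points. Re-mapping
# the band onto 0-10 shows what each published score really carries.
LOW, HIGH = 7.0, 9.0  # assumed effective band, per the text above

def effective_score(score: float) -> float:
    """Map a score in [LOW, HIGH] linearly onto a full 0-10 scale."""
    return 10 * (score - LOW) / (HIGH - LOW)

print(effective_score(7.5))  # 2.5 -- "average" lands near the bottom
print(effective_score(8.0))  # 5.0
# A 0.1 step in the published score is a 0.5 step on the effective scale.
print(round(effective_score(8.1) - effective_score(8.0), 2))  # 0.5
```

In other words, a jump from 8.0 to 8.1 quietly carries five times the weight the scale implies.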
So, while I prefer the use of an out-of-10 system over the 5-star rating system, at least WhatHiFi is willing to use 60% of their scale. SoundGuys, on the other hand, barely use 20%.
While the other two sites are fairly well-respected by the mainstream but not so much in the audiophile community, Headfonics is one that is backlinked quite a bit by numerous audiophile forums like Head-Fi, even being the sponsor of many CanJams. Like SoundGuys, they use an out-of-10 rating system.
So it only stands to reason that one of the audiophile review sites would have a rating system that isn’t completely skewed and biased, and shows that audiophile reviewers have far higher standards than our mainstream counterparts… right?
The data collected is based on Headfonics’ most recent headphone and earphone reviews, up to 75 data samples for each type for a total of 150 data samples.
If you thought SoundGuys was bad, you haven’t seen anything yet.
What kind of screwed-up performance scale has the average set at 8.33/10? I’m fine with the top 10% being at 9 and above, but not when the top 75% is… 8/10?!
I’m honestly ashamed that an audiophile review site would have a rating system that’s far more skewed and biased than anything I’ve seen from mainstream media. Basically nothing is under 6/10? Really? We’re supposed to have higher standards as enthusiasts, not lower!
Look, the data speaks for itself. If everything is “good”, nothing is.
Headphonesty’s a bit of a mixed bag when it comes to the editorial team. On one hand, you have extremely knowledgeable contributors who go out of their way to learn more about the hobby as a whole, and as such put out some of the highest quality articles I’ve read in the portable-audio industry.
On the other hand… well, again, topic for another time. Regardless, Headphonesty is undoubtedly one of the more popular of the audiophile sites, with the traffic to back it up, but as we’ve established, popularity is no measure of competence. Like WhatHiFi, Headphonesty also uses a star rating system, though it allows half-stars, thereby increasing the number of ratings that can be awarded from a paltry 5 to a comfortable 10.
So they’ve used a slightly more flexible rating system. But have they really made use of it?
The data collected is based on Headphonesty’s most recent headphone and earphone reviews (only those using the star rating system), up to 150 data samples.
Well… it’s not great, but after the previous trainwrecks this looks reasonable by comparison. Don’t get me wrong, a median of 4 on what is basically an out-of-5 rating system is still indicative of some major skews and biases behind the scenes, but at the very least it seems that the reviewers at Headphonesty are willing to go below a 3-star rating if need be.
Next to WhatHiFi’s 4-star-being-top-75%, Headphonesty has the top 75% at… 3.5 stars. An improvement, but in a vacuum this is still a terrible scaling system. C’mon guys, be a little less trigger-happy with the 4s and 5s. Not everything can be great.
MajorHiFi is a review website owned by Audio46, an audio store based in New York City. MajorHiFi isn’t that popular, especially relative to the other four (hell, relative to In-Ear Fidelity too), but I’m adding them to this analysis because they’ve made ranking lists of headphones and earphones, like a certain someone.
Really, MajorHiFi? Even calling them “Ranking Lists”? At least use a different description. “Tier lists” are also a thing.
The data collected here is everything that was displayed on MajorHiFi’s respective ranking lists.
| IEM Rank | Count | Top % |
| --- | --- | --- |
| Headphone Rank | Count | Top % |
| --- | --- | --- |
Per usual, the data speaks for itself. The median for both ranking lists is “B”, which is the second highest grade on the MajorHiFi scale. Getting either of the two grades (A or B) puts you at the top 76% and top 62% of the IEM and headphone ranking lists respectively, which can mean anything from “best of the best” to “below average”.
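A letter-grade median works the same way as any other median: lay the grades out best-first and take the middle one. The tallies below are hypothetical placeholders (NOT MajorHiFi’s real counts) just to show the mechanics.

```python
# Hypothetical grade tallies, NOT MajorHiFi's actual data; they merely
# illustrate how a median grade and a cumulative "top %" fall out of
# a top-heavy distribution.
GRADE_ORDER = ["A", "B", "C", "D", "F"]  # best to worst
counts = {"A": 20, "B": 36, "C": 14, "D": 4, "F": 1}

# Expand to a flat, best-first list and take the middle element.
flat = [g for g in GRADE_ORDER for _ in range(counts[g])]
median_grade = flat[len(flat) // 2]

total = sum(counts.values())
top_share = 100 * (counts["A"] + counts["B"]) / total

print(median_grade)             # "B" with these counts
print(f"top {top_share:.0f}%")  # A or B together cover ~75% here
```

With a distribution shaped like this, “B” is simultaneously the median and the second-highest grade, which is precisely the problem described above.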
On another note, assuming that the “E” grade isn’t applicable here (it’s not being used, after all; can’t blame me for the assumption), this A-B-C-D-F grading system is no different from a 5-star rating system. Which brings into question why even use it over the more common 5-star system in the first place…
Look, if you’re going to rip off the audio ranking list concept (even down to separating it by IEMs and headphones and adding a separate value rating)… at least make it normally distributed too.
You gotta read between the lines.
WhatHiFi: Statistically, only 5-star awards are worthy of consideration.
SoundGuys: A score of 7.5/10 is statistically average, and only those rated 8/10 are truly significantly ahead of the pack (top 25%).
Headfonics: Virtually nothing is rated under 6/10, and the average is set at roughly 8.4/10. The worst of the bunch in terms of rating skew and bias.
Headphonesty: The best of the bunch, but still terrible. 4/5 stars is the average rating, so really only those rated at 4.5 stars and above are worthy of consideration.
MajorHiFi: Distribution puts a “B” grade at the median, with the highest grade being “A”. Statistically speaking, only A-rated models are worthy of consideration.
And for those interested in the distribution and statistics of my own ranking system…
The long-awaited update where nearly 400 new IEMs get ranked, bringing the total to 886 entries.
The long-awaited update where 46 new headphones get ranked, along with a big overhaul of existing entries.