Evaluating the performance of your own website is very easy today. One or two clicks, and Google or another service spits out results with concrete suggestions for improvement. Wonderful. At least for the first optimization run. But when it comes to fine-tuning, switching hosts or cleaning up WordPress, it becomes important to understand which tools actually measure load time, and how to deal with this data.
A few years ago, a customer contacted us in chat. He had just migrated his website and was comparing its performance at his old host with its performance at Raidboxes. He told us that the migration had not been worth it for a performance increase of only 9 points on Google PageSpeed Insights.
In fact, we get requests like this all the time. That's why I took a look at what information tools like Google PageSpeed Insights actually provide for interpretation, and how they measure performance and load time. To be honest, I was a bit surprised by the results. The meaning of the values is usually explained very well and in detail. However, the help pages of the test providers do not go into detail on two points:
- Which tool is suitable for which purpose?
- What data can be interpreted and used?
Tools like Google PageSpeed Insights do not measure the speed of your website
As discussed in another blog post, tests like Google PageSpeed Insights do not measure your site's load time but rather its optimization potential. They determine how well your site performs against a predefined set of performance criteria. In addition, the tests provide instructions on how to realize this potential. But there is one thing these tests explicitly do not do: measure page load time.
Google itself puts it like this:
PageSpeed Insights measures ways to increase the performance of a website in the following areas:
- Time taken to load content visible without scrolling: The time between requesting a new page and the browser rendering the content without scrolling.
- The time it takes to fully load the page: Time from requesting a new page to the browser fully rendering the page.
So you can see: Google doesn't measure the speed, but the "possibilities to increase the performance". A crucial difference. It also means that the results don't tell you how fast the page, or the area visible without scrolling, actually loads.
Performance tools like PageSpeed Insights show where you can quickly gain a lot of performance.
But this is not a problem, because the tools still provide valuable optimization data, even if they do not measure load time. For important optimization steps, such as the use of caching or image compression, the results of such tests have the greatest added value.
However, as soon as it comes to load time optimization of an already optimized website, these tests can only provide limited insights. In such a case, you need to perform a real performance measurement. This is especially true when you change hosting providers: no matter how good the new web server may be, if the website itself is full of unresolved issues, even a change of infrastructure will achieve relatively little.
For such a "real" performance measurement, you can use tools such as webpagetest.org.
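If you want to get a feel for raw response times yourself, a simple stopwatch around a request already yields time data you can compare. This is only a rough sketch, not a replacement for the tools above: it measures server response time from a single location, not the full browser rendering that services like webpagetest.org capture.

```python
import time
from typing import Callable

def average_time(action: Callable[[], object], runs: int = 3) -> float:
    """Run an action several times and return the average duration in seconds.

    Averaging over several runs smooths out network and server jitter.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

# Example with a real page fetch (needs network access, hence commented out;
# the URL is a placeholder):
# import urllib.request
# avg = average_time(lambda: urllib.request.urlopen("https://example.com").read())
# print(f"average fetch time: {avg:.3f} s")
```

Because the result is an actual duration in seconds, it supports exactly the kind of before/after comparisons discussed below.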
With such a test, the customer would have been able to see in a comparison exactly which performance gains his site achieved after the switch, and where.
And that brings me to the second point of this post: tools like PageSpeed Insights in particular tempt you to compare values that are only suited for this to a limited extent, or not at all. Because when you work with point scores or grading systems, you quickly run into a situation that I call the school grade dilemma in this article.
The school grades dilemma: grades are not suitable for comparisons
Tools like Google PageSpeed Insights or Yahoo's YSlow output two types of data:
- a score for the performance of the website
- concrete advice on how to improve this score
The scores are on a scale from 0 to 100, with 100 being the best result. So far so clear. And intuitively accessible. Especially because the ratings are supported by a traffic light system.
But when it comes to comparing two sites on the basis of these scores, interpreting the results is no longer so easy. In fact, it is incredibly difficult, if not impossible. Anyone can see that a site scoring 90 is better than one scoring 80. But one statement can no longer be made: by what factor is the site with the 90 score better than the other one?
And this describes the problem at its core: grading systems simply do not allow such statements. You know this from your school days: the person sitting next to you got a C, but you got a B. Even if only one or two points separate you, the result is categorically different. And without knowing the grading scheme of the exam, it is impossible to say how close the result really was.
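The dilemma can be made concrete with a toy scoring scheme. The thresholds below are invented purely for illustration (real tools use far more complex criteria): two pairs of pages can show the exact same score gap while differing by very different amounts of real load time.

```python
def score(load_time_s: float) -> int:
    """Map a load time in seconds to a score.

    The thresholds are invented for illustration only; real tools
    like PageSpeed Insights score very differently.
    """
    if load_time_s < 1.0:
        return 90
    if load_time_s < 3.0:
        return 80
    return 50

# The same 10-point gap can hide very different realities:
print(score(0.9), score(1.1))  # → 90 80 -- pages only 0.2 s apart
print(score(0.2), score(2.9))  # → 90 80 -- pages 2.7 s apart
```

The 10-point difference alone tells you nothing about how far apart the two pages really are; only the underlying times do.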
The reason for this limited significance is the so-called scale level of the measurement data. However, I do not want to go into this in detail here. For more details on scale levels and the permissible arithmetic operations, just take a look at Wikipedia.
Back to our example from the beginning: no one is able to say exactly by which factor the old and the new site differ. Such a statement is only possible with a real speed measurement.
Time measurements provide the best loading time data
The most valuable data for comparisons, for planning optimization measures and so on, are in any case time measurements. They have a true zero point, which makes them a ratio scale. Tools that measure load time therefore allow all kinds of statements and comparisons.
So if you measure a page load time of 2.712 seconds before an optimization measure and a value of 2.133 seconds after the conversion, you can make the following statements based on this data:
- The site is 21 per cent faster after the changeover than before the changeover
- The optimised aspect is responsible for more than one fifth of the page performance. (one of the most important pieces of information ever!)
- All further optimization measures can be set in relation to this value. An optimization that would bring 9 per cent more speed but disproportionately more effort can thus be prioritised differently from a measure that saves correspondingly more loading time.
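The statements above follow from simple arithmetic on the two measured values. A small sketch, using the example figures from this post:

```python
def improvement(before_s: float, after_s: float) -> dict:
    """Comparisons that time measurements (a ratio scale) make possible."""
    saved = before_s - after_s
    return {
        "seconds_saved": round(saved, 3),
        "percent_faster": round(saved / before_s * 100, 1),
        "speedup_factor": round(before_s / after_s, 2),
    }

print(improvement(2.712, 2.133))
# → {'seconds_saved': 0.579, 'percent_faster': 21.3, 'speedup_factor': 1.27}
```

None of these three numbers can be derived from two scores like 80 and 90, which is exactly why a score comparison cannot replace a time measurement.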
If the client from the example case had measured from the beginning with a tool like webpagetest.org, he would have seen that the performance of his site more than doubled in the relevant areas.
Conclusion: Knowledge about the type and quality of measurement data is only the beginning
So, for a meaningful comparison of two or more websites, at least the following two conditions must be met:
- The tool used must measure the right things - the relevant parts of the website. When changing hosters, for example, you should not rely exclusively on a test that primarily looks at on-page factors.
- The data used must allow a meaningful comparison. Normally, you want to know by what factor an optimization has actually improved your website. Only with this information can you forecast, for example, the improvement in conversion rate.
Admittedly: knowing the right data is just the beginning. Of course, you also need to know how to properly test page performance and read the resulting data sets. That's why we deal with these two topics in detail in other blog posts.
However, understanding the data and the permissible conclusions that can be drawn from it is the basis for all further optimization steps. And it helps to take the right and most sensible optimization measures.