What are the differences between load, stress, spike, and soak testing?

Jan 23, 2023

What do you want to learn about the application?

Each of the questions below has a performance test type associated with it:

What is the maximum load the application can handle? ➡ stress test
What are the performance bottlenecks? ➡ load test
How does the application respond to a sudden peak? ➡ spike test
Does performance degrade under continuous usage? ➡ soak test

These terms are not used consistently from one company to the next. You might have someone run a shorter-duration soak test and refer to it as stress testing or load testing interchangeably. You're more than welcome to treat the explanations below as the only correct way to refer to these types of performance testing. 😊

Test types

All these tests have in common that you'll need to identify common user flows and simulate them through test scripts.

Stress test

In a stress test you don't just burden the application with a high load; you drive it to failure.

Decide how you will measure when the application has reached its limit, for example: 5% or more of HTTP requests fail
Ramp up the number of concurrent users until you reach that failure threshold
Note the number of concurrent users at that point

Now you know what load the application can handle before things start going terribly wrong.
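The steps above can be sketched as a small loop. This is a tool-agnostic illustration, not a k6 script: `find_breaking_point`, its parameters, and `simulated_failure_rate` are hypothetical names, and the simulated function merely stands in for a real test run that reports the share of failed HTTP requests at a given number of concurrent users.

```python
from typing import Callable, Optional


def find_breaking_point(
    measure_failure_rate: Callable[[int], float],
    start_users: int = 10,
    step_users: int = 10,
    max_users: int = 1000,
    failure_threshold: float = 0.05,
) -> Optional[int]:
    """Return the first user count at which the failure rate reaches the threshold."""
    users = start_users
    while users <= max_users:
        if measure_failure_rate(users) >= failure_threshold:
            return users
        users += step_users
    return None  # threshold not reached within max_users


def simulated_failure_rate(users: int) -> float:
    # Assumption for illustration only: no failures up to 200 users,
    # then the failure rate grows by 0.1 percentage points per extra user.
    return 0.0 if users <= 200 else (users - 200) / 1000


breaking_point = find_breaking_point(simulated_failure_rate)
print(breaking_point)
```

In a real stress test the ramp-up is usually continuous rather than stepped, so treat the step size as the precision of your answer.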

Load test

Example approach:

Calculate the number of concurrent users (virtual users) using the formula further down the page
Configure ramping to rise gently to normal concurrent usage and ease back down again after 1 hour
Monitor throughput during the test to identify any degradation
Head to observability tools to determine the cause of bottlenecks (repeated transactions? slow SQL queries?)
Run the test again, but this time configure it for concurrent peak usage

Now you're aware of specific bottlenecks you can resolve.
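As a rough sketch, the ramping for both runs could be described as a list of stages, where each stage says how long to take to reach a target number of concurrent users. The function name, the 10 minute ramps, and the user counts of 100 and 250 are assumptions for illustration; most load testing tools have their own way to express the same idea.

```python
from typing import List, Tuple

# (duration in minutes, target concurrent users at the end of the stage)
Stage = Tuple[int, int]


def load_test_stages(
    target_users: int, ramp_minutes: int = 10, hold_minutes: int = 60
) -> List[Stage]:
    """Ramp up gently, hold at the target, then ramp down gently."""
    return [
        (ramp_minutes, target_users),  # gentle ramp-up
        (hold_minutes, target_users),  # steady load
        (ramp_minutes, 0),             # gentle ramp-down
    ]


# Hypothetical user counts; use the results of the concurrent usage calculation.
normal_run = load_test_stages(target_users=100)
peak_run = load_test_stages(target_users=250)
print(normal_run)
print(peak_run)
```

Keeping the shape identical between the normal and peak runs makes it easier to compare throughput between them.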

Spike test

Example approach:

Base the spike test configuration on the load test scenario configured for normal usage
Edit it so that after 10 minutes of testing it will go to peak usage with a 5 minute ramp-up, 5 minute duration, and 5 minute ramp-down
Edit it so that after 30 minutes, it will go to peak usage with a 2 minute ramp-up, 5 minute duration, and 2 minute ramp-down
Edit it so that after 45 minutes, it will go to peak usage with a 1 minute ramp-up, 5 minute duration, and 1 minute ramp-down
Start the test and monitor error rates, response times, and throughput during those spikes
Head to observability tools to determine the cause of hiccups

Now you know how well the application handles sudden spikes in activity.
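To check that the three spikes fit together, here is a sketch that turns them into a timeline of stages. It reads the first spike as 5 minutes each for ramp-up, duration, and ramp-down. The function name, the 60 minute total, and the user counts are assumptions, and for simplicity the timeline starts at normal usage rather than ramping up from zero.

```python
from typing import List, Tuple

# (start minute, ramp-up minutes, hold minutes, ramp-down minutes)
Spike = Tuple[int, int, int, int]
# (duration in minutes, target concurrent users at the end of the stage)
Stage = Tuple[int, int]

SPIKES: List[Spike] = [(10, 5, 5, 5), (30, 2, 5, 2), (45, 1, 5, 1)]


def spike_test_stages(
    normal_users: int,
    peak_users: int,
    spikes: List[Spike],
    total_minutes: int = 60,
) -> List[Stage]:
    """Hold normal usage between spikes; each spike ramps to peak and back."""
    stages: List[Stage] = []
    clock = 0  # minutes of the test accounted for so far
    for start, ramp_up, hold, ramp_down in spikes:
        if start < clock:
            raise ValueError("spikes overlap")
        if start > clock:
            stages.append((start - clock, normal_users))  # wait at normal usage
        stages.append((ramp_up, peak_users))
        stages.append((hold, peak_users))
        stages.append((ramp_down, normal_users))
        clock = start + ramp_up + hold + ramp_down
    if total_minutes > clock:
        stages.append((total_minutes - clock, normal_users))  # tail at normal usage
    return stages


stages = spike_test_stages(normal_users=100, peak_users=250, spikes=SPIKES)
for duration, target in stages:
    print(f"{duration:>2} min -> {target} users")
```

The spikes end at minutes 25, 39, and 52, so each one gets a few minutes of normal usage before the next begins.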

Soak test

Example approach:

Base the soak test configuration on the load test scenario configured for normal usage
Extend the test duration to 12 hours, and add in a realistic day/night gentle ramping up and down
When the test has concluded, determine whether any errors or bottlenecks appear later in the test which didn’t occur in the other tests

Now you know whether there are any performance issues which you'd otherwise only discover once a release has been in production for a day.
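One simple way to spot degradation after a soak test is to compare response times from the start of the test with those from the end. The sketch below is one possible check, not a standard metric: `degradation_ratio` is a hypothetical helper, and the hourly figures are made up purely to show the calculation.

```python
from statistics import median
from typing import Sequence


def degradation_ratio(response_times_ms: Sequence[float], window: int) -> float:
    """Median of the last `window` samples divided by the median of the first `window`."""
    if window <= 0 or len(response_times_ms) < 2 * window:
        raise ValueError("need at least two non-overlapping windows of samples")
    early = median(response_times_ms[:window])
    late = median(response_times_ms[-window:])
    return late / early


# Made-up hourly median response times (ms) for a 12 hour soak test.
hourly_medians = [210, 205, 208, 212, 215, 220, 230, 245, 260, 290, 330, 380]

ratio = degradation_ratio(hourly_medians, window=3)
print(f"Late responses are {ratio:.2f}x the early ones")
```

A ratio well above 1 suggests something is building up over time (memory, connections, disk), which is when the observability tools come in. With day/night ramping, compare windows at a similar load level so the comparison is fair.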

Concurrent usage calculation

People often overestimate the number of concurrent users they need, because they don't take into account that if you have thousands of users per hour, they don't all spend that full hour actively operating the application.

This is what you need from your user analytics:

For your most representative usage level and peak usage level, what was the number of unique sessions per hour?
Potentially add another level for an optimistic upcoming peak usage level, such as twice last year's peak usage.
The average session duration for the same time period you got the unique sessions per hour from.

The formula:

Concurrent users = (Unique Sessions Per Hour * Average Session Duration in seconds) / 3600

Multiplication and division have equal precedence and are evaluated left to right, so the result is the same without them, but I've added parentheses for clarity.
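Here is the formula as a small worked example. The session counts and durations are invented for illustration; plug in the numbers from your own analytics. Rounding up to a whole number is my own addition, since you can't run a fraction of a virtual user.

```python
import math


def concurrent_users(unique_sessions_per_hour: float, avg_session_seconds: float) -> float:
    """Concurrent users = (sessions per hour * average session duration in seconds) / 3600."""
    return (unique_sessions_per_hour * avg_session_seconds) / 3600


# Hypothetical analytics figures.
normal = concurrent_users(unique_sessions_per_hour=3000, avg_session_seconds=120)
peak = concurrent_users(unique_sessions_per_hour=9000, avg_session_seconds=150)

print(math.ceil(normal))  # virtual users for the normal usage run
print(math.ceil(peak))    # virtual users for the peak usage run
```

With these numbers, 3000 sessions per hour at 2 minutes each comes to 100 concurrent users, far fewer than the 3000 you might have guessed.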

Conclusion

You don't necessarily have to use all of these test types. Simply put, they can teach you:

The number of concurrent users at which the application reaches your failure threshold (e.g. 5% or more of HTTP requests failing)
Hardware and software bottlenecks which keep you from performing better at normal and peak usage
How well you’re prepared for sudden peaks in usage (e.g. Black Friday)
What kind of issues you could expect to pop up when the excitement of going live dies down after a few hours

Prerequisites

Have a production-like environment that includes all the expected infrastructure
- Should take into account the potential cost of any quotas you might exceed
- Your infrastructure & InfoSec teams should be warned in advance of the tests (especially stress tests)
- For stress testing, you (or the responsible team) should be ready to restart a failing environment
- The tests shouldn’t be run when anyone else is using the environment
Have a tool which allows you to configure load scenarios AND drive virtual users (e.g. k6)
Design and configure the load scenarios well ahead of time and (safely) test the configuration
Have the load scenarios reflect user flow(s) through the application which you expect to be the most common
- This should include (semi-random) waiting times to reflect real user behaviour, e.g. sleep(2 + Math.random() * 2) in a k6 script for a pause of 2 to 4 seconds
- You should be realistic (real flows, realistic behaviour), but not obsessive (all flows, detailed user personas)
Have the ability to monitor (and review logs of) hardware and software metrics
- The performance testing tool used will typically already report the response times, throughput, etc.
- However, you will also want to know what happens to the server (CPU usage, RAM usage, disk space)
- You will also want to understand response times at a deeper level (what’s slow: db queries? frontend rendering?)
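The semi-random waiting time from the load scenario prerequisite could look like this in a tool-agnostic form. `think_time` is a hypothetical helper and the 2 to 4 second range is just the example used above; your load testing tool will have its own sleep function to pass the value to.

```python
import random


def think_time(min_seconds: float = 2.0, max_seconds: float = 4.0) -> float:
    """Pick a random pause length to simulate a user reading or thinking."""
    return random.uniform(min_seconds, max_seconds)


# Between steps of a user flow you would pause for this long, e.g. time.sleep(pause).
pause = think_time()
print(f"Waiting {pause:.1f} seconds before the next step")
```

Randomising the pause keeps virtual users from hitting the application in lockstep, which would create artificial bursts that real users don't produce.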

Wilco van Esch
