Wilco van Esch

    What are the differences between load, stress, spike, and soak testing?

    What do you want to learn about the application?

    Each of these questions has a performance test type associated with it:

    • What is the maximum load the application can handle? ➡ stress test
    • What are the performance bottlenecks? ➡ load test
    • How does the application respond to a sudden peak? ➡ spike test
    • Does performance degrade under continuous usage? ➡ soak test

    These terms are not used consistently from one company to the next. It's common to do a shorter duration soak test and refer to it as stress testing or load testing interchangeably. You're more than welcome to treat the below explanations as the only correct way to refer to these types of performance testing.

    Test types

    All these tests have in common that you'll need to identify common user flows and simulate them through test scripts.

    Stress test

    In a stress test you don't just burden the application with a high load, you drive the application to failure.

    • Decide how you will measure when the application has reached its limit, for example: >= 5% of HTTP requests fail
    • Ramp up the number of concurrent users until you reach that failure threshold
    • Note the number of concurrent users at that point

    Now you know what load the application can handle before things start going terribly wrong.
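The ramp-until-failure steps above can be sketched in a few lines. This is only an illustration, not a replacement for a load testing tool: `measure_failure_rate` is a hypothetical callable standing in for "run the scripted flows at this many concurrent users and report the fraction of failed HTTP requests", and the starting point, step size, and upper bound are assumptions.

```python
from typing import Callable, Optional


def find_breaking_point(
    measure_failure_rate: Callable[[int], float],
    start_users: int = 50,
    step: int = 50,
    max_users: int = 5000,
    failure_threshold: float = 0.05,
) -> Optional[int]:
    """Ramp up concurrent users until the failure rate reaches the threshold.

    Returns the first user count at which the threshold is reached,
    or None if it is never reached up to max_users.
    """
    users = start_users
    while users <= max_users:
        if measure_failure_rate(users) >= failure_threshold:
            return users
        users += step
    return None


def fake_failure_rate(users: int) -> float:
    """Hypothetical stand-in for a real measurement.

    Failures start above 400 users and grow by one percentage point
    for every 20 additional users.
    """
    return max(0.0, (users - 400) / 20 / 100)


breaking_point = find_breaking_point(fake_failure_rate)
```

With the made-up failure curve above, the 5% threshold is first reached at 500 concurrent users. In a real stress test the measurement would come from your load testing tool's results rather than from a formula.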

    Load test

    Example approach:

    • Calculate (see the formula further down the page) the number of concurrent users / virtual users
    • Set ramping to gently go up to concurrent normal usage and gently down again after 1 hour
    • Monitor throughput during the test to identify any degradation
    • Head to observability tools to determine the cause of bottlenecks (repeated transactions? slow SQL queries?)
    • Run the test again, but this time configure it for concurrent peak usage

    Now you're aware of specific bottlenecks you can resolve.

    Spike test

    Example approach:

    • Base the spike test configuration on the load test scenario configured for normal usage
    • Edit it so that after 10 minutes of testing it will go to peak usage with a 5 minute ramp-up, 5 minute duration, and 5 minute ramp-down
    • Edit it so that after 30 minutes, it will go to peak usage with a 2 minute ramp-up, 5 minute duration, and 2 minute ramp-down
    • Edit it so that after 45 minutes, it will go to peak usage with a 1 minute ramp-up, 5 minute duration, and 1 minute ramp-down
    • Start the test and monitor error rates, response times, and throughput during those spikes
    • Head to observability tools to determine the cause of hiccups

    Now you know how well the application handles sudden spikes in activity.
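To make the spike schedule concrete, here is a minimal sketch that computes the target number of virtual users for any minute of the test. It assumes the first spike uses 5 minutes for each of ramp-up, duration, and ramp-down, and that ramps are linear. The function name and the tuple layout are illustrative and not taken from any particular tool.

```python
from typing import List, Tuple

# (start_minute, ramp_up_minutes, hold_minutes, ramp_down_minutes)
Spike = Tuple[float, float, float, float]

SPIKES: List[Spike] = [
    (10, 5, 5, 5),  # assumed: 5 minutes for each phase
    (30, 2, 5, 2),
    (45, 1, 5, 1),
]


def target_users(minute: float, normal: int, peak: int,
                 spikes: List[Spike] = SPIKES) -> float:
    """Return the target number of virtual users at a given minute."""
    for start, up, hold, down in spikes:
        offset = minute - start
        if offset < 0 or offset > up + hold + down:
            continue  # this spike is not active at the given minute
        if offset < up:
            fraction = offset / up  # ramping up towards peak
        elif offset <= up + hold:
            fraction = 1.0  # holding at peak
        else:
            fraction = 1 - (offset - up - hold) / down  # ramping down
        return normal + (peak - normal) * fraction
    return float(normal)  # outside all spikes: normal usage
```

For example, with 100 normal and 300 peak users, minute 12.5 falls halfway through the first ramp-up and minute 17 falls in the first peak plateau. Each spike has a shorter ramp than the one before it, so the application gets less and less time to adapt.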

    Soak test

    Example approach:

    • Base the soak test configuration on the load test scenario configured for normal usage
    • Extend the test duration to 12 hours, and add in a realistic day/night gentle ramping up and down
    • When the test has concluded, determine whether any errors or bottlenecks appear later in the test which didn’t occur in the other tests

    Now you know whether there are any performance issues which you'd otherwise only discover once a release has been in production for a day.
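One simple way to check for degradation after a soak test is to compare response times from the start of the run with those from the end. The sketch below assumes you have exported response times (in milliseconds, in chronological order) from your tool. The function name, the window size, and the use of the median are illustrative choices, not a standard.

```python
from statistics import median
from typing import Sequence


def degradation_ratio(response_times_ms: Sequence[float], window: int) -> float:
    """Compare the median of the last `window` samples to the first `window`.

    A result well above 1.0 suggests responses got slower as the test went on.
    """
    if window <= 0 or len(response_times_ms) < 2 * window:
        raise ValueError("need at least two non-overlapping windows of samples")
    early = median(response_times_ms[:window])
    late = median(response_times_ms[-window:])
    return late / early


# Hypothetical samples: 100 ms early in the run, 150 ms towards the end.
samples = [100.0] * 10 + [150.0] * 10
ratio = degradation_ratio(samples, window=5)
```

A ratio of 1.5 here would mean the median response time at the end is 50% higher than at the start. That is a cue to look at memory usage, connection pools, disk space, and similar slow-burning causes in your observability tools.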

    Concurrent usage calculation

    People often overestimate the number of concurrent users they need, because they don't take into account that if you have thousands of users per hour, they don't all spend that full hour actively operating the application.

    This is what you need from your user analytics:

    • For your most representative usage level and peak usage level, what was the number of unique sessions per hour?
    • Potentially add another level for an optimistic upcoming peak usage level, such as twice last year's peak usage.
    • The average session duration for the same time period you got the unique sessions per hour from.

    The formula:

    Concurrent users = (Unique Sessions Per Hour * Average Session Duration in seconds) / 3600

    Multiplication and division have equal precedence and are evaluated left to right, so the result is the same without the parentheses, but I've added them for clarity.
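As a worked example of the formula, here is a small sketch. The session counts and durations are made-up figures for illustration; substitute the values from your own analytics.

```python
import math


def concurrent_users(unique_sessions_per_hour: float,
                     avg_session_duration_seconds: float) -> float:
    """Concurrent users = (sessions per hour * avg session seconds) / 3600."""
    return (unique_sessions_per_hour * avg_session_duration_seconds) / 3600


# Hypothetical normal usage: 6000 sessions per hour, 3 minutes each.
normal = concurrent_users(6000, 180)

# Hypothetical peak usage: 5000 sessions per hour, 200 seconds each.
# Round up, since you can't simulate a fraction of a virtual user.
peak = math.ceil(concurrent_users(5000, 200))
```

With these figures, 6000 sessions per hour translates to only 300 concurrent users, which illustrates how far the hourly number can overstate the concurrency you need to simulate.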

    Conclusion

    You don't necessarily have to use all of these test types. Simply put, they can teach you:

    • The number of concurrent users at which you expect HTTP requests to start failing for 5% or more of your users
    • Hardware and software bottlenecks which keep you from performing better at normal and peak usage
    • How well you’re prepared for sudden peaks in usage (e.g. Black Friday)
    • What kind of issues you could expect to pop up when the excitement of going live dies down after a few hours

    Prerequisites

    • Have a production-like environment that includes all the expected infrastructure
      • You should take into account the potential cost of any quotas you might exceed
      • Your infrastructure & InfoSec teams should be warned in advance of the tests (especially stress tests)
      • For stress testing, you (or the responsible team) should be ready to restart a failing environment
      • The tests shouldn’t be run when anyone else is using the environment
    • Have a tool which allows you to configure load scenarios AND drive virtual users (e.g. k6)
    • Design and configure the load scenarios well ahead of time and (safely) test the configuration
    • Have the load scenarios reflect user flow(s) through the application which you expect to be the most common
      • This should include (semi-random) waiting times to reflect real user behaviour, e.g. sleep(math.random(2,4))
      • You should be realistic (real flows, realistic behaviour), but not obsessive (all flows, detailed user personas)
    • Have the ability to monitor (and review logs of) hardware and software metrics
      • The performance testing tool used will typically already report the response times, throughput, etc.
      • However, you will also want to know what happens to the server (CPU usage, RAM usage, disk space)
      • You will also want to understand response times at a deeper level (what’s slow: db queries? frontend rendering?)
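The (semi-random) waiting times mentioned in the prerequisites can be sketched as follows. This is a generic Python illustration of the idea, not the API of k6 or any other load testing tool. The function names are hypothetical, and the 2 to 4 second range is taken from the example above.

```python
import random
import time


def think_time(min_seconds: float = 2.0, max_seconds: float = 4.0) -> float:
    """Pick a random pause length, uniformly distributed between the bounds."""
    return random.uniform(min_seconds, max_seconds)


def pause_like_a_user(min_seconds: float = 2.0, max_seconds: float = 4.0) -> None:
    """Sleep for a random think time between two steps of a user flow."""
    time.sleep(think_time(min_seconds, max_seconds))
```

Calling `pause_like_a_user()` between the steps of a scripted flow keeps virtual users from firing requests back to back, which would produce a much heavier load than the same number of real users.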