Load testing: examining the behaviour of the system under sustained typical or high load.
Stress testing: determining the point at which the application (or a percentage of requests) fails.
Deciding the environment
Never use: any environment whose owner is not expecting the extra load.
May use: a staging environment that matches production closely enough for your purposes, including any load-balancing and auto-scaling configuration.
Ideally use: the actual production environment at a time of consistently low usage.
Deciding concurrent usage
Concurrent users ≈ sessions per hour × average visit duration (seconds) / 3600.
Make this calculation for normal load (your most typical business day) and for peak load (the highest traffic in the last year, plus a margin that reflects any expected uplift from extra spend on marketing and paid search).
For a stress test you don't need to estimate concurrent usage in advance: the test itself will reveal the maximum concurrent usage your application and its infrastructure can handle.
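As a quick sketch, the formula above can be expressed in plain JavaScript (all figures here are invented placeholders, not real traffic data):

```javascript
// Estimate concurrent users from analytics figures:
// concurrent users ≈ sessions per hour × average visit duration (s) / 3600
function concurrentUsers(sessionsPerHour, avgVisitSeconds) {
  return (sessionsPerHour * avgVisitSeconds) / 3600;
}

// Hypothetical normal day: 3,600 sessions/hour, 300 s average visit.
const normalLoad = concurrentUsers(3600, 300); // 300 concurrent users

// Hypothetical peak from last year, plus a 20% margin for expected
// marketing and paid-search uplift.
const peakLoad = concurrentUsers(9000, 300) * 1.2; // ≈ 900 concurrent users

console.log(normalLoad, peakLoad);
```

These two numbers become the virtual-user targets for the normal-load and peak-load test runs.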
Deciding what to test
If you already have baseline results from a previous test and want to compare new application or infrastructure changes against them, use exactly the same test scenarios.
If not, either analyse the user flows of actual traffic from a comparable non-typical event (for example, last year's Black Friday traffic when preparing for this year's Black Friday), or analyse the user flows of the most crucial journeys through the application.
In the test scenarios (for example, defined in JS for k6), define page views and events for the user flows, with realistic wait times in between. You could randomise these within a range to reflect the variance of real users.
You'll have to make a judgement call on how closely the user flows and data should reflect the variation and quirks of your actual traffic: weigh the value of that extra realism against the time spent defining the scenarios and verifying they do what they should.
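For example, one crucial journey sketched as a k6 scenario might look like this (the URLs, paths and timings are placeholders; the `thinkTime` helper is a hypothetical name for the randomised wait described above):

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

// Random think time between min and max seconds, to mimic real users.
function thinkTime(min, max) {
  sleep(min + Math.random() * (max - min));
}

// One crucial journey: browse a product and add it to the basket.
export default function () {
  const home = http.get('https://staging.example.com/'); // placeholder URL
  check(home, { 'home loaded': (r) => r.status === 200 });
  thinkTime(2, 6);

  const product = http.get('https://staging.example.com/products/123');
  check(product, { 'product loaded': (r) => r.status === 200 });
  thinkTime(5, 15);

  http.post(
    'https://staging.example.com/basket',
    JSON.stringify({ productId: 123 }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  thinkTime(1, 3);
}
```

This runs under k6 (`k6 run script.js`), not plain Node, since the `k6/http` and `k6` modules are provided by the k6 runtime.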
Deciding how to test
In the tool you're using to drive traffic, configure ramp-up and ramp-down phases the application can easily handle, spawn virtual users up to the concurrent usage figures from "Deciding concurrent usage", and have those virtual users execute the journeys defined in "Deciding what to test". Keep this running for an hour (or longer if you'd like to make it a soak or endurance test).
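In k6, for instance, that shape of test can be declared with `options.stages` (the durations and the target of 300 virtual users are placeholders; substitute the concurrency you calculated):

```javascript
// k6 options for a load test: gentle ramp up, an hour at the target
// concurrency, then a gentle ramp down. All numbers are placeholders.
export const options = {
  stages: [
    { duration: '10m', target: 300 }, // ramp up to your calculated concurrency
    { duration: '1h', target: 300 },  // sustain (lengthen this for a soak test)
    { duration: '10m', target: 0 },   // ramp down
  ],
};
```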
Review the relevant metrics gathered during the test and look for bottlenecks. Which resources were under the most stress: disk, CPU, RAM, the network?
In the tool you're using to drive traffic, configure a ramp-up that does not end. Decide what you will consider the breaking point of your application: for example, 10% of HTTP requests failing with a 500 error. Then stop the stress test, either automatically (if the tool allows it) or manually, when you reach that point.
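With k6, for example, the automatic stop can be approximated with a failure-rate threshold that aborts the run. Note that k6's built-in `http_req_failed` rate counts all failed requests, not only 500s; the stage targets below are placeholders:

```javascript
// k6 options for a stress test: keep ramping until the app breaks.
// abortOnFail stops the run once 10% of requests have failed.
export const options = {
  stages: [
    { duration: '10m', target: 500 },
    { duration: '10m', target: 1000 },
    { duration: '10m', target: 2000 }, // keep adding stages well past expected capacity
  ],
  thresholds: {
    http_req_failed: [{ threshold: 'rate<0.10', abortOnFail: true }],
  },
};
```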
Note the number of concurrent virtual users at that point, and review the most burdened resources (for example, CPU usage) and other significant metrics (for example, throughput).