API Load Testing Tutorial & Best Practices

August 28, 2023


API load testing evaluates the performance and scalability of an API by simulating load on the API server, either as a number of concurrent users sending requests under various load conditions or as a target rate of requests per second (RPS). It can increase confidence that a system performs the right operations under heavy load with acceptable response times. It can also demonstrate that the system continues to function as expected under various workload patterns over extended periods.
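
The two ways of describing load, concurrent users and a target RPS, are related. As a rough sketch (the function names and numbers are illustrative assumptions, not taken from any particular tool), a fixed request rate can be turned into a send schedule, and Little's law gives an estimate of how many requests are in flight at that rate:

```python
import math


def arrival_offsets(target_rps: float, duration_s: float) -> list[float]:
    """Evenly spaced send times (seconds from test start) for a fixed request rate."""
    interval = 1.0 / target_rps
    count = int(target_rps * duration_s)
    return [i * interval for i in range(count)]


def estimated_in_flight(target_rps: float, avg_response_s: float) -> int:
    """Little's law: average concurrency = arrival rate x average response time."""
    # Rounding first avoids floating-point noise pushing the ceiling up by one.
    return math.ceil(round(target_rps * avg_response_s, 9))


# Assumed example: 50 requests per second for 2 seconds, 200 ms average response.
offsets = arrival_offsets(target_rps=50, duration_s=2)
print(len(offsets), offsets[:3])
print(estimated_in_flight(target_rps=50, avg_response_s=0.2))
```

The estimate is only an average; real traffic is burstier than an evenly spaced schedule.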

This article explains popular API load-testing techniques and presents industry best practices that experienced test engineers use to ensure successful projects.

Common types of API load tests

First, let’s look at the various API load tests and the purpose they serve.

Type of Load Test	Description
Anticipated load tests	These tests verify system performance under anticipated loads and real-world operations.
Stress tests	Stress tests measure system performance under high loads and specific stress scenarios, permitting early detection of problems and improvement of overall system performance.
Capacity tests	Capacity tests push the system to its limits, identifying performance thresholds and prompting necessary actions to maintain acceptable operation.
Soak tests	Soak tests verify a system's performance under extended periods of varying load, monitoring response time trends, resource consumption, and connection limits.

Anticipated load tests

These tests verify that the system operates as expected under “normal” load and traffic patterns. They focus on multiple operations happening simultaneously, mimicking real-world patterns and quantities. For example, an anticipated load test of an e-commerce site might involve 100 users simultaneously accessing the site, performing operations such as login, authentication, search, and checkout.

While individual tests also validate specific operations like those above, an anticipated load test ensures that the system can execute these operations simultaneously and that it can handle a load higher than that of a single user.
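
A minimal sketch of such a test, assuming a hypothetical `send_request` function that stands in for real API calls (here it only sleeps briefly), could run every user journey at the same time and collect per-operation timings:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Operations from the e-commerce example; authentication is folded into login here.
OPERATIONS = ["login", "search", "checkout"]


def send_request(operation: str) -> float:
    """Hypothetical stand-in for a real API call; returns elapsed seconds."""
    start = time.perf_counter()
    time.sleep(0.001)  # placeholder for network and server time
    return time.perf_counter() - start


def user_journey(user_id: int) -> dict[str, float]:
    """One simulated user performing each operation in order."""
    return {op: send_request(op) for op in OPERATIONS}


def run_anticipated_load(concurrent_users: int) -> list[dict[str, float]]:
    """Run all user journeys concurrently and collect their timings."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        return list(pool.map(user_journey, range(concurrent_users)))


results = run_anticipated_load(concurrent_users=100)
slowest_checkout = max(r["checkout"] for r in results)
print(f"{len(results)} journeys completed; slowest checkout took {slowest_checkout:.4f}s")
```

In a real test, `send_request` would issue HTTP requests against a test environment, and the timings would be compared with the team's response-time targets.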

Stress tests

These tests ensure that the system functions within acceptable limits when subjected to substantial loads relative to its total transaction processing capacity. Stress-testing scenarios may involve testing the entire system, such as simulating a surge in users during holiday traffic for an e-commerce site.

Alternatively, stress tests can focus on specific areas of the system, such as its ability to quickly serve product pages after mass promotional emails are sent. This type of testing allows teams to measure, verify, and improve the system's performance under significant load before that load occurs in the production environment.
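
A surge such as holiday traffic can be described as a load profile that a test harness follows over time. The sketch below is one possible shape (baseline, linear ramp, sustained peak, return to baseline); the function name and every number in it are assumptions chosen for illustration:

```python
def users_at(t: float, baseline: int, peak: int,
             surge_start: float, ramp: float, hold: float) -> int:
    """Target number of simulated users at time t (seconds) for a surge profile."""
    if t < surge_start:
        return baseline
    if t < surge_start + ramp:  # linear ramp up to the peak
        fraction = (t - surge_start) / ramp
        return round(baseline + fraction * (peak - baseline))
    if t < surge_start + ramp + hold:  # sustained surge
        return peak
    return baseline  # traffic returns to normal


# Assumed scenario: 100 users normally, surging to 1,000 for two minutes.
profile = dict(baseline=100, peak=1000, surge_start=60, ramp=30, hold=120)
for second in (0, 75, 100, 300):
    print(second, users_at(second, **profile))
```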

Capacity tests

These tests aim to determine the limits beyond which the system no longer meets acceptable standards. As with stress tests, different system components reach service degradation at different thresholds, so several capacity tests are needed to pinpoint these limits.

Capacity testing is not solely about finding the point of complete system outage or failure. For example, in an e-commerce site, the system may still load product pages under significant load but so slowly that users would likely abandon their purchase attempts. The goal of capacity tests is to identify this threshold: where the system's performance is inadequate even though it technically remains operational. If such a load is anticipated or observed in production, the team will know that further action is needed to maintain the system within acceptable performance levels.
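
One way to locate this threshold is to step through increasing load levels and stop at the first one that exceeds the acceptable response time. The sketch below assumes a 95th-percentile limit of 0.75 seconds and uses a made-up latency model (`simulated_p95`) in place of real measurements; both are illustrative assumptions:

```python
from typing import Callable, Iterable, Optional


def find_capacity_threshold(
    measure_p95: Callable[[int], float],
    load_levels: Iterable[int],
    max_acceptable_s: float,
) -> tuple[Optional[int], Optional[int]]:
    """Return (highest load that met the limit, first load that exceeded it)."""
    last_ok = None
    for users in load_levels:
        if measure_p95(users) > max_acceptable_s:
            return last_ok, users
        last_ok = users
    return last_ok, None  # the limit was never exceeded in the tested range


def simulated_p95(users: int) -> float:
    """Hypothetical latency model standing in for a real measurement run."""
    saturation = 1000  # assumed load at which the system stops responding
    return 0.2 / (1 - min(users, saturation - 1) / saturation)


threshold = find_capacity_threshold(simulated_p95, range(200, 1001, 200), max_acceptable_s=0.75)
print(threshold)
```

With a real system, `measure_p95` would run a load step at the given user count and return the measured percentile, and smaller steps between the two returned levels would narrow the threshold further.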

Soak tests

This type of API testing aims to verify that a system can operate under load for extended periods. Whereas the anticipated load test described above checks behavior under the expected load at a point in time, a soak test focuses on the system's ability to operate under that load for hours, days, weeks, or even longer. Running a soak test for weeks is rarely feasible, but shorter runs can still provide visibility into key indicators of a system's performance over time.

These are a few of the sample questions a soak test attempts to answer:

What are the trendlines of the various response time metrics? Is the average response time increasing over time? Do more outliers appear over time?
What are memory and disk space consumption levels over time? Are system limits being approached?
Are database connection limits approached throughout the test?
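
The first of these questions can be answered with a simple trendline. As a sketch, assuming response times are sampled once per hour (the sample values below are invented for illustration), a least-squares slope shows whether the average is creeping upward:

```python
def trend_slope(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of value over time for (elapsed_hours, value) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    return covariance / variance


# Invented hourly averages in milliseconds: (hours since start, avg response time).
hourly_avg_ms = [(0, 120), (1, 122), (2, 125), (3, 127), (4, 131), (5, 134)]
slope = trend_slope(hourly_avg_ms)
print(f"average response time is changing by {slope:.2f} ms per hour")
```

A slope near zero suggests stable behavior; a persistently positive slope is a signal to investigate before running the system for longer.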

Summary of API load testing best practices

Implementing the following best practices can enhance the development of API load tests and maximize the value derived from their execution. These practices encompass a range of API load testing aspects, facilitating test development, execution, and continuous analysis for further improvement.

Best Practice	Description
Ensure that each test has a specific purpose	Consider the goals of each test, and design it to accomplish only those goals. For example, a single test should not try to cover anticipated load, capacity, and stress testing at once.
Align the API requests with real user behavior	Users may have different objectives when accessing a system and perform various operations at different speeds when accomplishing those objectives. API tests aim to mimic these users in a representative way.
Exercise both the system as a whole and specific components	Craft load tests that focus on specific areas or components of the system, enabling the identification and isolation of bottlenecks and gaining insights into the system's performance under diverse scenarios and load patterns.
Don’t forget the time variable	Executing a test for an hour may provide dramatically different insights than running the same test for 10-15 minutes.
Leverage existing tools	Tools available on the market can assist with the many elements of API load testing, ranging from data curation to load generation to integration within a CI environment. Their benefits often outweigh the associated costs, so consider them before designing and implementing a home-grown solution.
Record, iterate, and improve over time	The relevance of results from a specific instance of an API load test diminishes with each new version developed and released. By recording results and regularly re-testing the system, software engineers can build trend graphs over weeks and months and gain insights that improve the application architecture.

API load testing best practices in detail

Ensure that each test has a specific purpose

A common shortcoming with API load tests is that they are too broad and lack a clear goal or intention.

Different API load tests should target different aspects of the system: simulating a particular user persona, monitoring a system component such as a database or a particular microservice, or addressing one of the test types described earlier in this article.

Narrowly focused API load tests offer distinct advantages: their results are easier to interpret and to apply to real production scenarios. Tests that concentrate on specific components or subsystems isolate the root causes of failures more effectively than system-wide tests, which helps prevent production outages.

The key takeaway is that API load tests should isolate specific API requests and leverage automation to test the full API functionality with a portfolio of narrowly scoped tests rather than executing a small number of manual tests with a broad scope.

Align the API requests with real user behavior

API requests result directly from how users interact with the application, so API load tests should follow realistic user behavior as closely as possible. For example, an API load test might simulate several hundred e-commerce users searching for a product, adding products to their shopping carts, and then checking out. A test case that simulates a real production environment for this scenario should include multiple searches typed at various speeds, with intermediate search results returned along the way, before each simulated user selects a product and checks out.

Different personas interact with an application in different ways, generating API requests in particular sequences, exercising a variety of API endpoints with different payload sizes, and doing so at varying time intervals.

Test engineers must organize typical users into personas (e.g., by application experience, language preference, or geography) and model user journeys through the application paths to generate a sequence of API requests that best represent the actions of different types of users. Automated test procedures will leverage the resulting sequence of API requests to scale the test to hundreds of concurrent users of different types, simulating a realistic production workload.
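
The persona approach can be sketched as a small data model. Everything below is hypothetical: the persona names, their weights, the endpoint paths, and the pause ranges are placeholders that a team would replace with values derived from its own production traffic:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str
    weight: float                      # share of simulated users
    journey: tuple[str, ...]           # ordered API requests
    think_time_s: tuple[float, float]  # min and max pause between requests


# Hypothetical personas and endpoints, for illustration only.
PERSONAS = [
    Persona("browser", 0.6, ("GET /search", "GET /products/{id}"), (2.0, 8.0)),
    Persona("buyer", 0.3,
            ("POST /login", "GET /search", "POST /cart", "POST /checkout"), (1.0, 5.0)),
    Persona("returning_buyer", 0.1,
            ("POST /login", "POST /cart", "POST /checkout"), (0.5, 2.0)),
]


def build_workload(user_count: int, seed: int = 0) -> list[list[tuple[str, float]]]:
    """Pick a persona for each simulated user by weight and expand its journey
    into (request, pause_after_request_in_seconds) steps."""
    rng = random.Random(seed)  # seeded so a test run can be reproduced
    chosen = rng.choices(PERSONAS, weights=[p.weight for p in PERSONAS], k=user_count)
    return [[(req, rng.uniform(*p.think_time_s)) for req in p.journey] for p in chosen]


workload = build_workload(user_count=300, seed=42)
print(workload[0])
```

A load generator would then replay each plan, sending the request and sleeping for the pause, across as many concurrent users as the test calls for.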

Exercise both the system as a whole and specific components

Any system will fail or degrade significantly under some level of increased load. API load tests aim to identify these thresholds, but a challenging nuance of this process is that each component or subsystem will likely fail under different load levels. This variability makes it difficult to identify specific points of failure without targeting subsets of your API load tests at these particular components or subsystems.

A solution to this challenge is for engineers to construct granular API load tests with a narrow focus to test subsystems in isolation. Once test engineers have identified the breaking points for individual subsystems, they can better allocate capacity in a production environment.

As an illustration, let's consider a system-wide load test that reveals that the system can handle 500 concurrent users browsing product pages and performing checkouts, with the content-serving capability being the suspected limiting factor. Suppose the engineering team wants to enhance the system's capacity to support 1,000 concurrent users. Doubling the content servers (the suspected limiting factor) might not be enough to scale the overall application capacity because other system components may fail before reaching 1,000 users. For example, the user authentication system might only support 600 concurrent users, or the payment processor might handle only 800. Capacity planning for an application is only possible after testing and determining the limits of each subsystem.
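
The arithmetic behind this example is simply a minimum over the subsystem limits. The sketch below uses the figures from the paragraph above and assumes, for simplicity, that doubling the content servers doubles their capacity:

```python
def system_capacity(subsystem_limits: dict[str, int]) -> tuple[str, int]:
    """The system supports only as many users as its most constrained subsystem."""
    bottleneck = min(subsystem_limits, key=subsystem_limits.get)
    return bottleneck, subsystem_limits[bottleneck]


limits = {"content_servers": 500, "authentication": 600, "payment_processor": 800}
print(system_capacity(limits))

# Assumption: doubling the content servers doubles their capacity to 1,000 users.
limits["content_servers"] = 1000
print(system_capacity(limits))
```

After the upgrade, authentication becomes the bottleneck at 600 users, so overall capacity grows by only 100 users rather than doubling.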

Don’t forget the time variable

Be sure to run API load tests for a variety of durations. A system may perform perfectly under a specified load for an hour, a day, or even longer, but how does that system perform after operating for a week, a month, or a year? Does the system have a defect, like a memory leak, that could cause it to degrade gradually?

Answering these questions requires an extended API load test. These types of soak tests do not always have specific pass/fail criteria but instead measure data trends. Test engineers should measure metrics such as available memory, disk space, user response times, and database latency to determine system degradation over time.
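
Even a short soak test can hint at what a longer run would reveal. As a rough sketch, assuming a metric that grows roughly linearly (a strong simplification) and using invented memory readings, the growth rate between the first and last samples can be projected forward to estimate when a limit would be reached:

```python
from typing import Optional


def hours_until_limit(samples: list[tuple[float, float]], limit: float) -> Optional[float]:
    """Project when a growing metric reaches its limit, using the average growth
    rate between the first and last (elapsed_hours, value) samples.
    Returns None if the metric is not growing."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    rate = (v1 - v0) / (t1 - t0)
    if rate <= 0:
        return None
    return (limit - v1) / rate


# Invented readings from a 12-hour run: (hours since start, memory used in MB).
memory_mb = [(0, 2048), (6, 2168), (12, 2288)]
remaining = hours_until_limit(memory_mb, limit=4096)
print(f"memory limit projected to be reached in about {remaining:.1f} more hours")
```

A projection like this is not a pass/fail result, but it helps decide whether a longer test or a closer look at memory usage is warranted.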

Leverage existing tools

Quality assurance teams have their hands full documenting testing scenarios and building test cases. They shouldn’t worry about architecting and scaling a test infrastructure, especially when scalable software-as-a-service solutions provide the necessary testing platform for virtually any web application and middleware technology. Of course, these tools cost money, but the costs are likely dwarfed by the time savings and value they provide.

Here are some criteria to consider when evaluating API load-testing tools:

Can a new test be started within minutes or hours of adopting the tool, or does it take days or weeks of training and preparation?
Does the tool use a programming language familiar to the development team and test engineers?
Is the tool configurable via an intuitive user interface once the test scripts are completed?
Does it support various API protocols like REST and GraphQL?
Does the tool leave the users responsible for provisioning the test infrastructure? Or is it offered as a hosted software service that outsources the burden of infrastructure provisioning?
Does the tool make it easy to test different middleware components of the application environment, like the message bus or database?
How well does the tool report the test results? Is it easy to create graphs and reports?

Multiple is a load-testing software service that meets all the selection criteria described above. Within minutes of signing up for the service, users can create API load tests using JavaScript and test any infrastructure component, such as MySQL or Kafka, that is supported by the Node Package Manager (npm) ecosystem. The test can scale to thousands of users in seconds, and the metrics chart allows users to report on the test results, as discussed in the next section.

A JS code editor in Multiple's dashboard — Multiple presents a simple user interface and uses JavaScript (Source: Multiple)

Record, iterate, and improve over time

API load tests reflect the capacity of an application or system at a given time. They are no longer valid after engineers change the environment’s configuration or upgrade the software release of middleware components.

Just as QA teams perform functionality regression tests after each code release, they should conduct load tests regularly to check the capacity and performance of the system under changing conditions.

While “run on every pull request before merge” is likely too frequent a pace for API load tests, each development team should agree upon a cadence for running and re-evaluating load tests. For example, one team may decide to rerun the test after every major release, while another might prefer to test each time they refactor a microservice or upgrade a database. The frequency depends on the team’s available testing resources and risk tolerance. Conducting a load test becomes easier if the test is fully automated and a provider hosts the test environment.

Documenting the results of each load test will help the team compare them over time and determine the impact of infrastructure and application architecture decisions.

Multiple’s automated load testing platforms document results as reports (source)

Beyond metrics and graphs, test engineers should also document the configuration parameters of each test (e.g., the number and types of simulated users or the code version) and keep notes for future reference, especially if team members may change over time or weeks may pass between tests. Just as DevOps teams document retrospectives (i.e., post-mortems) after each incident, test notes should include commentary that goes beyond the raw results, so the team can recall the areas of concern and improvement when it reruns or expands the same tests.
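
One lightweight way to keep such records is a small structured document per test run. The field names and all values below are illustrative placeholders, not results from a real test:

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LoadTestRecord:
    run_date: str
    code_version: str
    test_type: str
    simulated_users: dict[str, int]  # persona name -> number of users
    duration_minutes: int
    p95_response_ms: float
    error_rate: float
    notes: str                       # commentary beyond the raw numbers


def to_json(record: LoadTestRecord) -> str:
    """Serialize a record so it can be stored next to graphs and reports."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def p95_change(previous: LoadTestRecord, current: LoadTestRecord) -> float:
    """Relative change in p95 response time between two runs (0.10 = 10% slower)."""
    return (current.p95_response_ms - previous.p95_response_ms) / previous.p95_response_ms


# Placeholder records for two hypothetical runs.
previous = LoadTestRecord("2023-07-01", "v1.4.0", "anticipated load",
                          {"browser": 60, "buyer": 40}, 30, 400.0, 0.002,
                          "Baseline before the database upgrade.")
current = LoadTestRecord("2023-08-01", "v1.5.0", "anticipated load",
                         {"browser": 60, "buyer": 40}, 30, 440.0, 0.002,
                         "Checkout slowed after the upgrade; inspect slow queries.")
print(to_json(current))
print(f"p95 changed by {p95_change(previous, current):+.0%}")
```

Keeping the configuration and the notes in the same record as the metrics makes runs comparable even when different people execute them months apart.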
