Application Programming Interfaces (APIs) are the de facto standard that software developers use today to enable communication between otherwise incompatible software components from different platforms. This capability has fueled a larger API economy, and API performance has a direct impact on the overall performance of your applications, affecting your end users’ experience and overall opinion of your product. To keep APIs performant, businesses must make room for an API monitoring phase in their API lifecycle. Consistent monitoring helps keep uptime high and outage rates low across all applications and services. In this guide, we explore industry best practices for API monitoring, including which metrics to prioritize and how to debug when problems arise.
An API is a set of functions or procedures that governs the access points for a given system, service, or application. Currently, the two main competing approaches to building web service APIs are Simple Object Access Protocol (SOAP) and Representational State Transfer (REST). GraphQL has become a popular alternative to REST in the past few years for specific use cases. There are also a few lesser-used alternatives.
SOAP APIs exchange XML messages, most commonly over HTTP. Sending a SOAP request is like using an envelope to send a message. SOAP APIs carry extra overhead, consume more bandwidth, and require more work on both the client and server ends. That said, like a sealed envelope, SOAP offers more stringent security than REST.
If SOAP is like an envelope, REST is more like a lightweight postcard. REST APIs are considered the gold standard for scalability and are highly compatible with micro-service architecture.
REST APIs are high-performing (especially over HTTP/2), time-tested, and support many data formats. REST APIs also decouple the client and server, allowing each to evolve independently. However, building a true REST API is difficult because it requires disciplined adherence to the Uniform Interface constraint. Many organizations trade off the long-term benefits of a truly RESTful API for HTTP APIs that have similar benefits but adhere to REST constraints more liberally.
Cloud providers like AWS offer services such as API Gateway that help other vendors create REST API endpoints.
As mentioned earlier, HTTP APIs can be very similar to REST APIs. HTTP APIs are ranked by the Richardson Maturity Model for how compliant they are with REST constraints. In reality, most APIs fall somewhere in this category.
GraphQL APIs are contract-driven and come with introspection out-of-the-box. Building an API with GraphQL is very easy in comparison to true REST APIs, which require extensive knowledge of HTTP to build intelligently.
The downside of GraphQL APIs, however, is that they do not scale well and require tight coupling between the client and server. GraphQL queries become more expensive to parse, plan, and execute as they grow, and they lack certain concepts native to HTTP, such as content and language negotiation.
API monitoring is the process of continuously checking for both the availability of your endpoints and the validity of their data exchanges. While monitoring your APIs, you also gain visibility into how your APIs operate in terms of performance (e.g., time to respond to a request made from various locations or to queries of increasing complexity).
You monitor APIs to detect a failed or slow application transaction before your end-users report the problem.
Nowadays, web applications rely on APIs that form abstraction layers between the micro-services that make up your application, as well as third-party services to embed additional functionality that will enrich user experience. This architecture leads to dependency on complex multi-step transactions and third-party API integrations.
For example, if a third-party search widget on your e-commerce site fails, your customers will be unable to browse through your store. If the APIs connecting to the payment gateways fail, you lose both customers and revenue. Therefore, monitoring and controlling your API is crucial to ensure success in every step of the transaction in your application.
API monitoring helps measure the reliability of transactions. Software that lets you monitor APIs can detect and alert on errors, warnings, and failed API authentications to ensure secure data exchange.
Almost all SaaS vendors provide their customers with APIs that can be leveraged to manage configuration, data, and outputs. The schema versions of these API endpoints are updated over time as the SaaS platforms evolve. Therefore, the APIs should be tested on a regular basis to ensure that your application code doesn’t fail when a new schema version is released.
As mentioned earlier, third-party API integrations eliminate the need for duplicate data entry by fetching information that is externally managed. However, bugs that cause timeouts, latency in API calls, errors, and downtime in endpoints that depend on third-party API integrations can degrade your internal APIs’ overall performance. Luckily, these problems can easily be identified with API monitoring.
API monitoring allows API performance evaluation from multiple perspectives (e.g., DevOps, QA, development). A DevOps perspective might focus on how performance scales under the load of one query or many, while a QA perspective might examine the literal data exchanged to validate its structure and expected results. In this way, API monitoring can inform many initiatives across the organization, making it an efficient optimization tool.
API availability or uptime is a gold standard in API monitoring. At the same time, monitoring availability alone is not enough for API transactions involving data exchange. In other words, you have to exercise the Create, Read, Update, and Delete (CRUD) operations against every application resource exposed via the API to ensure that each one is operational.
Using synthetic monitoring tools with multi-step API monitors is one way to verify data reliability alongside availability. Just remember that synthetic monitoring uses only a predefined set of API calls, so real-world traffic can differ from the inputs used in synthetic monitoring.
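To make the idea of a multi-step monitor concrete, here is a minimal sketch of a CRUD check. Everything in it is an assumption made for illustration: the `send` callable, the `/items` path, the `name` field, and the expected status codes (201, 200, 204, 404) are hypothetical conventions, and `make_fake_api` is an in-memory stand-in so the example runs without a network. A real monitor would replace it with calls to your own HTTP client.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical transport: (method, path, body) -> (status_code, parsed_json_body)
Send = Callable[[str, str, Optional[dict]], Tuple[int, Any]]


def _field(body: Any, key: str) -> Any:
    """Read a field from a response body, tolerating non-dict bodies."""
    return body.get(key) if isinstance(body, dict) else None


def run_crud_check(send: Send, collection: str = "/items") -> List[str]:
    """Run create, read, update, delete in order and return any failures."""
    status, body = send("POST", collection, {"name": "probe"})
    if status != 201 or _field(body, "id") is None:
        # Later steps need the new resource id, so stop here.
        return [f"create failed with status {status}"]
    path = f"{collection}/{body['id']}"

    failures: List[str] = []
    status, body = send("GET", path, None)
    if status != 200 or _field(body, "name") != "probe":
        failures.append(f"read failed with status {status}")

    status, body = send("PUT", path, {"name": "probe-updated"})
    if status != 200 or _field(body, "name") != "probe-updated":
        failures.append(f"update failed with status {status}")

    status, _ = send("DELETE", path, None)
    if status != 204:
        failures.append(f"delete failed with status {status}")

    status, _ = send("GET", path, None)
    if status != 404:
        failures.append(f"resource still readable after delete (status {status})")
    return failures


def make_fake_api() -> Send:
    """In-memory stand-in for a real API, used only for this demonstration."""
    store: Dict[str, dict] = {}
    counter = {"next_id": 0}

    def send(method: str, path: str, body: Optional[dict]) -> Tuple[int, Any]:
        parts = path.strip("/").split("/")
        if method == "POST" and len(parts) == 1:
            counter["next_id"] += 1
            item = {"id": str(counter["next_id"]), **(body or {})}
            store[item["id"]] = item
            return 201, item
        item_id = parts[-1]
        if item_id not in store:
            return 404, None
        if method == "GET":
            return 200, store[item_id]
        if method == "PUT":
            store[item_id] = {"id": item_id, **(body or {})}
            return 200, store[item_id]
        if method == "DELETE":
            del store[item_id]
            return 204, None
        return 405, None

    return send


if __name__ == "__main__":
    print(run_crud_check(make_fake_api()))
```

The check returns a list of failures rather than raising on the first problem, so a single run can report every broken step except those that depend on a failed create.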
Other internal and/or external APIs may depend on the input or output of your own application’s APIs. Even if you have implemented an API monitoring strategy, the owners of those other APIs may not have one. Therefore, you should also monitor the behavior of the third-party APIs on which your application depends.
CI/CD and the DevOps movement encourage continuous, automated testing. You can define a clear API monitoring strategy for every stage of the CI/CD pipeline, along with routine monitoring at regular time intervals. This cycle helps you catch API performance problems at every stage of your code release process.
Tools that only visualize metrics require a person to watch them constantly in order to notice when something goes wrong and to start debugging API errors. Thus, strong alerting capability should be a priority when selecting an API monitoring tool.
API availability or uptime is a percentage measurement that is often represented as 99.9% or 99.99%. Sometimes the same figure is expressed as the average downtime per year.
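The relationship between an availability percentage and downtime per year is simple arithmetic. The sketch below shows it in both directions. The function names are ours, and it assumes a 365-day year and that availability is measured as the share of successful monitoring checks.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 365) -> float:
    """Downtime budget, in minutes, implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def measured_availability_percent(successful_checks: int, total_checks: int) -> float:
    """Availability as the percentage of monitoring checks that succeeded."""
    if total_checks <= 0 or not 0 <= successful_checks <= total_checks:
        raise ValueError("need 0 <= successful_checks <= total_checks and total_checks > 0")
    return 100 * successful_checks / total_checks


if __name__ == "__main__":
    for target in (99.9, 99.99):
        print(target, round(allowed_downtime_minutes(target), 2))
```

Under these assumptions, 99.9% allows about 525.6 minutes (roughly 8.76 hours) of downtime per year, while 99.99% allows about 52.6 minutes.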
High CPU and memory usage on the API host server are signs of an overloaded virtual machine, container, or API gateway node, which slows your API’s performance.
You can measure CPU usage across the cluster that hosts your application’s API code, as well as the number of processes waiting to run, which is also known as CPU load or run-queue size. Memory can be measured simply as the percentage of available memory that is in use.
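As a rough illustration of these two host metrics, the sketch below normalizes CPU load by core count and computes memory usage as a percentage. The function names are ours. `os.getloadavg()` is only available on Unix-like systems, so the example guards for its absence, and reading real memory totals is left to your platform or monitoring agent.

```python
import os
from typing import Optional


def load_per_core(run_queue_load: float, cpu_count: int) -> float:
    """Normalize CPU load (run-queue size) by the number of cores."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return run_queue_load / cpu_count


def memory_used_percent(total_bytes: int, available_bytes: int) -> float:
    """Memory in use as a percentage of total memory."""
    if total_bytes <= 0 or not 0 <= available_bytes <= total_bytes:
        raise ValueError("need 0 <= available_bytes <= total_bytes and total_bytes > 0")
    return 100 * (total_bytes - available_bytes) / total_bytes


def current_load_per_core() -> Optional[float]:
    """1-minute load average per core, or None if the OS does not expose it."""
    if not hasattr(os, "getloadavg"):
        return None  # e.g. on Windows
    try:
        one_minute, _, _ = os.getloadavg()
    except OSError:
        return None
    return load_per_core(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    print(current_load_per_core())
    print(memory_used_percent(total_bytes=16 * 2**30, available_bytes=4 * 2**30))
```

A load per core that stays above 1.0 suggests more runnable processes than cores, which is the overload condition described above.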
API consumption is measured as requests per minute, requests per second, or queries per second. To lower API consumption, you can batch multiple API calls into a single call with a flexible pagination scheme.
Note that synthetic monitoring isn’t meant to measure the consumption rate, since it emulates individual transactions instead of monitoring the aggregated volume of transactions. The telemetry instrumentation to measure and report the consumption rate is typically engineered into the API’s design at the onset or monitored with an Application Performance Monitoring (APM) tool.
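For illustration, a consumption rate can be derived from request timestamps collected by your own instrumentation. This minimal sketch assumes the timestamps are epoch seconds taken from an access log or telemetry feed. The function names are ours, not part of any APM product.

```python
from collections import Counter
from typing import Dict, Iterable


def requests_per_minute(timestamps: Iterable[float]) -> Dict[int, int]:
    """Count requests per minute bucket, keyed by epoch_seconds // 60."""
    return dict(Counter(int(ts // 60) for ts in timestamps))


def peak_requests_per_minute(timestamps: Iterable[float]) -> int:
    """Highest request count seen in any single minute (0 if no requests)."""
    return max(requests_per_minute(timestamps).values(), default=0)


if __name__ == "__main__":
    sample = [0, 10, 59, 60, 61, 130]  # three minutes of made-up traffic
    print(requests_per_minute(sample))
    print(peak_requests_per_minute(sample))
```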
Response time is a tricky metric to measure with third-party APIs because the recorded latency may combine the delay of slow endpoints with the delay of the network itself. The best approach to monitoring latency is to use an API monitoring tool that reports the network latency and the API response time separately.
The size of the payload (the JSON file posted to or retrieved from the API) has a large impact on the latency. This is why synthetic API monitoring should be performed with both small and large payloads.
Error rates (like errors per minute and error codes) give you granular details for tracking down problems in individual APIs. For example, status codes in the 4xx range point to a problem with the request, while codes in the 5xx range point to a problem with the API or the web service provider.
However, a faulty API that was not designed to use the correct HTTP status codes can also respond to a failed request with a 200 OK status. Synthetic monitoring tools can compare the result of a test with an expected value to confirm the accuracy of the API response, beyond the status code.
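Both ideas can be sketched in a few lines: an error rate computed from observed status codes, and a correctness check that looks past the status code at the response body. The function names and the `status` field in the sample body are hypothetical, and the body is assumed to be a JSON object.

```python
import json
from typing import Any, Dict, Iterable


def error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses whose status code is 400 or above."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    return sum(1 for code in codes if code >= 400) / len(codes)


def response_is_correct(status: int, body_text: str, expected: Dict[str, Any]) -> bool:
    """True only if the status is 2xx AND every expected field matches the body."""
    if not 200 <= status < 300:
        return False
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(body, dict):
        return False
    return all(body.get(key) == value for key, value in expected.items())


if __name__ == "__main__":
    print(error_rate([200, 200, 404, 500]))
    # A "200 OK" that actually carries an error payload fails the check.
    print(response_is_correct(200, '{"status": "error"}', {"status": "ok"}))
```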
The unique API consumers metric provides insight into the overall growth and health of new customer acquisition, based on the monthly active user count. A sudden drop in consumers during peak operating hours is usually a symptom of an underlying application platform problem.
The first and easiest method for tracking down problems with APIs is to check the HTTP status code. A 400 Bad Request means the API request had invalid syntax, so you probably have to review it for typos.
A 401 Unauthorized response means the request had invalid or missing authentication credentials, which can often be resolved by supplying proper authentication such as an OAuth token. Other common mistakes include forgetting the space after the scheme prefix in the Authorization header, or omitting the required colon after a username when there is no password.
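The two mistakes mentioned above are easy to avoid when the header value is built in one place. This sketch uses helper names of our own choosing and shows the standard `Basic` and `Bearer` formats for the `Authorization` header.

```python
import base64


def basic_auth_header(username: str, password: str = "") -> str:
    """Build a Basic Authorization value; the colon is kept even with no password."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_auth_header(token: str) -> str:
    """Build a Bearer Authorization value; note the space after the prefix."""
    return f"Bearer {token}"


if __name__ == "__main__":
    # "alice" and the token below are placeholder values.
    print(basic_auth_header("alice"))
    print(bearer_auth_header("example-token"))
```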
In a scenario where the intent is to check only the API’s availability, it would be acceptable to "assert" (or accept) a 401 code. Even though "401 Unauthorized" was received, the response shows that the API was available.
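As a small sketch of this kind of status-code triage, the snippet below pairs a debugging hint with the codes discussed above and treats 401 as acceptable for an availability-only check. The hint wording and the default set of accepted codes are our assumptions, and you would adjust both to your own API.

```python
from typing import FrozenSet

# Hints for the status codes discussed in the text; extend as needed.
STATUS_HINTS = {
    400: "invalid request syntax: review the request for typos",
    401: "invalid or missing credentials: check the Authorization header",
}


def triage(status_code: int) -> str:
    """Return a first debugging hint for a status code."""
    return STATUS_HINTS.get(status_code, "no hint recorded for this status code")


def is_available(status_code: int, accepted: FrozenSet[int] = frozenset({200, 401})) -> bool:
    """Availability-only assertion: a 401 still proves the API answered."""
    return status_code in accepted


if __name__ == "__main__":
    print(triage(400))
    print(is_available(401), is_available(503))
```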
Checking the API response code and applying the corresponding debugging method can sometimes fail to resolve API errors. In those cases, check and compare HTTP headers for additional information. Some APIs accept requests that don’t contain Accept or Content-Type information. However, many require these headers to be specified.
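A quick way to rule out this class of problem is to check a request's headers before sending it. The sketch below uses a helper name of our own and compares names case-insensitively, since HTTP header names are not case-sensitive.

```python
from typing import Iterable, List, Mapping

REQUIRED_HEADERS = ("Accept", "Content-Type")


def missing_headers(headers: Mapping[str, str],
                    required: Iterable[str] = REQUIRED_HEADERS) -> List[str]:
    """Return the required header names that are absent from the request."""
    present = {name.lower() for name in headers}
    return [name for name in required if name.lower() not in present]


if __name__ == "__main__":
    request_headers = {"accept": "application/json"}
    print(missing_headers(request_headers))
```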
JSON schemas are used to document API endpoints. JSON parsing tools can be used to debug API endpoints. These tools let you create tests for API endpoints and validate syntax.
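To illustrate the kind of test such tools let you write, here is a deliberately simplified, hand-rolled check using only the standard library. It is not a JSON Schema validator. It only confirms that a response parses as a JSON object and that a few required fields have the expected Python types, and the field names in the example are hypothetical.

```python
import json
from typing import Dict, List


def validate_fields(document: str, required: Dict[str, type]) -> List[str]:
    """Return a list of problems found in a JSON response (empty if none)."""
    try:
        data = json.loads(document)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems: List[str] = []
    for field, expected_type in required.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            # Caveat: bool is a subclass of int in Python, so True passes an int check.
            problems.append(f"wrong type for {field}: expected {expected_type.__name__}")
    return problems


if __name__ == "__main__":
    expected = {"id": int, "name": str}
    print(validate_fields('{"id": 1, "name": "widget"}', expected))
    print(validate_fields('{"id": "1"}', expected))
```

For real endpoints documented with JSON Schema, a dedicated validator is the better choice. This sketch only shows the shape of the check.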
As discussed above, API monitoring can be integrated into the test automation process in your CI/CD pipeline. Therefore, the tool you select to monitor and control APIs should be able to integrate with your CI server (e.g., Jenkins or GitHub integration).
Some tools use third-party SaaS platforms that require you to open certain ports on your firewall to monitor internal APIs that are not publicly reachable. These open ports, in turn, introduce a security risk. That’s why it is so important to choose the right API monitoring solution, taking into consideration the type of API you want to monitor and control. Tools able to exercise your private APIs from inside your firewall are best suited to this use case.
API testing followed by API monitoring creates a comprehensive end-to-end API performance evaluation process for applications. That’s why it is to your advantage to use a tool that can provide both testing and monitoring functionality. With such a solution, your team has a 360-degree view of API quality and performance on a single screen.
API monitoring and testing are not daunting tasks when you pair the right techniques with the right tools. Accurate API monitoring and testing data will enable you to improve application performance, avoid production outages, and improve customer satisfaction.