A Service Level Agreement (SLA) is a formal contract between a service provider and a customer that defines the expected level of service. It outlines key details such as performance standards, availability, responsibilities, and response or resolution times. SLAs usually also specify the penalties and other consequences a provider faces if an SLA is not met, as well as the terms for cancellation.
SLAs are common in IT services, cloud computing, and customer support, ensuring accountability and transparency. They help align expectations, provide measurable benchmarks, and serve as a reference point for addressing issues or disputes when service levels are not met.
Customer-level SLA
Customer-level SLAs cover the agreement between a service provider and a customer. Customers can be internal or external. A customer-level SLA may cover multiple services, products, and systems.
Service-level SLA
A service-level SLA is the contract that covers an identical service offered to multiple customers. These are particularly common for public cloud services.
Multi-level SLA
A multi-level SLA is a contract split into different levels to incorporate more than two parties, or different levels of service, into the same agreement. These types of agreements are common when a provider offers a tiered product or service in different SKUs or pricing plans; many SaaS products define multi-level SLAs.
Operational SLA
An operational SLA is usually an internal agreement within an organization that defines the performance standards and responsibilities between different teams or departments. Unlike customer-facing SLAs, which outline commitments to external clients, an operational SLA focuses on the internal processes that support the delivery of services. For example, an IT operations team might commit to resolving server issues within four hours or to ensuring that system backups complete daily. Operational SLAs help improve accountability, streamline workflows, and ensure internal efficiency. They are usually built around their own Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
An SLA (Service Level Agreement) is a formal contract between a service provider and a customer. An SLA usually contains a definition of the minimum level of service promised (e.g., 99.9% uptime per month).
An SLO (Service Level Objective) is an internal goal or target that supports the SLA; it is usually more detailed and often stricter than the SLA. For example, the SLA may promise 99.9% uptime while the SLO targets 99.95% to provide a safety margin.
SLIs (Service Level Indicators) are the actual measurements/metrics used to determine if you’re meeting the SLO or SLA. For example: the percentage of successful HTTP requests, average response time, or system availability.
An Example (SLA vs SLO vs SLI)
Here, the system met both the SLO and SLA.
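The relationship between the three terms can also be sketched in a few lines of Python. This is a minimal illustration, not a monitoring implementation: the request counts are hypothetical, the names `availability_sli` and `evaluate` are invented for this sketch, and the 99.9% and 99.95% targets are the example figures used in the definitions above.

```python
SLA_TARGET = 99.9   # externally promised uptime, in percent (example figure)
SLO_TARGET = 99.95  # stricter internal target, in percent (example figure)


def availability_sli(successful_requests: int, total_requests: int) -> float:
    """Return the SLI: the percentage of requests that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * successful_requests / total_requests


def evaluate(sli: float, slo: float = SLO_TARGET, sla: float = SLA_TARGET) -> dict:
    """Compare a measured SLI against the SLO and SLA targets."""
    return {"sli": sli, "slo_met": sli >= slo, "sla_met": sli >= sla}


# Hypothetical measurement: 999,700 successful requests out of 1,000,000.
result = evaluate(availability_sli(999_700, 1_000_000))
print(result)
```

With these assumed counts the SLI is roughly 99.97%, which clears both targets. An SLI of 99.92% would still satisfy the SLA but miss the stricter SLO, which is exactly the early warning the safety margin is meant to provide.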
A Service Level Objective (SLO) is an internal reliability target that defines the expected level of service for a system or application. To make it effective, an SLO typically includes several key components:
Together these elements can clearly define a quantifiable SLO such as:
SLOs inherently define a budget of allowable downtime (often called an error budget) within which any issues must be detected and repaired. For example, an SLO of 99.99% uptime over 30 days means you need to measure the downtime your service experiences over that period and ensure it stays below 4.32 minutes. Tools are available to convert an uptime percentage into the corresponding length of time, for example uptime.is or slatools.com.
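The arithmetic behind that figure is simple enough to reproduce directly. The following is a minimal sketch using only the Python standard library; the function names `allowed_downtime` and `remaining_budget` are hypothetical and chosen for this example, as is the 3 minutes of observed downtime.

```python
from datetime import timedelta


def allowed_downtime(uptime_percent: float, window: timedelta) -> timedelta:
    """Downtime permitted by an uptime target over the given window."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return window * (1.0 - uptime_percent / 100.0)


def remaining_budget(uptime_percent: float, window: timedelta,
                     observed_downtime: timedelta) -> timedelta:
    """Budget left after observed downtime; negative means the SLO is missed."""
    return allowed_downtime(uptime_percent, window) - observed_downtime


# 99.99% uptime over a 30-day window, as in the example above.
budget = allowed_downtime(99.99, timedelta(days=30))
print(f"Allowed downtime: {budget.total_seconds() / 60:.2f} minutes")

# Hypothetical: 3 minutes of downtime already observed in this window.
left = remaining_budget(99.99, timedelta(days=30), timedelta(minutes=3))
print(f"Remaining budget: {left.total_seconds():.0f} seconds")
```

Note that the result depends on the window length: a 30-day window gives a slightly smaller budget than a 31-day or average-length month, so it is worth stating the window explicitly in any SLO.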
Service Level Agreements (SLAs) provide clear value for both service providers and customers by defining expectations and measurable outcomes. The main benefits include:
A number of common elements are usually included within Service Level Agreements to define the scope, expectations, and accountability between a service provider and a customer. The most common elements include:
Organizations very commonly procure multiple services from a vendor and use them in series or in parallel, with each service having its own SLA uptime and commitments. This is typical when purchasing multiple cloud services, such as from AWS. You may want to explore how to combine individual SLAs into a composite SLA; information covering this is included in these third-party articles:
Composite SLAs – AWS Example
Imagine you have an application that uses:
If all three of these services are required and must be up for the application to function correctly, you would multiply their individual SLAs to get your composite SLA:
Composite SLA = 0.9995 * 0.9999 * 0.9995 = 0.998900349975, or 99.89%
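For comparison, 99.95% corresponds to about 21 min 55 s of downtime per month, whereas the composite 99.89% corresponds to about 48 min 13 s per month (both figures are based on an average-length month of about 30.4 days).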
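The composite calculation can be reproduced with a short Python sketch. The helper names below are hypothetical, and the default of 30.44 days per month is an assumption chosen to approximate an average calendar month. The parallel case is included for completeness; it assumes the redundant services fail independently, which real deployments may not satisfy.

```python
from math import prod


def composite_serial(availabilities):
    """All services are required: multiply the individual availabilities."""
    return prod(availabilities)


def composite_parallel(availabilities):
    """Any one service suffices (redundancy), assuming independent failures."""
    return 1 - prod(1 - a for a in availabilities)


def monthly_downtime_minutes(availability, days_per_month=30.44):
    """Expected downtime per month, in minutes (average month is an assumption)."""
    return (1 - availability) * days_per_month * 24 * 60


def format_downtime(minutes):
    """Render a duration in minutes as e.g. '21min 55s'."""
    m, s = divmod(round(minutes * 60), 60)
    return f"{m}min {s}s"


# The three availabilities from the example above, used in series.
composite = composite_serial([0.9995, 0.9999, 0.9995])
print(f"Composite SLA: {composite:.6f} ({composite * 100:.2f}%)")
print("Single 99.95% service:", format_downtime(monthly_downtime_minutes(0.9995)))
print("Composite at 99.89%:", format_downtime(monthly_downtime_minutes(0.9989)))
```

The key point is that a serial chain is always less available than its weakest member, so adding a required dependency lowers the uptime you can realistically promise.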