A service-level objective (SLO) is the part of a service-level agreement (SLA) that documents the key performance indicators (KPIs) the customer should expect from a provider. In addition to specifying details about the service being purchased, an SLA also documents what the consequences will be if its SLOs are not achieved.
An SLO contains measurable metrics that are documented and shared between stakeholders in order to provide accountability, produce consistent quality of service (QoS) and demonstrate continued commitment. SLOs are based on service-level indicators (SLIs), which are the metrics chosen to be measured.
SLOs help business leaders define internal goals for maintaining a consistent service level, and they help avoid disputes over expectations between a service provider and an end user. Common examples of metrics that can be associated with SLOs are disaster recovery time, application availability, live communication response time, first call resolution rate and application maintenance.
How it works
SLOs are intended to define the range of acceptable performance, from the best expected level to the minimum tolerated one. As a step-by-step process, forming an SLO should begin with defining the key metrics, or SLIs.
For example, if the latency of a service is the SLI, the SLO may state what proportion of requests must complete within a given number of milliseconds and define the acceptable error rate. Further, the SLA would outline what the customer is entitled to if this objective is not met.
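To make the latency example concrete, the following is a minimal sketch of how an SLI measurement could be compared against an SLO. The names LatencySLO, sli_ratio and meets_slo, and the target of 99% of requests completing within 300 ms, are illustrative assumptions and are not taken from any particular agreement or tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencySLO:
    """Hypothetical objective: a share of requests must finish within a bound."""
    threshold_ms: float   # assumed latency bound per request
    target_ratio: float   # assumed fraction of requests that must meet the bound


def sli_ratio(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """SLI: fraction of observed requests that completed within the bound."""
    if not latencies_ms:
        raise ValueError("no measurements to evaluate")
    within = sum(1 for value in latencies_ms if value <= threshold_ms)
    return within / len(latencies_ms)


def meets_slo(latencies_ms: Sequence[float], slo: LatencySLO) -> bool:
    """Compare the measured SLI against the SLO target."""
    return sli_ratio(latencies_ms, slo.threshold_ms) >= slo.target_ratio


slo = LatencySLO(threshold_ms=300.0, target_ratio=0.99)
sample = [120.0, 95.0, 310.0, 180.0]       # made-up measurements
print(sli_ratio(sample, slo.threshold_ms))  # 0.75
print(meets_slo(sample, slo))               # False
```

Here the SLI is the measured ratio and the SLO is the target it is compared against; what happens when the check fails would be a matter for the SLA.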
Once a service provider and end user agree upon the metrics, the provider should begin to collect and monitor them (monitoring software such as Nagios or Datadog can be used for this). Alerts should be set to notify the service provider when a measured value no longer meets its SLO target. Reports can then be generated from the collected SLI measurements and analyzed against each SLO.
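The monitoring and alerting step can be sketched in a few lines. This is only an illustration under stated assumptions: the function check_windows, the window tuples and the 99.9% availability target are hypothetical, and the notify callback stands in for whatever paging or messaging mechanism a real monitoring product would provide. It does not use the Nagios or Datadog APIs.

```python
from typing import Callable, Iterable, List, Tuple

# Each window is (label, successful_requests, total_requests); assumed shape.
Window = Tuple[str, int, int]


def check_windows(
    windows: Iterable[Window],
    target: float,
    notify: Callable[[str], None],
) -> List[str]:
    """Call notify for every window whose availability is below the target.

    Returns the labels of the breaching windows so they can be reported on.
    """
    breaches: List[str] = []
    for label, ok, total in windows:
        if total == 0:
            continue  # no traffic in this window, nothing to evaluate
        availability = ok / total
        if availability < target:
            notify(
                f"{label}: availability {availability:.4f} "
                f"below target {target:.4f}"
            )
            breaches.append(label)
    return breaches


messages: List[str] = []
windows = [("09:00", 9995, 10000), ("09:05", 9950, 10000)]  # made-up counts
breached = check_windows(windows, target=0.999, notify=messages.append)
print(breached)   # ['09:05']
print(messages)   # ['09:05: availability 0.9950 below target 0.9990']
```

Returning the breaching windows as well as sending notifications keeps the same data available for the reports that follow.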
Examples of SLO measurements
Each objective corresponds to a single performance indicator. The SLOs chosen can vary depending on the importance of the service to the end user, the resources available and the budget. Commonly measured indicators include throughput, frequency, response time, latency, availability and completed support calls. SLOs can be measured based on the level of achievement, which can be shown as an average, rate or percentile.
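The three ways of expressing a level of achievement can give quite different pictures of the same data. The sketch below computes an average, a rate and a percentile over one set of made-up latency measurements. The helper names percentile and summarize and the 200 ms threshold are assumptions for illustration, and the percentile uses the simple nearest-rank method, which is only one of several common definitions.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("no measurements to evaluate")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float], threshold_ms: float) -> Dict[str, float]:
    """Express the same measurements as an average, a rate and a percentile."""
    within = sum(1 for value in latencies_ms if value <= threshold_ms)
    return {
        "average_ms": mean(latencies_ms),
        "rate_within_threshold": within / len(latencies_ms),
        "p95_ms": percentile(latencies_ms, 95),
    }


sample = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0, 900.0]
print(summarize(sample, threshold_ms=200.0))
# {'average_ms': 216.0, 'rate_within_threshold': 0.9, 'p95_ms': 900.0}
```

In this sample a single slow request pulls the average above the threshold even though 90% of requests fall within it, while the 95th percentile exposes the slow request directly. That difference is one reason the choice of representation matters when an objective is written down.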
There is no limit to how many SLOs can be included in each SLA, but they should be restricted to measurements that are relevant to the consumer. In addition, a service provider should avoid measuring internet-based clients and HTTP requests, as latency may vary widely depending on the internet connection.