A Service Level Agreement (SLA) draws on knowledge of enterprise capacity demands, peak periods, and standard usage baselines to form an enforceable, measurable outsourcing agreement between vendor and client. An effective SLA therefore reflects goals for greater performance and capacity, productivity, flexibility, availability, and standardization.
The SLA should set the stage for meeting or surpassing business and technology service levels while identifying any current gaps in achieving those levels.
SLAs capture the business objectives and define how success will be measured, and are ideally structured to evolve with the customer’s foreseeable needs. The right approach to an SLA results in agreements distinguished by clear, simple language, a tight focus on business objectives, and attention to the dynamic nature of the business so that evolving needs will be met.
1. Both the Client and Vendor Must Structure the SLA
Structuring an SLA is an important, multistep process involving both the client and the vendor. To meet business objectives, SLA best practices dictate that the vendor and client collaborate on a detailed assessment of the client’s existing applications suite, new IT initiatives, internal processes, and currently delivered baseline service levels.
2. Analyze Technical Goals & Constraints
The best way to start analyzing technical goals and constraints is to brainstorm or research the technical goals and requirements. Technical goals include availability levels, throughput, jitter, delay, response time, scalability requirements, new feature introductions, new application introductions, security, manageability, and even cost. Then prioritize the goals, or lower expectations to a level that still meets business requirements.
For example, you might set an availability target of 99.999%, or roughly 5 minutes of downtime per year. There are numerous constraints to achieving this goal, such as single points of failure in hardware, mean time to repair (MTTR) for broken hardware in remote locations, carrier reliability, proactive fault detection capabilities, high change rates, and current network capacity limitations. As a result, you may adjust the goal to a more achievable level.
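The relationship between an availability percentage and yearly downtime can be worked out directly. The following is a minimal sketch, assuming an average year of 365.25 days; the function name is illustrative and not part of any standard tool.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# Compare a few common targets to see how quickly the downtime allowance shrinks.
for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes/year")
```

Running the comparison for several targets makes it easier to judge which goal is realistic given the constraints listed above.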
3. Determine the Availability Budget
An availability budget is the expected theoretical availability of the network between two defined points. Accurate theoretical information is useful in several ways, including:
- The organization can use this as a goal for internal availability, so that deviations can be quickly identified and remedied.
- Network planners can use the information to determine the availability of the system and help ensure the design will meet business requirements.
Factors that contribute to non-availability or outage time include hardware failure, software failure, power and environmental issues, link or carrier failure, network design, human error, or lack of process. You should closely evaluate each of these factors when determining the overall availability budget for the network.
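One common simplification for estimating a theoretical availability budget is to treat each element on the path as failing independently, multiply availabilities for elements in series, and combine redundant elements in parallel. The sketch below assumes that model; the component names and the MTBF, MTTR, and carrier figures are hypothetical placeholders, and it covers only hardware and link failure, not software, power, human error, or process factors.

```python
from math import prod


def component_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(availabilities) -> float:
    """Every element must be up: multiply the individual availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities) -> float:
    """Redundant elements: the group is down only if all members are down."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical path between two defined points (all figures are placeholders).
router = component_availability(mtbf_hours=150_000, mttr_hours=4)
path = {
    "access_switch": component_availability(mtbf_hours=200_000, mttr_hours=4),
    "wan_link": 0.9995,  # assumed carrier commitment
    "core_router_pair": parallel_availability([router, router]),
}
budget = series_availability(path.values())
print(f"theoretical availability budget: {budget * 100:.4f}%")
```

In a path like this the weakest series element, here the assumed carrier link, dominates the budget, which is why each contributing factor deserves separate scrutiny.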
4. Application Profiles
Application profiles help the networking organization understand and define network service level requirements for individual applications. This helps to ensure that the network supports individual application requirements and network services overall.
Business applications may include e-mail, file transfer, Web browsing, medical imaging, or manufacturing. System applications may include software distribution, user authentication, network backup, and network management.
The goal of the application profile is to understand business requirements for the application, business criticality, and network requirements such as bandwidth, delay, and jitter. In addition, the networking organization should understand the impact of network downtime.
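An application profile can be recorded as a small structured record so that requirements are easy to compare across applications. This is only a sketch: the field names, the criticality scale, and every value shown are illustrative assumptions, not measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplicationProfile:
    """Network requirements for one application (all field names are illustrative)."""
    name: str
    criticality: int      # assumed scale: 1 = most critical to the business
    bandwidth_kbps: int   # sustained bandwidth the application needs
    max_delay_ms: int     # highest tolerable delay
    max_jitter_ms: int    # highest tolerable delay variation
    downtime_impact: str  # short description of what an outage costs


# Placeholder values for illustration only; real figures come from measurement.
PROFILES = [
    ApplicationProfile("e-mail", 3, 64, 1000, 500, "delayed communication"),
    ApplicationProfile("medical imaging", 1, 10_000, 200, 50, "clinical work stops"),
    ApplicationProfile("network backup", 2, 50_000, 2000, 1000, "missed backup window"),
]


def strictest_requirements(profiles):
    """Return the tightest delay and jitter bounds and the summed bandwidth."""
    return {
        "max_delay_ms": min(p.max_delay_ms for p in profiles),
        "max_jitter_ms": min(p.max_jitter_ms for p in profiles),
        "total_bandwidth_kbps": sum(p.bandwidth_kbps for p in profiles),
    }


def by_criticality(profiles):
    """Order profiles so the most business-critical application comes first."""
    return sorted(profiles, key=lambda p: p.criticality)


print(strictest_requirements(PROFILES))
print([p.name for p in by_criticality(PROFILES)])
```

Summing bandwidth is a deliberately conservative choice; applications rarely peak at the same time, so a real design might apply a lower figure.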
5. Availability and Performance Standards
Availability and performance standards set the service expectations for the organization. These may be defined for different areas of the network or for specific applications. Performance may also be defined in terms of round-trip delay, jitter, maximum throughput, bandwidth commitments, and overall scalability. Beyond setting the service expectations, the organization should define each service standard carefully, so that the user and IT groups working with networking fully understand the standard and how it relates to their application or server administration requirements.
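To make a performance standard unambiguous, it helps to state exactly how each figure is computed. The sketch below assumes jitter is defined as the mean absolute difference between consecutive round-trip delay samples; other definitions exist (RFC 3550, for example, uses a smoothed estimator), so the definition chosen should be written into the standard. The function names and thresholds are hypothetical.

```python
from statistics import mean


def summarize_rtt(samples_ms):
    """Summarize round-trip delay samples (milliseconds) into delay and jitter figures."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples to compute jitter")
    # Assumed jitter definition: mean absolute change between consecutive samples.
    jitter = mean(abs(b - a) for a, b in zip(samples_ms, samples_ms[1:]))
    return {
        "avg_delay_ms": mean(samples_ms),
        "max_delay_ms": max(samples_ms),
        "jitter_ms": jitter,
    }


def meets_standard(summary, max_avg_delay_ms, max_jitter_ms):
    """Check a summary against a standard expressed as delay and jitter ceilings."""
    return (summary["avg_delay_ms"] <= max_avg_delay_ms
            and summary["jitter_ms"] <= max_jitter_ms)


summary = summarize_rtt([40, 42, 41, 45])
print(summary, meets_standard(summary, max_avg_delay_ms=50, max_jitter_ms=3))
```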
6. Metrics and Monitoring
Service level definitions by themselves are worthless unless the organization collects metrics and monitors success. Measuring the service level shows whether the organization is meeting its objectives and helps identify the root cause of availability or performance issues.
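As a rough illustration of measuring against a target, the sketch below computes availability for a reporting period from a list of outage start and end times. It assumes outages do not overlap one another, and the dates, the 45-minute outage, and the 99.9% target are hypothetical.

```python
from datetime import datetime, timedelta


def measured_availability(period_start, period_end, outages):
    """Return availability (percent) over a period, given (start, end) outage pairs.

    Assumes outages do not overlap one another; overlapping entries would be
    counted twice.
    """
    total_seconds = (period_end - period_start).total_seconds()
    if total_seconds <= 0:
        raise ValueError("period_end must be after period_start")
    down_seconds = 0.0
    for start, end in outages:
        # Clip each outage to the reporting period before counting it.
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down_seconds += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (1 - down_seconds / total_seconds)


def meets_target(availability_percent, target_percent):
    """True when the measured availability reaches the agreed target."""
    return availability_percent >= target_percent


# Hypothetical 30-day reporting period with a single 45-minute outage.
start = datetime(2024, 1, 1)
end = start + timedelta(days=30)
outages = [(datetime(2024, 1, 10, 2, 0), datetime(2024, 1, 10, 2, 45))]
availability = measured_availability(start, end, outages)
print(f"measured {availability:.3f}%, meets 99.9% target: {meets_target(availability, 99.9)}")
```

Keeping the raw outage records, rather than only the final percentage, is what makes later root-cause analysis possible.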
7. Customer Business Needs and Goals
Try to understand the cost of downtime for the customer’s service, and estimate it in terms of lost productivity, revenue, and customer goodwill. The SLA developer should also understand the business goals and growth of the organization in order to accommodate network upgrades, workload, and budgeting.
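A first-pass downtime estimate can be sketched as lost revenue plus lost productivity over the outage. The formula, parameter names, and figures below are assumptions for illustration; customer goodwill is hard to quantify, so it is passed in as an optional estimate rather than calculated.

```python
def downtime_cost(hours_down, revenue_per_hour, employees_affected,
                  hourly_cost_per_employee, productivity_loss_fraction,
                  goodwill_estimate=0.0):
    """Estimate the cost of an outage (assumed model, not an industry standard).

    productivity_loss_fraction is the share of each affected employee's work
    that stops during the outage, between 0 and 1.
    """
    if not 0 <= productivity_loss_fraction <= 1:
        raise ValueError("productivity_loss_fraction must be between 0 and 1")
    lost_revenue = hours_down * revenue_per_hour
    lost_productivity = (hours_down * employees_affected
                         * hourly_cost_per_employee * productivity_loss_fraction)
    return lost_revenue + lost_productivity + goodwill_estimate


# Hypothetical two-hour outage affecting ten employees at half productivity.
cost = downtime_cost(hours_down=2, revenue_per_hour=1000, employees_affected=10,
                     hourly_cost_per_employee=50, productivity_loss_fraction=0.5)
print(f"estimated cost: {cost:.2f}")
```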
8. Performance Indicator Metrics
Metrics are simply tools that allow network managers to manage service level consistency and to make improvements according to business requirements. Unfortunately, many organizations do not collect availability, performance, and other metrics. They attribute this to the difficulty of achieving complete accuracy, to cost, to network overhead, and to limited resources. These factors can affect the ability to measure service levels, but the organization should stay focused on the overall goal of managing and improving service levels.
In summary, service level management allows an organization to move from a reactive support model to a proactive support model where network availability and performance levels are determined by business requirements, not by the latest set of problems. The process helps create an environment of continuous service level improvement and increased business competitiveness.