Traditional corporate organizational units such as finance, accounting and HR believe that the new upstart Information Technology (IT) should put away its childish models and adopt a more formal, standardized and mature structure. Faced with a sagging economy—combined with mounting (and aging) capitalized infrastructure costs—people in the executive suites have started listening, considering what can be done to streamline and mainstream IT management infrastructures.
The Information Technology Infrastructure Library (ITIL) is, at its heart, a tightly integrated, two-chambered generic standard separated into Service Delivery and Service Support. Service Delivery is concerned with strategic objectives and long-term planning, while Service Support is focused on tactical day-to-day activities.
| Service Delivery | Service Management |
|---|---|
| Service Level Management | Incident Management |
| IT Security Management | Problem Management |
| Finance Management | Configuration Management |
| Capacity Management | Change Management |
| Availability Management | Release Management |
This two-part series discusses how the ITIL model can be applied specifically to an enterprise network operations (netops) team. Part one details how Service Delivery can be implemented, while part two deals with the application of Service Management.
Service Level Management is your network operations team’s strategic front-end to internal customers, such as individual business groups. There is generally a Service Level Manager (SLM) who engages these customers and negotiates a support contract, called the Service Level Agreement, or SLA.
Network Operations SLM
SLAs for a netops team will generally focus on the provision of network functions, similar to a utility. A standard SLA covers areas such as security, finance, capacity plans, availability and continuity plans. Each of these areas has its own manager who works with the SLM to provide input and direction for the SLA.
Along with negotiating the SLA with customers, the SLM is responsible for monitoring the SLA elements and reporting to customers when outages cause a breach in the SLA. This gives the customer a “single neck to choke” if there are issues. The SLM works with all the other service delivery managers to properly manage all elements of the SLA.
Tom is the netops SLM for his company. His morning starts with a face-to-face meeting with a representative from the finance department. The release management team implemented a switch upgrade at ten o’clock last night during the finance “red” window, in violation of the signed SLA. A billing report failed to run and Tom got a call in the middle of the night from finance asking what had happened. Now he has to convince finance that a process will be put into place keeping future changes from happening in the finance red window.
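The process Tom has to promise could start with an automated check that flags any change scheduled inside a customer's red window. The sketch below is illustrative only: the window boundaries (21:00 to 02:00) and the function name `in_red_window` are assumptions, since the actual hours of the finance red window are not given here.

```python
from datetime import time


def in_red_window(change_time: time, start: time, end: time) -> bool:
    """Return True if change_time falls inside [start, end).

    Handles windows that cross midnight (start later than end).
    """
    if start <= end:
        return start <= change_time < end
    # Window wraps past midnight, e.g. 21:00 -> 02:00.
    return change_time >= start or change_time < end


# Hypothetical finance red window (assumed, not taken from a real SLA).
RED_START = time(21, 0)
RED_END = time(2, 0)

# The ten o'clock switch upgrade from Tom's story.
switch_upgrade = time(22, 0)

if in_red_window(switch_upgrade, RED_START, RED_END):
    print("Change rejected: scheduled inside the finance red window")
else:
    print("Change allowed")
```

A change-approval step could run a check like this against every customer's window before a change is placed on the calendar.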
IS Security Manager
IT Security Management is concerned with maintaining the uninterrupted operation of the network through controls, incident handling and auditing, along with providing input into SLA management. The role generally belongs to a single person who owns the overall security plan for the network.
Javier, the IS security manager, starts every day reading through security blogs and news sites searching for vulnerabilities. Cisco released a report saying a version of its IOS has a zero-day telnet exploit. Sadly, this is exactly the IOS in use at his company, and he has confirmed the vulnerability in his lab. His day will be spent writing up a report detailing the probability and impact of this vulnerability along with the cost of a work-around plan to remove it.
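One simple way to frame a report like Javier's is to compare the expected loss from the vulnerability (probability times impact) with the cost of the work-around. The sketch below is a deliberately minimal model with made-up figures; the probability, impact and work-around cost are assumptions for illustration, and a real assessment would weigh many more factors.

```python
def expected_loss(probability: float, impact_cost: float) -> float:
    """Expected loss: chance of exploitation times the cost if it happens."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * impact_cost


def workaround_is_justified(probability: float, impact_cost: float,
                            workaround_cost: float) -> bool:
    """True when the expected loss exceeds the cost of the work-around."""
    return expected_loss(probability, impact_cost) > workaround_cost


# Hypothetical figures (assumed for illustration only).
PROBABILITY = 0.30          # chance the exploit is used against the network
IMPACT_COST = 200_000.0     # estimated cost of a successful exploit
WORKAROUND_COST = 15_000.0  # estimated cost of the work-around plan

print(f"Expected loss: {expected_loss(PROBABILITY, IMPACT_COST):,.0f}")
print("Work-around justified:",
      workaround_is_justified(PROBABILITY, IMPACT_COST, WORKAROUND_COST))
```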
IT Finance Management
IT finance management is generally a new concept for network operations. It is responsible for identifying costs, proposing prices for services, providing financial data for negotiation, and accounting for the expenditures of netops services. While budgeting and accounting are required elements of the job, actual charging (policies, methodology, and billing) is considered optional.
Mark owns finance management for netops. He has been concerned about the depreciation of his company’s legacy network. With the sharp decline in router and switch prices, he has identified a point about six months out where it will be cheaper to deploy a new infrastructure and remove old and expensive gear from the books. Unfortunately, capital funds for the rest of the year are tied up in management’s pet projects. He has set up a meeting with his management team in the afternoon to look at options.
Availability Management is an area netops already has significant experience with: managing service quality concepts such as availability, measured through mean time between failures (MTBF), and restorability, measured through mean time to repair (MTTR). ITIL also engages the availability manager on resilience, or the ability of a piece of gear to survive a component failure, and maintainability, or the ability to keep a network device tuned for the environment.
Fred spends a lot of time looking at network management graphs and Excel sheets. Tom the SLM wants to report to all his customers on MTBF and MTTR for the last quarter. There were three separate power supplies that failed, but each device fortunately had a secondary, fully powered supply to support the chassis. Any of these three failures would have blown the SLA, but Fred’s insistence on redundancy kept the network up and running within SLA guidelines.
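The figures in a report like this are usually tied together by the standard relationship availability = MTBF / (MTBF + MTTR). The sketch below works through it for a single, non-redundant device. The quarter length, failure count and repair hours are hypothetical values chosen for illustration, not numbers from Fred's network.

```python
def mean_time_between_failures(uptime_hours: float, failures: int) -> float:
    """Average operating time between failures, in hours."""
    return uptime_hours / failures


def mean_time_to_repair(repair_hours: float, failures: int) -> float:
    """Average time to restore service after a failure, in hours."""
    return repair_hours / failures


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as a fraction between 0 and 1."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical quarter for one device with no redundancy (assumed values).
QUARTER_HOURS = 90 * 24   # 2160 hours in a 90-day quarter
FAILURES = 3
REPAIR_HOURS = 6.0        # total downtime across all failures

mtbf = mean_time_between_failures(QUARTER_HOURS - REPAIR_HOURS, FAILURES)  # 718.0
mttr = mean_time_to_repair(REPAIR_HOURS, FAILURES)                         # 2.0

print(f"MTBF: {mtbf:.1f} h, MTTR: {mttr:.1f} h, "
      f"availability: {availability(mtbf, mttr):.4%}")
```

With a redundant power supply, a failed component does not take the device down, so the service-level downtime in this calculation stays at zero even though the component-level failure still counts against MTBF.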
Capacity management is another concept most netops personnel are familiar with. With a forward-looking perspective, the capacity manager is constantly aware of how many resources are free now and how many will be free in the future. Capacity management is a tightrope walk between keeping enough capacity open for expected growth and not leaving resources under-subscribed and wasted.
Dawn also spends a good part of her day reviewing network management graphs for trending analysis. While network utilization has been relatively flat, the current finance red window means that evening batch reports are spiking utilization. Normally this isn’t a problem, but a new R&D lab has been running a heavy load into the datacenter network and if both systems kick off at the same time, there could be issues. She has a lunch meeting with Tom to lay out the issues so he can work with both groups to find a solution.
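Dawn's concern can be checked with simple arithmetic: add the peak loads that might coincide and compare the total against the link capacity and an alert threshold. The sketch below is a minimal illustration; the link speed, the two peak loads and the 80% threshold are all assumed values, not measurements from the article.

```python
def peak_utilization(loads_mbps: list[float], link_capacity_mbps: float) -> float:
    """Fraction of the link consumed if all loads peak at the same time."""
    return sum(loads_mbps) / link_capacity_mbps


# Hypothetical figures (assumed for illustration only).
LINK_CAPACITY_MBPS = 1000.0
BATCH_PEAK_MBPS = 450.0   # finance evening batch reports
LAB_PEAK_MBPS = 400.0     # new R&D lab traffic into the datacenter
ALERT_THRESHOLD = 0.80    # utilization level treated as "at risk"

combined = peak_utilization([BATCH_PEAK_MBPS, LAB_PEAK_MBPS], LINK_CAPACITY_MBPS)

if combined > ALERT_THRESHOLD:
    print(f"At risk: overlapping peaks would use {combined:.0%} of the link")
else:
    print(f"OK: overlapping peaks would use {combined:.0%} of the link")
```

Either load alone stays under the threshold in this example, which is why the problem only appears when both systems kick off together.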
Continuity management is concerned with two key areas: mitigation of risk to the network environment, and contingency planning for all foreseeable disasters. The continuity manager owns all network recovery plans including a business impact analysis, risk assessment, and continuity strategy. When a disaster strikes, the continuity manager is responsible for enacting the recovery plan. Between disasters, the continuity manager is also responsible for running simulations and tests to confirm the plan will meet the requirements laid out in all of the SLAs.
Gerard is always thinking about disaster. Business continuity planning for tornadoes, earthquakes, disgruntled employees and power outages takes the majority of his day. Upper management has given him a list of systems that cannot be off-line for longer than twenty-four hours, no matter what happens. He is having lunch today with a vendor who promises to offer a “Hot Start” off-site datacenter at a reasonable cost. He is hoping Mark from IT finance can also attend.
The six functions of service delivery rely on each other to be successful, each playing a supporting role in the Service Level Agreement. But this is just half of the picture; our next article will diagram service management and review how each of those elements handles day-to-day support of the network environment.