Back to Basics: Five Steps to Handling Any Network Crisis
The network's down, and it's your job to get it back up. Use our foolproof guide to resolve any network crisis.
Argh! One of the area managers called: his entire staff is without network access. Email's down, the interface to a critical sales application doesn’t work, and the company's losing money by the minute. What do you do?
Hopefully, you won’t encounter enterprise IT disasters often, but when they do strike, you need to act quickly and calmly to resolve the problem. Here’s our foolproof five-step guide to handling any network crisis.
Step 1: Diagnose the Problem
What happened? Get your subject matter experts together, preferably in the same room, and uncover exactly what went wrong. Who first identified the problem? What measures, if any, did they take to fix it? Walk through the situation as a team rather than investigating alone. Ask questions. Get log files. Look at performance reports and error messages. Use all your diagnostic tools. The root cause might be difficult to pin down, but working through the evidence together will get you there.
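If the team is wading through a pile of log files, a short script can show which devices are reporting errors and when each one first complained. The following is only a minimal sketch: the log line layout (date, time, level, device name, message) and the device names are assumptions made for illustration, so adjust the pattern to whatever your own equipment actually writes.

```python
import re
from collections import Counter

# Assumed (hypothetical) log layout:
# "2024-05-01 09:14:02 ERROR core-sw1 link down on port 12"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>[A-Z]+) (?P<device>\S+) (?P<message>.*)$"
)


def summarize_errors(lines, levels=("ERROR", "CRITICAL")):
    """Count problem entries per device and note when each device first reported one."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        # Skip lines that do not fit the assumed layout or are not severe enough.
        if not match or match["level"] not in levels:
            continue
        device = match["device"]
        counts[device] += 1
        # Lines are assumed to be in time order, so the first hit is the earliest.
        first_seen.setdefault(device, match["timestamp"])
    return counts, first_seen


# Made-up sample lines, purely for illustration.
sample = [
    "2024-05-01 09:14:02 ERROR core-sw1 link down on port 12",
    "2024-05-01 09:14:05 INFO edge-rtr2 config saved",
    "2024-05-01 09:14:09 ERROR core-sw1 link down on port 13",
    "2024-05-01 09:15:30 CRITICAL edge-rtr2 BGP session lost",
]

counts, first_seen = summarize_errors(sample)
for device, count in counts.most_common():
    print(device, count, "first seen", first_seen[device])
```

The device with the earliest first error is often a good place to start asking questions, although it is not proof of the root cause.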
Step 2: Understand the Impact
Once you know what caused the problem, you also need to know who it affects. Is everything down? Or just one site or application? Does the problem only affect people using a certain interface? Check in with the service desk and see who's submitted a ticket about a downed connection. And while you’re at it, warn them that they could be receiving a lot more calls about the problem.
Understanding who's affected helps you appreciate how big an impact this issue has, and that helps you prioritize. Generally, if it only affects one person, you’ve got more time to spend putting it right than if it affects multiple sites or users. Unless, of course, the one line affected is the VPN to the CEO’s house…
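One way to turn the service desk's tickets into a rough priority list is sketched below. The ticket fields (site and user), the site names, and the idea of a "VIP" list are all assumptions for the sake of the example; a real service desk export will look different.

```python
from collections import defaultdict


def rank_impact(tickets, vip_users=frozenset()):
    """Group tickets by site, then sort sites by VIP presence and affected-user count."""
    users_by_site = defaultdict(set)
    for ticket in tickets:
        # A set ignores repeat tickets from the same user at the same site.
        users_by_site[ticket["site"]].add(ticket["user"])

    ranked = []
    for site, users in users_by_site.items():
        ranked.append(
            {
                "site": site,
                "affected_users": len(users),
                "has_vip": bool(users & set(vip_users)),
            }
        )
    # Sites with a VIP come first, then sites with the most affected users.
    ranked.sort(key=lambda row: (row["has_vip"], row["affected_users"]), reverse=True)
    return ranked


# Made-up tickets, purely for illustration.
tickets = [
    {"site": "Leeds", "user": "amy"},
    {"site": "Leeds", "user": "bob"},
    {"site": "Leeds", "user": "amy"},  # same user calling twice
    {"site": "Home VPN", "user": "ceo"},
    {"site": "York", "user": "dan"},
]

ranked = rank_impact(tickets, vip_users={"ceo"})
for row in ranked:
    print(row["site"], row["affected_users"], "VIP" if row["has_vip"] else "")
```

Putting the VIP flag ahead of the head count is a deliberate choice that mirrors the CEO's VPN example; swap the sort key if your priorities differ.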
Step 3: Action Planning
Now for the big question. What are you going to do about your network crisis? If it’s a real crisis, it's unlikely that you’ll be able to fix it by yourself, so call on your team again to help. Together, you'll probably find several solutions, some faster to implement than others. Agree on your potential resolution plans before anyone takes action.
You’ll normally find that you need to implement a quick fix to get the business back up and running, and then a more comprehensive, robust solution afterwards (or in parallel) so that you’ve got a strong fallback position and don't rely on sticky-tape solutions for too long.
Remember to allocate actions to your team members so that everyone knows what they have to do next, and set target deadlines for completion of the tasks.
Step 4: Get to it!
Now everyone knows what they have to do. Get on and do it, but also organize regular status meetings so that you can all report back on progress. Some parts of the solution may take less time to implement than others, but try to keep everyone in the loop when it comes to providing updates. Include those members who have finished their tasks. They may have something valuable to add.
Start network monitoring as soon as you can and keep looking at your diagnostic tools so that you can see whether what you're doing actually makes a difference.
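If you have no monitoring tool to hand, even a very simple reachability check can tell you whether a fix has made a difference. The sketch below tries a TCP connection to each service and reports whether it answered. It uses only Python's standard socket module; the function names and the target list format are illustrative assumptions, and it is no substitute for a proper monitoring system.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return None


def check_targets(targets, probe=tcp_probe):
    """Probe each (name, host, port) target once and report whether it answered."""
    results = {}
    for name, host, port in targets:
        latency = probe(host, port)
        results[name] = {"up": latency is not None, "latency_s": latency}
    return results
```

To use it, pass a list of tuples such as ("mail", "mail.example.com", 25), where the host names and ports are placeholders for your own services, and run the check again after each change. The probe argument lets you swap in a different check, or a fake one for testing, without touching the rest of the code.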
It’s a good idea to update the service desk with your progress and plans so that they have something concrete to report to end users who call up to complain or report faults.
Step 5: Review
The crisis is over. The business is running normally again. So how well did your solution work? Don’t stop your crisis management when you resolve the crisis. Now that things are calmer, review what happened to help make sure it doesn’t happen again. There are likely to be lessons to be learned from the experience. Share these with everyone on the team and, if you can, spend some time putting preventative measures in place to avoid this sort of disaster in the future.
With any luck, you aren’t reading this guide because you need it right now. Keep it where you can easily access it if and when the time comes. Share it with your team so that they also know what to do when disaster strikes. Then you can all be confident in your ability to deal with anything that comes your way. The best network teams are prepared for the worst, even if the worst never happens!