Vendor Neutrality in a Cloud Environment
The Denver Regional Transportation District (RTD) uses 1,050 buses and 125 light rail vehicles to move riders around the 40 municipalities in its six-county service area. Keeping its 330,000 daily passengers on the go also requires moving a lot of electrons. To do this, the RTD relies on a network linking more than 2,500 nodes at 20 main facilities and kiosks at another 80. The main locations are joined using private fiber with trunked 1 Gbps Ethernet. Others use Metropolitan Optical Ethernet (MOE), bonded T-1, ATM, DSL and ISDN.
Monitoring a network of that scale is a big enough job when it runs on traditional client/server architectures. But the movement toward virtualization, thin desktop clients and cloud computing has added new challenges.
“Virtualization of the server space requires a change in monitoring these systems due to their method of resource allocation to the logical servers contained within the physical host,” says Scott Chapiewski, senior consultant at Synamon Corporation, the network consulting and monitoring firm that manages the RTD network. “Many different layers are affected.”
For example, with server virtualization, the RTD had to change its approach to monitoring how the host servers' virtual switching fabric attaches to the core switching equipment and how memory is allocated to the now-virtual servers. Virtual desktops mean that each user has two machines – the VM in the server room and the workstation – either of which can cause a degraded user experience due to the reliance on remote desktop protocol (RDP).
“Both of these must be running well to create a positive user experience and maintain high end-user productivity,” says Chapiewski. “Due to the virtual nature of these desktops, it now becomes much more important for the organization to focus on monitoring aspects that could lead to issues like network latency, packet loss, or out of order packets.”
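As a rough illustration of that kind of test (with a hypothetical host name, and not RTD's actual tooling), a short script can probe a virtual desktop host's RDP port, record connect latency and treat timeouts as lost probes:

    # Minimal sketch (not RTD's tooling): probe the RDP port on a virtual desktop
    # host, record TCP connect latency, and treat timeouts as lost probes.
    import socket
    import statistics
    import time

    def probe_rdp(host, port=3389, samples=20, timeout=1.0):
        """Return (loss percentage, list of connect latencies in milliseconds)."""
        latencies = []
        lost = 0
        for _ in range(samples):
            start = time.perf_counter()
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    latencies.append((time.perf_counter() - start) * 1000.0)
            except OSError:
                lost += 1
            time.sleep(0.1)  # pace the probes
        return 100.0 * lost / samples, latencies

    if __name__ == "__main__":
        loss, lat = probe_rdp("vdi-host.example.net")  # hypothetical hostname
        if lat:
            print(f"loss={loss:.0f}%  median={statistics.median(lat):.1f} ms  "
                  f"jitter={statistics.pstdev(lat):.1f} ms")

Persistent loss or a climbing median from a probe like this points to exactly the latency and packet-loss problems Chapiewski describes, before users start complaining about a sluggish desktop.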
Finally, using cloud computing relieves an organization of some of its IT service load, but it doesn’t relieve IT of the need to monitor the user experience.
“It is critical to work with a partner that will allow you to work with them in providing the testing and monitoring your organization requires,” he says. “In all cases, you must ensure you have a very robust set of application layer tests in place, since this is your interface to the cloud.”
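A minimal sketch of such an application-layer test, assuming a hypothetical health-check URL on the cloud service, verifies the status code, a content marker and the response time rather than simply pinging the endpoint:

    # Minimal sketch of an application-layer check against a cloud-hosted service
    # (hypothetical URL and expected text), using only the standard library.
    import time
    import urllib.request

    def check_cloud_app(url, expected, threshold_ms=2000.0):
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                body = resp.read()
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                ok = resp.status == 200 and expected in body and elapsed_ms <= threshold_ms
                print(f"{url}: status={resp.status} time={elapsed_ms:.0f} ms ok={ok}")
                return ok
        except OSError as exc:
            print(f"{url}: FAILED ({exc})")
            return False

    if __name__ == "__main__":
        check_cloud_app("https://webmail.example.com/healthcheck", b"OK")

Because the test exercises the same interface users do, it catches degradation the cloud provider's own infrastructure dashboards may never show.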
Assembling the Toolkit
Chapiewski follows his own advice. Over the last nine years he has configured more than 7,300 tests and monitors every piece of network-aware hardware – video surveillance systems, building access control systems, UPS power systems, HVAC equipment, environmental monitors (water, heat, humidity), on-board vehicle telemetry equipment, kiosks, IT security systems, load balancers and time sources – in addition to the usual LAN/MAN/WAN, server and application monitoring. Some of the tests are simple, but many others require advanced configuration, such as those that require using a packet analyzer to decipher what a client application is doing in conjunction with the server being tested.
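For illustration only (this is not Chapiewski's configuration, and it assumes the third-party scapy package plus capture privileges), a short capture script of this kind can reveal which ports and payload sizes a client application actually uses when talking to the server under test:

    # Minimal sketch: watch the conversation with a hypothetical server under test
    # to see which ports and payload sizes the client application really uses.
    from scapy.all import sniff, IP, TCP

    SERVER = "10.0.0.25"  # hypothetical address of the server being tested

    def summarize(pkt):
        if IP in pkt and TCP in pkt and SERVER in (pkt[IP].src, pkt[IP].dst):
            print(f"{pkt[IP].src}:{pkt[TCP].sport} -> {pkt[IP].dst}:{pkt[TCP].dport} "
                  f"len={len(pkt)}")

    sniff(filter=f"host {SERVER}", prn=summarize, count=50)

Knowing what the application actually puts on the wire is what makes it possible to write a synthetic test that exercises the same behavior.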
There is no single protocol or piece of network monitoring software that gives him all the data he needs. Simple Network Management Protocol (SNMP), sFlow, NetFlow and Application Performance Index (Apdex) all play a role. He licenses monitoring products from SolarWinds, Plixer, HP and others, in addition to his custom-built tools. One custom application aggregates the data from the different monitoring sources into SQL Server.
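Apdex itself is a simple calculation: responses at or under a chosen target time T count as satisfied, responses up to 4T count as tolerating, and the index is (satisfied + tolerating/2) divided by the total number of samples. A minimal sketch, with illustrative numbers:

    # Minimal sketch of the standard Apdex calculation; the sample data and the
    # 0.5-second target are illustrative only.
    def apdex(response_times_s, target_t=0.5):
        satisfied = sum(1 for t in response_times_s if t <= target_t)
        tolerating = sum(1 for t in response_times_s if target_t < t <= 4 * target_t)
        total = len(response_times_s)
        return (satisfied + tolerating / 2) / total if total else 0.0

    samples = [0.2, 0.4, 0.7, 1.1, 2.5, 0.3]      # seconds, illustrative
    print(f"Apdex(0.5s) = {apdex(samples):.2f}")   # 0.67 for this sample

Scores near 1.0 mean users are satisfied; a sliding score flags an application worth a closer look even when every individual test still technically passes.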
“We initially focused on researching and investigating every technology that allowed us to collect information from our existing infrastructure, as we had a very large investment in this equipment and systems,” says Chapiewski. “Next, we looked to the various monitoring protocols and methodologies available in the marketplace to augment these technologies. Lastly, we conducted a cost analysis of the various monitoring and testing tools, and determined that it was the best value to select multiple testing applications that met our requirements and store this data into SQL Server.”
The entire IT staff can then use two front-end applications to access the databases and obtain uptime/downtime data, current alarms, incident reports and several years of performance graphs. The applications organize the data by functional area – telecom, datacom, servers, environmental, power, facility access control, video security systems, ERP, email, firewall, custom applications, and many more.
“In all, we are running 7,300+ unique tests resulting in more than 3.5 million tests being conducted each day,” he says. “The monitoring systems alert staff to alarm conditions via email, paging and texting, so they can stay informed 24×7.”
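The alerting piece can be as simple as a script that fires a message when a test crosses its threshold. The sketch below uses a hypothetical SMTP host and mailbox (RTD's system also pages and texts, which this does not show):

    # Minimal sketch of alarm notification by email, with hypothetical SMTP host
    # and addresses; paging and texting would be handled separately.
    import smtplib
    from email.message import EmailMessage

    def send_alarm(test_name, detail):
        msg = EmailMessage()
        msg["Subject"] = f"ALARM: {test_name}"
        msg["From"] = "netmon@example.org"
        msg["To"] = "oncall@example.org"
        msg.set_content(detail)
        with smtplib.SMTP("mail.example.org") as smtp:
            smtp.send_message(msg)

    if __name__ == "__main__":
        send_alarm("core-switch uplink", "Utilization above 90% for 5 minutes")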
He says that it is possible to set up an effective monitoring system at low cost, as long as you invest time in gaining the expertise required to set up the tests.
“Configuring tests requires you to constantly increase your knowledge of the many RFCs (Requests for Comments) and IEEE standards, as these are the basis for determining what needs to be tested and what constitutes a ‘bad’ status vs. a ‘good’ one,” Chapiewski says. “The bottom line is to ensure you understand the organization’s requirements, and determine where you can get the best value for your money.”