How NetWisdom Helped Troubleshoot a SAN

In a claim that sounds like something out of an H.G. Wells novel, Virtual Instruments
says it has addressed SAN invisibility.

“A SAN is basically a cloud that you can’t see inside,” said Mark Urdahl, president
and CEO of Virtual Instruments, of Scotts Valley, Calif. “When you add in
virtualization, you have a further layer of abstraction and complexity.”

That lack of visibility wastes time, as storage administrators must hunt around for
the source of a problem. It is also why so many organizations overbuild their
hardware infrastructure: to compensate for a lack of predictability.

“Our monitoring software helps you to see what is going on inside the SAN,” said
Urdahl.

As a company, Virtual Instruments was carved out of Finisar Corp. Its NetWisdom
software already existed, but Finisar never really marketed it, said Urdahl.

Traditional storage resource management (SRM) goes
only so far, he said. It optimizes the efficiency and speed with which drive space is
utilized in a SAN. It adds automation to functions like data collection and storage,
provisioning, forecasting of future needs, and maintenance of activity logs.

But while SRM products provide some management capabilities, they don’t really offer
the level of diagnostic sophistication needed to prevent outages and slowdowns. NetWisdom
gives such a view, said Urdahl, by monitoring I/O performance, bandwidth utilization
and average I/O completion times. It also verifies that changes in
hardware and configuration do not adversely affect key applications.
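
To make those three measurements concrete, here is a minimal sketch of how they could be derived from a list of timed I/O records. It is an illustration only, not NetWisdom's method or API: the `IoSample` record, the `summarize` function and the link-speed parameter are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class IoSample:
    """Hypothetical record of one completed I/O (not a NetWisdom type)."""
    issued_s: float     # time the I/O was issued, in seconds
    completed_s: float  # time the I/O completed, in seconds
    bytes_moved: int    # payload size in bytes


def summarize(samples, link_bytes_per_s):
    """Return IOPS, average completion time (ms) and link utilization (0..1)."""
    if not samples:
        raise ValueError("need at least one sample")
    # Observation window: first issue time to last completion time.
    window_s = max(s.completed_s for s in samples) - min(s.issued_s for s in samples)
    if window_s <= 0:
        raise ValueError("observation window must be positive")
    total_bytes = sum(s.bytes_moved for s in samples)
    return {
        "iops": len(samples) / window_s,
        "avg_completion_ms": mean(s.completed_s - s.issued_s for s in samples) * 1000,
        "utilization": (total_bytes / window_s) / link_bytes_per_s,
    }
```

The point of tracking completion time alongside throughput is that a link can look lightly used in MB/sec terms while individual I/Os still take too long to finish.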

Urdahl reports that 95 percent of the company’s business is SAN monitoring. Most
customers have storage volumes in the range of 100 TB to 5 PB, although some go as low as
40 TB. Typically, these environments have massive data growth coupled with a reduced
headcount. Tools that take the time out of management and troubleshooting, therefore, are
in high demand.

Banking on NetWisdom

A global financial institution is a user of NetWisdom in its North American
operations. It has four data centers (three on the Eastern Seaboard and one in the West)
and requires high volumes of I/O processing.

“Our two data warehouses can tax our storage subsystem at a rate of 40,000 to 50,000
IOPS for hours on end,” said Ryan Perkowski, SAN manager at the financial giant.
“Currently, we have 420 TB of storage – an amount which doubles every 11 months.”
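
To put that growth rate in perspective, the short sketch below projects capacity forward on the assumption that the 11-month doubling simply continues unchanged. That is an extrapolation for illustration, not a forecast from the company, and the function name is hypothetical.

```python
def projected_capacity_tb(current_tb, doubling_months, months_ahead):
    """Capacity after `months_ahead` months, assuming steady exponential growth."""
    return current_tb * 2 ** (months_ahead / doubling_months)


# 420 TB doubling every 11 months, projected at each doubling interval
for months in (11, 22, 33):
    print(months, "months:", round(projected_capacity_tb(420, 11, months)), "TB")
```

At that pace, 420 TB becomes 840 TB after 11 months, 1,680 TB after 22 and 3,360 TB after 33, which shows why buying hardware without knowing the real bottleneck gets expensive quickly.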

The organization uses EMC storage arrays (Symmetrix DMX, Clariion and Centera) along
with a combination of Cisco and Brocade switches and a large mainframe for transactions.
Its server population is 60 percent AIX, 20 percent non-virtualized Windows, 15 percent
virtualized Windows and Linux, and 5 percent of various other complexions. A virtual machine rollout
is ongoing.

“We were suffering from over-subscription on our Cisco switches due to the heavy
demand for throughput,” said Perkowski. “Cisco offered no tool to look at the throughput
of the SAN.”

Server latency was another issue, as were demands for better performance on Oracle.
Perkowski explained that a slow query on Oracle has a ripple effect across the IT
infrastructure: Users tend to get fed up waiting and re-query, which only doubles the
length of the queue and adds further to Oracle delays.

The company attempted to gain greater insight into its SAN-related slowdowns using
Symmetrix Remote Data Facility (SRDF). But that didn’t deliver the required information.
The fallback solution: throw lots of hardware at the problem. Perkowski said a steady
stream of additional host bus adapters (HBAs), switches and servers failed
to offer complete relief from throughput constraints.

“As we had no tool to look at throughput through the SAN, we didn’t know the
underlying cause,” said Perkowski. “As my applications were expanding, they were starting
to outgrow the hardware. We just didn’t know where to split things off.”

In a previous life, Perkowski worked for Finisar, so he knew about NetWisdom. His
argument for the product is that network administrators have sniffers that report
round-trip latency, whereas storage administrators have only a MB/sec rate, which he
felt was inadequate for accurate diagnosis and troubleshooting.

“NetWisdom opens up fabric blindness,” said Perkowski. “It helps us to maximize
virtual hardware loads, ensure peak application performance, verify vendor marketing
claims and get unbiased answers to cut through vendor finger pointing.”

He uses the Virtual Instruments product in several ways beyond performance tuning. IT,
for example, was reluctant to load more than one mission-critical application on its
AIX servers. When the company virtualized its AIX infrastructure, NetWisdom data gave
Perkowski enough confidence to put two heavy-hitter databases on the same AIX
box.

In addition, if a database has slowed down in the previous week, IT can now look back
at historical data and find the cause. A database optimization project that previously
took 140 hours can now be done in eight. Further, the company had the green light to
spend another $1.5 million on a storage hardware upgrade, but NetWisdom found the
underlying cause of the slowdown, which meant that the disk array purchase could be
postponed.

“NetWisdom is the single most useful tool in my SAN tool bag and is the most used tool
by my team,” said Perkowski. “It has saved us countless hours of troubleshooting and
given our whole department a new direction in improving performance.”

Article courtesy of Enterprise IT Planet
