So, you’ve got to install Microsoft Cluster Service (MSCS) in your Windows Server environment.
You may have been tasked with providing failover capability for storage, print services, directory
services, Exchange, SQL Server, file systems or with providing
high-availability service to Microsoft-native or third-party applications. We
won’t cover all the nuances here, but we’ll try to provide a decent
overview for you.
Windows 2000 Advanced Server supports 2-node clusters, while Windows
Server 2003 Enterprise Edition and Datacenter Edition both support
clusters of up to 8 nodes. Datacenter Edition is typically only
installed by hardware vendors such as Dell and Compaq as part of a
total solution rather than by end users or corporate customers
themselves.
Before purchasing or using servers, storage host bus adapters (HBAs),
controllers, switches, drives and enclosures, or attaching to a SAN,
consult Microsoft’s Hardware Compatibility List (HCL)!
If it’s not there, don’t use it for your cluster servers
or storage. Another reference you ought to read thoroughly and print
out, especially if this is your first MSCS installation,
is the Technical Overview of Microsoft Windows Clusters, which covers basic installation. A good deal of additional MSCS documentation is available online.
Host Bus Adapters and Network Adapters
Host bus adapters for Microsoft Clusters must be PCI-based. You
cannot use older ISA or EIDE storage adapters. Also, storage adapters
in both nodes must be identical. In Windows 2000 Advanced Server, they
must be separate from the adapter of the operating system’s boot disk.
In Windows Server 2003, shared disks can be located on the same storage
interconnect as the boot, page file and dump file disks.
Typically, clustered servers will be deployed with multiple HBAs in a
highly available storage fabric. In these cases, be sure to always load
the multi-path driver software (such as Compaq SecurePath or EMC
PowerPath) to prevent data corruption.
[Figure: MSCS Cluster Name and IP Configuration]
For network adapters on the cluster nodes, there is little reason not
to use Gigabit Ethernet for the public interfaces if possible. These interfaces are
what clients will be using to physically connect to your cluster.
Logically, they’ll connect through a virtual IP address that resolves
to the Cluster Name, all of which you will configure when you set up
your cluster.
You can use whatever you have on hand or the built-in Ethernet
interface for the heartbeat. The heartbeat is a private network
between nodes for exchanging cluster status information. Just be sure
they all are using the same type of hardware.
When configuring the network interfaces in a cluster, configure them as follows if possible:
[Figure: MSCS Network Interface Configuration]
All Cluster Communication (public): This is usually in a different IP address space from the heartbeat and may be routable or private, depending on your network configuration.
Internal Cluster Communications (heartbeat): This is usually a private IP address, e.g. 172.16.x.x.
In this way, redundancy is provided by the public interfaces because
they can be used for the cluster heartbeat, should the heartbeat
interfaces fail. It’s not recommended that you leave them that way, but until
someone can troubleshoot and fix the private heartbeat interface, the public
one can be used.
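The addressing advice above can be sanity-checked before you touch the cluster wizard. The following is a minimal sketch using Python's standard `ipaddress` module; the function name `check_cluster_networks` and the example subnets are assumptions made for illustration, not anything MSCS provides. It only checks that the two networks do not overlap and that the heartbeat network falls in a private range.

```python
import ipaddress


def check_cluster_networks(public_cidr, heartbeat_cidr):
    """Return a list of warnings for a proposed public/heartbeat address plan.

    Both arguments are CIDR strings such as "192.168.10.0/24".
    This is a hypothetical helper, not part of any Microsoft tooling.
    """
    public = ipaddress.ip_network(public_cidr, strict=False)
    heartbeat = ipaddress.ip_network(heartbeat_cidr, strict=False)

    warnings = []
    # The public and heartbeat interfaces should sit in separate address spaces.
    if public.overlaps(heartbeat):
        warnings.append("public and heartbeat networks overlap")
    # The heartbeat is expected to use a private (non-routable) range.
    if not heartbeat.is_private:
        warnings.append("heartbeat network is not in a private range")
    return warnings


if __name__ == "__main__":
    print(check_cluster_networks("192.168.10.0/24", "172.16.0.0/24"))
    print(check_cluster_networks("192.168.10.0/24", "172.32.0.0/24"))
```

Note that only 172.16.0.0 through 172.31.255.255 is reserved for private use, so the second call reports a warning even though the address starts with 172.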
Storage Disk Types
MSCS supports only SCSI and Fibre Channel connectivity for shared disk
storage, and shared storage must be NTFS formatted. For clusters with
more than two nodes, or for 64-bit versions of Windows Server,
only Fibre Channel storage is supported. Note that shared storage
in a Microsoft cluster cannot be a Dynamic Disk. This
means you cannot use Windows software-based RAID on the disks that will
be used for shared storage on the cluster. You must use hardware-based
RAID, which is faster anyway. That leads to an interesting problem with physical
and logical NTFS volumes: the 2 terabyte limit. Dynamic Disks can overcome it, but remember that
they cannot be used for shared storage on an MSCS cluster.
If you are building a storage cluster and more than 2 terabytes of
shared storage are required, you may have to resort to products such
as Veritas Storage Foundation (formerly known as Veritas Volume Manager),
or connect to a SAN that presents the storage to the OS in a way that
is abstracted from the NTFS filesystem and so overcomes this limit.
Most cluster configurations these days use a SAN, but for those that
rely on direct-attached shared storage, the 2 terabyte limit may become
an issue as storage needs grow.
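For readers wondering where the 2 terabyte figure comes from, here is a small worked calculation. It assumes the limit in question is the usual one for basic disks: sectors are addressed with a 32-bit number and each sector holds 512 bytes. Both values are assumptions stated for illustration, and the helper names are hypothetical.

```python
SECTOR_BYTES = 512   # assumed sector size
ADDRESS_BITS = 32    # assumed width of the sector address


def max_addressable_bytes(sector_bytes=SECTOR_BYTES, address_bits=ADDRESS_BITS):
    """Largest volume size reachable with the given sector addressing."""
    return (2 ** address_bits) * sector_bytes


def fits_on_one_volume(required_bytes):
    """True if the requirement stays within the single-volume limit."""
    return required_bytes <= max_addressable_bytes()


if __name__ == "__main__":
    limit = max_addressable_bytes()
    print(limit)             # 2199023255552 bytes
    print(limit / 2 ** 40)   # 2.0 (binary terabytes)
    print(fits_on_one_volume(3 * 2 ** 40))  # False
```

A requirement of 3 TB therefore cannot be met by a single direct-attached volume under these assumptions, which is the situation the paragraph above describes.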
When configuring your storage, configure the same drive letters on each
node of the cluster for each logical drive. Do not allow more than one
node at a time to access those drives until the cluster software is
installed and configured on the additional nodes.
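Keeping drive letters identical across nodes is easy to get wrong by hand. The sketch below compares per-node drive-letter assignments that you have collected yourself (for example, by noting them from Disk Management on each node). The function name, node names, and drive labels are all hypothetical; nothing here queries a real cluster.

```python
def mismatched_drive_letters(node_maps):
    """Find logical drives whose letter differs between nodes.

    node_maps maps a node name to {logical_drive_label: drive_letter}.
    Returns {logical_drive_label: {node_name: letter_or_None}} for every
    drive that is missing on some node or has differing letters.
    """
    all_drives = set()
    for drive_map in node_maps.values():
        all_drives.update(drive_map)

    mismatches = {}
    for drive in sorted(all_drives):
        letters = {node: m.get(drive) for node, m in node_maps.items()}
        if len(set(letters.values())) > 1:
            mismatches[drive] = letters
    return mismatches


if __name__ == "__main__":
    plan = {
        "NODE1": {"quorum": "Q", "data1": "R"},
        "NODE2": {"quorum": "Q", "data1": "S"},
    }
    print(mismatched_drive_letters(plan))
```

An empty result means every logical drive has the same letter on every node in the plan.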
The Quorum Disk
Each MSCS cluster requires a quorum disk. The purpose of the quorum disk is
to store the cluster log file. The quorum disk must be a shared storage
device but separate from the rest of the shared storage. A standard
recommendation is to make it about 100-200 MB in size. With Windows Server
2003, you no longer need to manually select which disk is going to be
used as the quorum resource. It is automatically configured on the smallest
shared disk that is larger than 50 MB and formatted NTFS. You may move the quorum resource
to another disk during setup or after the cluster has been configured.
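The automatic selection rule described above (smallest shared disk that is larger than 50 MB and formatted NTFS) can be expressed in a few lines, which helps when predicting which disk the wizard will choose. This is only a sketch of the rule as stated in this article; the `SharedDisk` type, the function name, and the sample disks are hypothetical, and the real setup logic may consider factors not modeled here.

```python
from dataclasses import dataclass


@dataclass
class SharedDisk:
    name: str
    size_mb: int
    filesystem: str


def pick_default_quorum(disks, min_size_mb=50):
    """Return the smallest NTFS disk larger than min_size_mb, or None."""
    candidates = [
        d for d in disks
        if d.size_mb > min_size_mb and d.filesystem.upper() == "NTFS"
    ]
    # min() with default=None avoids an exception when nothing qualifies.
    return min(candidates, key=lambda d: d.size_mb, default=None)


if __name__ == "__main__":
    disks = [
        SharedDisk("Disk Q", 150, "NTFS"),
        SharedDisk("Disk R", 40, "NTFS"),      # too small
        SharedDisk("Disk S", 100, "RAW"),      # not NTFS
        SharedDisk("Disk T", 500000, "NTFS"),
    ]
    chosen = pick_default_quorum(disks)
    print(chosen.name if chosen else "no eligible disk")
```

If the predicted disk is not the one you want, that is the cue to move the quorum resource during or after setup.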
The Cluster Log File
In Windows 2000, the cluster log file size was initially set very small
(64 KB) by default and often had to be increased. The
problem is that if the log file fills before the entries can be snapshotted
and the log file truncated, the cluster fails or generates an error. So
make sure the cluster transaction log file is large enough to handle the
regular activity of your cluster. In Windows Server 2003 Enterprise
Edition, the default size of the quorum log has been increased to 4096
KB, which is sufficient for most purposes.
[Figure: Cluster Log File Configuration]
Installing Microsoft Cluster Service (MSCS)
You install MSCS by using the Add/Remove Programs applet in Windows
2000 Advanced Server. MSCS is installed by default in Windows Server
2003 Enterprise Edition, so you only need to launch the Cluster Administrator.
You can also script the configuration with Cluster.exe. Longhorn will have a
simple “Create Cluster” Wizard, but it will probably be around 2007
before we’ll see that.
Troubleshooting
With Windows Server 2003 Enterprise Edition, a setup log is created
during configuration of Cluster Service in
%SystemRoot%\system32\Logfiles\Cluster\ClCfgSrv.log to assist in
troubleshooting, and a new tool called clusdiag is available in the
Windows Server 2003 Resource Kit. Good luck!
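When a setup run goes wrong, a quick first pass over the setup log can save some scrolling. The sketch below builds the log path, assuming the file sits under system32, Logfiles, Cluster beneath %SystemRoot%, and filters lines that mention a keyword. The helper names and the keywords are assumptions; the actual wording and encoding of entries in ClCfgSrv.log are not described in this article, so adjust both to what you see in your own log.

```python
import ntpath
import os


def setup_log_path(system_root=None):
    """Build the assumed path to the cluster setup log."""
    root = system_root or os.environ.get("SystemRoot", r"C:\Windows")
    return ntpath.join(root, "system32", "Logfiles", "Cluster", "ClCfgSrv.log")


def lines_mentioning(lines, keywords=("error", "fail")):
    """Return (line_number, text) for lines containing any keyword.

    `lines` is any iterable of strings, such as an open file object.
    Matching is case-insensitive; the keywords are guesses, not a
    documented log format.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip()))
    return hits


if __name__ == "__main__":
    path = setup_log_path()
    print(path)
    if os.path.exists(path):
        # errors="replace" keeps the scan going if the encoding differs.
        with open(path, errors="replace") as handle:
            for number, text in lines_mentioning(handle):
                print(number, text)
```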