Microsoft Clustering

Microsoft provides two types of cluster:
1. Network Load Balancing (NLB)
2. Windows Failover Cluster (Windows Server 2008 and later), known as Microsoft Cluster Service (MSCS) in Windows Server 2003 and earlier.
The first type distributes network traffic across servers using a virtual IP address.

NLB (Network Load Balancing)
Suppose there are two nodes in an NLB cluster, and both nodes host the website. A client accessing the website is directed to one of the nodes depending on the NLB configuration (the port rules and their affinity setting decide which node handles a given client).

Microsoft Cluster Service (MSCS)

Unlike NLB, in MSCS a resource is online on only one node at a time.
Microsoft Cluster Service is based on the shared-nothing clustering model. The shared-nothing model dictates that while several nodes in the cluster may have access to a device or resource, the resource is owned and managed by only one system at a time.

There are three main components of MSCS:
1. Cluster Service
2. Resource Monitor
3. Resource DLL

When you install the cluster software on a server, you get these three components along with Cluster Administrator (the GUI used to manage the cluster).

Cluster Service
The Cluster Service is the core component and runs as a high-priority system service. The Cluster Service controls cluster activities and performs such tasks as coordinating event notification, facilitating communication between cluster components, handling failover operations and managing the configuration. Each cluster node runs its own Cluster Service.

Resource Monitor

The Resource Monitor is an interface between the Cluster Service and the cluster resources, and runs as an independent process. The Cluster Service uses the Resource Monitor to communicate with the resource DLLs.

Resource DLL
Every resource has a resource DLL. The resource communicates with the Cluster Service through that DLL, by way of the Resource Monitor.
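The layering described above can be pictured with a small Python sketch. This is only an illustration of the call path (Cluster Service, then Resource Monitor, then resource DLL); the class names, the `is_alive` callback, and the sample resources are hypothetical and are not the actual Windows cluster API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResourceDLL:
    """Hypothetical stand-in for a resource DLL: it knows how to check one resource."""
    name: str
    is_alive: Callable[[], bool]


@dataclass
class ResourceMonitor:
    """Sits between the cluster service and the resource DLLs."""
    dlls: Dict[str, ResourceDLL] = field(default_factory=dict)

    def register(self, dll: ResourceDLL) -> None:
        self.dlls[dll.name] = dll

    def poll(self) -> Dict[str, bool]:
        # Ask every registered resource DLL whether its resource is healthy.
        return {name: dll.is_alive() for name, dll in self.dlls.items()}


class ClusterService:
    """Core component: it reaches resources only through the monitor."""

    def __init__(self, monitor: ResourceMonitor) -> None:
        self.monitor = monitor

    def failed_resources(self) -> List[str]:
        return sorted(name for name, ok in self.monitor.poll().items() if not ok)


monitor = ResourceMonitor()
monitor.register(ResourceDLL("FileShare", lambda: True))
monitor.register(ResourceDLL("IPAddress", lambda: False))  # simulated failure
service = ClusterService(monitor)
print(service.failed_resources())  # ['IPAddress']
```

Keeping the monitor as a separate layer mirrors the point made above that the Resource Monitor runs as an independent process, so a misbehaving resource DLL is isolated from the Cluster Service itself.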

How the cluster works:

Suppose we have a two-node MSCS cluster:
The node that owns the resource is the active node.
The other node, which is on standby, is the passive node.

In a server cluster, only one node is active at a time. The other node or nodes are placed in a kind of standby mode, waiting to take over if the active node fails. A standby node can take over running an application when the active node fails because all of the nodes in the cluster are connected to a shared storage mechanism.
One of the shared disks in the cluster serves as the quorum.
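The active/passive behavior described above can be sketched in a few lines of Python. This is a simplified model, assuming a list of node names and a health map; the function `fail_over` and the node names are hypothetical and only illustrate the idea that ownership moves to a healthy standby node.

```python
from typing import Dict, List, Optional


def fail_over(owner: str, nodes: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the node that should own the resource after a health check."""
    if healthy.get(owner, False):
        return owner  # the active node is fine, nothing changes
    for node in nodes:
        if node != owner and healthy.get(node, False):
            return node  # first healthy standby node takes over
    return None  # no healthy node left, the resource stays offline


nodes = ["NODE1", "NODE2"]
print(fail_over("NODE1", nodes, {"NODE1": True, "NODE2": True}))   # NODE1
print(fail_over("NODE1", nodes, {"NODE1": False, "NODE2": True}))  # NODE2
```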
What is a Quorum?

To put it simply, a quorum is the cluster’s configuration database. Although the quorum is just a configuration database, it has two very important jobs. First of all, it tells the cluster which node should be active.
It is extremely important for nodes to conform to the status defined by the quorum. It is so important, in fact, that Microsoft has designed the clustering service so that if a node cannot read the quorum, that node will not be brought online as part of the cluster.
The other thing that the quorum does is intervene when communications fail between nodes. Normally, each node within a cluster can communicate with every other node in the cluster over a dedicated network connection. If this network connection were to fail, though, the cluster would be split into two pieces, each containing one or more functional nodes that cannot communicate with the nodes on the other side of the communications failure.
When this type of communications failure occurs, the cluster is said to have been partitioned. The problem is that both partitions have the same goal: to keep the application running. The application can’t be run on multiple servers simultaneously, though, so there must be a way of determining which partition gets to run the application. This is where the quorum comes in. The partition that “owns” the quorum is allowed to continue running the application. The other partition is removed from the cluster.
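The arbitration rule can be illustrated with a short Python sketch. It assumes the partitions are already known as sets of node names and that one node currently holds the quorum resource; `surviving_partition` and the node names are hypothetical, and the real arbitration over the quorum disk is not modeled here.

```python
from typing import FrozenSet, List, Optional


def surviving_partition(
    partitions: List[FrozenSet[str]], quorum_owner: str
) -> Optional[FrozenSet[str]]:
    """Return the partition that contains the node owning the quorum."""
    for partition in partitions:
        if quorum_owner in partition:
            return partition  # this side keeps running the application
    return None  # nobody owns the quorum, so no partition may continue


partitions = [frozenset({"NODE1"}), frozenset({"NODE2", "NODE3"})]
winner = surviving_partition(partitions, "NODE2")
print(sorted(winner))  # ['NODE2', 'NODE3']
```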

The quorum is the basic concept you need in order to understand clustering.
For prerequisites, installation steps, and further details about clusters, refer to the TechNet articles.

Windows Failover Cluster (Windows Server 2008 and later):
Failover clustering introduces many changes compared with MSCS.
1. The quorum model has changed.
2. In MSCS the cluster service ran under a domain user account; in Windows Server 2008 failover clusters it runs under the Local System account.
In Windows Server 2008 failover clusters, the cluster service no longer runs in the context of a domain user account. Instead, it runs in the context of a local system account that has restricted rights on the cluster node.
During the Create Cluster process, a computer object is created in Active Directory Domain Services. This computer object is known as the Cluster Name Object (CNO).
The CNO creates the computer objects for all other Network Name resources that are created in a failover cluster as part of a Client Access Point (CAP). These objects are known as Virtual Computer Objects (VCOs).
3. DHCP-assigned IP addresses are supported.
4. Servers can be validated before the cluster is created, using the validation tool.
5. The maximum number of nodes has increased.

Quorum model:

Failover clustering supports four quorum modes:
Node Majority,
Node and Disk Majority,
Node and File Share Majority, and
No Majority: Disk Only (legacy).
In a Windows Server 2008 failover cluster there is no dedicated quorum disk. Instead, there are two types of witness resource that can take part in voting along with the nodes of the cluster.

-----------------------------------------------------------------------------------------------------------------------------
• Disk Witness Resource – A clustered disk can contribute towards the cluster’s quorum. This disk resides in the cluster group. Besides providing a vote for the quorum, this resource serves two other critical functions.
o It stores a constantly updated version of the cluster database. This allows the cluster to maintain its state and configuration independently of individual node failures, which ensures that nodes will always have the most up-to-date copy of the database.
o It enforces cluster unity, preventing the “split-brain” scenario described earlier.
• File Share Witness (FSW) Resource – A file share accessible by all nodes of the cluster can contribute to the cluster’s quorum. Besides providing a vote for the quorum, it also helps with the “split-brain” scenario. However, the file share witness doesn’t contain the cluster database.
• Vote – The quorum calculation is based on votes. Cluster nodes, disk witness resources and file share witness resources may each have a vote, depending on the quorum configuration. The table in the next section shows the relationship between quorum mode and votes.
-------------------------------------------------------------------------------------------------------------------------
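As a rough illustration of vote counting, the Python sketch below checks whether the votes that are currently reachable form a majority. The function name `has_quorum` and the example numbers are assumptions made for illustration; the rule shown is a plain majority of the configured votes (more than half).

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when strictly more than half of the configured votes are present."""
    return votes_present > total_votes // 2


# Four nodes plus one disk witness gives five votes in total.
print(has_quorum(3, 5))  # True: three of five votes is a majority
print(has_quorum(2, 5))  # False: two of five votes is not

# Four nodes and no witness: a two/two split leaves neither side with a majority.
print(has_quorum(2, 4))  # False
```

The last case shows why an odd number of votes is preferred: with an even total, an even split leaves no partition able to continue.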
When the cluster is first created, the most appropriate quorum mode is assigned automatically, based on the number of nodes and the available cluster storage. This can always be changed later.
The cluster attempts to configure the quorum so that there is always an odd number of votes. If there is an odd number of nodes, the cluster selects Node Majority to keep the number of votes odd. If there is an even number of nodes and there are disks in Available Storage, the cluster selects Node and Disk Majority, giving a disk a single vote so that the total number of votes is odd. If there is an even number of nodes but no Available Storage, the cluster selects Node Majority and issues a warning message. The cluster never selects Node and File Share Majority automatically, since it requires additional configuration, and it never selects No Majority: Disk Only, which is not recommended because the disk is a single point of failure.
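The selection rules above can be written down as a small Python function. This is a sketch of the decision logic as described in the text, not the cluster's actual code; the function name `select_quorum_mode`, its parameters, and the wording of the warning are hypothetical.

```python
from typing import Optional, Tuple


def select_quorum_mode(
    node_count: int, has_available_storage: bool
) -> Tuple[str, Optional[str]]:
    """Return (quorum mode, optional warning) following the rules described above."""
    if node_count % 2 == 1:
        # An odd number of nodes already gives an odd number of votes.
        return ("Node Majority", None)
    if has_available_storage:
        # An even number of nodes plus one disk vote makes the total odd.
        return ("Node and Disk Majority", None)
    # Even number of nodes and no disk to add a vote: fall back with a warning.
    return ("Node Majority", "even number of votes, consider adding a witness")


print(select_quorum_mode(3, True))   # ('Node Majority', None)
print(select_quorum_mode(4, True))   # ('Node and Disk Majority', None)
print(select_quorum_mode(4, False)[0])  # Node Majority
```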
