This guide documents the process of performing maintenance of RSF-1
cluster nodes with minimal downtime. Maintenance includes the
following possible scenarios:
Upgrading the base OS.
Upgrading RSF-1 software.
General hardware maintenance.
The maintenance process can be broken down into the
following steps:
Set services to manual on all nodes.
Move all services to one cluster node.
Perform maintenance on the node(s) not running services.
Check cluster health once maintenance is complete on a node.
Move services over to the upgraded node.
Repeat the process for the next node.
Final steps.
1. Set services to manual on all nodes.
Setting all services to manual is a safety measure that prevents
unwanted failovers or migrations during this process. Note that this
action will NOT stop any running services.
Using the Webapp
Select Dashboard on the main menu, then select ⋮ on the Services
panel, and then select Set all services to manual.
Using the CLI
# /opt/HAC/RSF-1/bin/hacli service manual --name <servicename> --node <nodename>
{
  "timeout": 60,
  "errorMsg": "",
  "execTime": 0.032,
  "error": false,
  "output": "Putting <servicename> in manual mode on appliance <nodename>"
}
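Note that when using the CLI it is necessary to set all services to
manual on each node. For example, if ServiceA is clustered on
nodes NodeA and NodeB, the following CLI commands must be issued:
# /opt/HAC/RSF-1/bin/hacli service manual --name ServiceA --node NodeA
# /opt/HAC/RSF-1/bin/hacli service manual --name ServiceA --node NodeB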
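As a minimal sketch of this per-service, per-node pattern, the following
dry-run script prints one "service manual" command for each service/node
pair instead of executing it. The helper name manual_commands is
hypothetical; the hacli path and flags are those shown in this guide, and
ServiceA, NodeA and NodeB are the example names used above.

```shell
#!/bin/sh
# Dry-run sketch: print the commands rather than run them.
HACLI=/opt/HAC/RSF-1/bin/hacli   # path as used in this guide

# Hypothetical helper.
#   $1 = space-separated list of service names
#   $2 = space-separated list of node names
manual_commands() {
    for svc in $1; do
        for node in $2; do
            echo "$HACLI service manual --name $svc --node $node"
        done
    done
}

# One service clustered on two nodes yields two commands.
manual_commands "ServiceA" "NodeA NodeB"
```

Once the printed commands have been reviewed, they can be run by hand or
the echo can be removed so that the loop issues them directly.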
2. Move all services to one cluster node.
All running services should now be moved to a single cluster node so
that the other node(s) in the cluster are free for maintenance.
Using the Webapp
From the main dashboard, select a service on the Services panel,
then select ⋮ next to the running instance, and finally select Move <service> to <node>.
This operation needs to be performed for each running service.
Using the CLI
# /opt/HAC/RSF-1/bin/hacli service move --name <servicename> --dest <nodename>
{
  "timeout": 60,
  "errorMsg": "",
  "execTime": 7.055,
  "error": false,
  "output": "Service <servicename> is now moving to node <nodename>"
}
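When these commands are scripted, the JSON response can be checked before
moving on. The sketch below is one possible approach, assuming that a
failed request is reported with "error": true (the guide only shows the
false case). The helper name response_ok is hypothetical, and the sample
response is the one shown above, used in place of a live hacli call.

```shell
#!/bin/sh
# Hypothetical helper: reads an hacli JSON response on stdin and succeeds
# only if it contains "error": false. Assumption: a failed request sets
# the "error" field to true.
response_ok() {
    grep -Eq '"error"[[:space:]]*:[[:space:]]*false'
}

# Sample response copied from this guide (stands in for a live call).
sample='{
  "timeout": 60,
  "errorMsg": "",
  "execTime": 7.055,
  "error": false,
  "output": "Service <servicename> is now moving to node <nodename>"
}'

if printf '%s\n' "$sample" | response_ok; then
    echo "request accepted"
else
    echo "request failed" >&2
fi
```

The pattern requires the closing quote after error, so the "errorMsg"
field is never mistaken for the "error" field.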
3. Perform maintenance on the node(s) not running services.
With all services running on a single node, maintenance may now be
performed on the other node(s) in the cluster.
Before performing maintenance consider undertaking the following where
applicable:
If possible, perform a snapshot/backup of the complete system.
If applicable, make copies of all licenses.
If performing hardware upgrades/additions, verify compatibility and
check for conflicts where possible.
Best practice is to perform a test reboot once maintenance
is complete.
4. Check cluster health once maintenance is complete on a node.
Once maintenance has been performed on any cluster node, check that
the node has successfully rejoined the cluster and is operating
normally using the following checklist:
All heartbeats are up and running.
All cluster nodes can see each other and agree on the state of services.
On the upgraded node all services should be marked as manual and
unblocked.
Clients of cluster services are still operating as normal (this is
specific to the individual setup, e.g. NFS/SMB/iSCSI shares,
application clients etc.).
Using the Webapp
Navigate to Dashboard; everything in the cluster health overview
should be OK.
If RSF-1 has been upgraded, navigate to Help ==> About and check that
the version displayed is correct.
Using the CLI
Interrogate the status of the cluster and check the health fields of
the returned object (there should be no alerts and all fields should be
marked OK).
5. Move services over to the upgraded node.
Once the previous step has completed successfully, services can
now be moved to the newly upgraded node and maintenance performed on the
other node(s) in the cluster.
Using the Webapp
From the main dashboard, select a service on the Services panel,
then select ⋮ next to the running instance, and finally select Move <service> to <node>.
This operation needs to be performed for each running service.
Using the CLI
Again, this will need to be performed for each running service:
# /opt/HAC/RSF-1/bin/hacli service move --name <servicename> --dest <nodename>
{
  "timeout": 60,
  "errorMsg": "",
  "execTime": 7.051,
  "error": false,
  "output": "Service <servicename> is now moving to node <nodename>"
}
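Because the move has to be repeated for every running service, it may be
convenient to generate the commands in a loop. The following is a dry-run
sketch only: the helper name move_commands and the example names NodeB,
ServiceA and ServiceB are hypothetical, while the hacli path and flags
are those shown above.

```shell
#!/bin/sh
# Dry-run sketch: print one "service move" command per service.
HACLI=/opt/HAC/RSF-1/bin/hacli   # path as used in this guide

# Hypothetical helper.
#   $1   = destination node (the newly maintained node)
#   $2.. = names of the services to move
move_commands() {
    dest=$1
    shift
    for svc in "$@"; do
        echo "$HACLI service move --name $svc --dest $dest"
    done
}

# Example names only; substitute the real node and service names.
move_commands NodeB ServiceA ServiceB
```

Printing the commands first allows the destination node to be checked
before any service is actually moved.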
6. Repeat the process for the next node.
Once the running services have been migrated to the upgraded node and
confirmed to be operating as expected, the other cluster node(s) can
now be maintained.
7. Final steps.
Once all nodes in the cluster have been successfully
upgraded, services can be migrated to their normal
nodes. Once the migration is complete, a final check on the services
should be performed from a client perspective.