Creating a ZFS HA Cluster on Linux using shared or shared-nothing storage
This guide goes through a basic setup of an RSF-1 ZFS HA cluster on Linux.
Upon completion the following will be configured:
A working Active-Active Linux cluster with either shared or shared-nothing storage
A clustered service sharing a ZFS pool (further services can be added as required)
A virtual hostname by which clients are able to access the service
Introduction
RSF-1 supports both shared and shared-nothing storage clusters.
Shared Storage
A shared storage cluster utilises a common set of storage devices
that are accessible to both nodes in the cluster (housed in a shared
JBOD
for example). A ZFS pool is created using these devices and access
to that pool is controlled by RSF-1.
Pool integrity is maintained by the cluster software using a combination
of redundant heartbeating and PGR3 (SCSI-3 Persistent Group Reservation) disk reservations to ensure any
pool in a shared storage cluster can only be accessed by a single node
at any one time.
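If you want to see the persistent reservations held on a shared device, the sg_persist utility (part of the sg3_utils package, which may need installing) can read them. The device name below is only an example; substitute one of your shared disks:
# List the registration keys currently held on the device
sg_persist --in --read-keys --device=/dev/sdb
# Show which key currently holds the reservation and the reservation type
sg_persist --in --read-reservation --device=/dev/sdb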
Shared-Nothing
A shared-nothing cluster consists of two nodes, each with their
own locally accessible ZFS storage pool residing on non-shared
storage:
Data is replicated between nodes by an HA synchronisation
process. Replication is always done from the active to the passive
node, where the active node is the one serving out the pool to clients:
Should a failover occur, synchronisation is effectively reversed:
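RSF-1's synchronisation process manages this replication for you; purely to illustrate the direction of data flow, a manual incremental replication step between the nodes would look something like the following sketch (the pool, snapshot and node names are examples, and this is not the exact mechanism RSF-1 uses):
# On the active node: snapshot the pool, then send the changes since the
# previous snapshot to the passive node (node-b); -F rolls the receiving
# pool back to the last common snapshot before applying the increment
zfs snapshot pool1@sync-2
zfs send -i pool1@sync-1 pool1@sync-2 | ssh node-b zfs receive -F pool1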
Before creating pools for shared-nothing clusters
To be eligible for clustering, the storage pools must have
the same name on each node in the cluster.
It is strongly recommended that the pools are of equal size;
otherwise the smaller of the two risks running out of space
during synchronisation.
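A quick way to confirm both requirements is to compare the pool name and size on each node before clustering (pool1 is an example name):
# Run on both nodes; the NAME values must match and the SIZE values should be equal
zpool list -o name,size,free pool1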
Download cluster software
If you have not already done so, download and install the RSF-1 cluster software onto each
cluster node. More information can be found here.
Initial connection and user creation
Before starting
Please make sure that any firewalls in the cluster
environment have the following ports open before attempting
configuration:
If setting up a shared-nothing cluster, both nodes require
ssh access to each other without a password. This is needed
for the replication of the ZFS pool.
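A common way to set this up is with key-based authentication, for example (the node names are examples; run the commands as the account the replication will use, typically root, and repeat in the opposite direction so each node can reach the other):
# On node-a: generate a key pair if one does not already exist
ssh-keygen -t ed25519
# Copy the public key to the partner node
ssh-copy-id root@node-b
# Verify that the login now works without a password prompt
ssh root@node-b hostname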
To connect to the RSF-1 GUI, direct your web browser to:
https://<hostname>:8330
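If you want to confirm the GUI port is reachable before opening a browser, a simple probe from a shell will do (node-a is the example hostname used later in this guide; -k skips certificate verification, useful if the appliance uses a self-signed certificate):
curl -kI https://node-a:8330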
Next, create an admin user account for the GUI.
Enter the information in the provided fields and click the
Submit button when ready:
Once you click the Submit button, the admin user account will be
created and you will be redirected to the login screen. Log in with the
username and password just created:
Once logged in, the main dashboard page is displayed:
Configuration and Licensing
Editing your /etc/hosts file
Before continuing, ensure the /etc/hosts
file is configured correctly on both nodes. Hostnames must not
resolve to 127.0.0.1, and both nodes must be resolvable from
each other. Here is a correctly configured hosts file for two
example nodes, node-a and node-b:
127.0.0.1 localhost
10.6.18.1 node-a
10.6.18.2 node-b
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
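To confirm the entries resolve as intended on each node (and not to 127.0.0.1), query the resolver directly:
# Should return the 10.6.18.x addresses shown above
getent hosts node-a node-b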
To begin configuration, click on the Create/Destroy option on the
side-menu (or the shortcut on the panel shown when first logging in).
The Cluster Create page scans for clusterable nodes (those running
RSF-1 that are not yet part of a cluster)
and presents them for selection:
Now enter the cluster name and description, and then
select the type of cluster being created (either shared-storage or
shared-nothing).
If setting up a shared-nothing cluster an additional option to add a
node manually is shown at the bottom of the page. This is because
RSF-1 will detect nodes on the local network, but for shared-nothing
clusters, the partner node could be on a separate
network/location, and therefore may not automatically be detected.¹
Trial Licenses
If any of the selected nodes have not been licensed,
a panel is shown to obtain 45-day trial licenses:
Next, the RSF-1 End User License Agreement (EULA) will
be displayed. Click accept to proceed:
Once the license keys have been successfully installed, click the
Create Cluster button to initialize the cluster:
When the cluster has been created, you can enable support for
disk multipathing in RSF-1 (if the disks are already configured for it)
and/or netplan (Ubuntu) if required:
These settings can be modified after cluster set-up if needed.
They can be found in Settings -> Linux.
Enabling Multipath Support
If the disks have been configured to use multipathing, you
must enable multipath support, otherwise disk reservations
will not function correctly. Do not enable this option if the
disks are configured for a single path only.
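If you are unsure whether multipathing is in use, listing the multipath topology is a quick check (this assumes the multipath tools are installed; no output generally means no multipath devices are configured):
# List configured multipath devices and their underlying paths
multipath -ll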
The next step is to add pools to the cluster.
Creating a Pool in the WebApp
If a zpool isn't already created, this can be done via the
WebApp. Click Volumes on the side menu, then +Create:
Enter the desired Pool Name and select a Pool Mode (jbod, raidz2 or mirror).
Add your drives to the pool by selecting them in the list and choosing their
role using the buttons at the bottom.
To configure multiple mirrors in a pool, select the first set of
drives from the list and add them as data disks. Next, select the next
set of drives and click Data, then New mirror:
Once configured, click Submit and your pool is created and ready to be
clustered:
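Equivalently, a pool with multiple mirrored vdevs can be created from the command line. This is only a sketch with example device names; substitute your own disks (persistent /dev/disk/by-id names are generally preferable):
# Create a pool named pool1 from two mirrored pairs of disks
zpool create pool1 mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
# Confirm the layout before clustering the pool
zpool status pool1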
Preparing Pools to Cluster
Pools must be imported on one of the nodes before they can be
clustered. Check their status by selecting the Volumes option on the
side menu.
Shared-nothing clusters
For a shared-nothing cluster, the pools must
have the same name and be imported manually on each node.
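If preferred, the import can also be performed from a shell on each node (pool1 is an example name):
# Run on both nodes of a shared-nothing cluster
zpool import pool1
zpool list pool1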
In the above example pool1 and pool2 are exported, snpool is
imported. To import pool1 first select it:
Then select Actions, followed by Import Pool:
The status of the pool should now
change to Imported and CLUSTERABLE:
Unclusterable Pools
Should any issues be encountered when importing the pool, it will
be marked as UNCLUSTERABLE. Check the RestAPI log
(/opt/HAC/RSF-1/log/rest-operations.log) for details on why the
import failed.
With a shared-nothing cluster, this may happen if
the pools aren't imported on both nodes.
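For example, to review the most recent entries of that log on the node where the import was attempted:
tail -n 50 /opt/HAC/RSF-1/log/rest-operations.log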
The pool is now ready for clustering.
Clustering a Pool
Highlight the desired pool
to be clustered (choose only pools marked CLUSTERABLE), then select Actions
followed by Cluster this pool:
Fill out the description and select the preferred node for the
service:
What is a preferred node
When a service is started, RSF-1 will initially attempt to run it on its
preferred node. Should that node be unavailable (the node is
down, the service is in manual mode, etc.) then the service will be started
on the next available node.
With a shared-nothing pool the GUIDs for each pool will be shown:
To add a virtual hostname to the service, click Add in the Virtual
Hostname panel. Enter the IP address, and optionally a hostname, in the
popup. For nodes with multiple network interfaces, use the drop-down
lists to select which interface the virtual hostname should be assigned
to. Click the Next button to continue:
Finally, click the Create button:
The pool will now show as CLUSTERED:
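If a virtual hostname was added, one way to verify it is active is to check that the address appears on the chosen interface of the active node and that it answers from a client (the hostname vip-pool1 here is purely illustrative):
# On the active node: the virtual IP should be listed on the selected interface
ip addr show
# From a client: the virtual hostname should respond
ping -c 3 vip-pool1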
View Cluster Status
To view the cluster status, click on the Dashboard option on the side-menu:
The dashboard shows the location of each service and the respective pool
states and failover modes (manual or automatic). The dashboard also allows
the operator to stop, start and move services in the cluster.
Select a pool then click the ⋮ button on the right-hand
side to see the available options:
Cluster Heartbeats
To view cluster heartbeat information select the Heartbeats option on the left
side-menu:
To add an additional network heartbeat to the cluster, select Add
Network Heartbeat Pair.
In this example an additional connection exists between the two nodes with the
hostnames mgub01-priv and mgub02-priv respectively. These
hostnames are then used when configuring the additional heartbeat:
Click Submit to add the heartbeat.
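Note that both private hostnames must resolve on both nodes for the new heartbeat to function, for example via /etc/hosts entries such as these (the addresses shown are illustrative):
192.168.100.1 mgub01-priv
192.168.100.2 mgub02-priv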
The new heartbeat will now be displayed on the Heartbeats status page:
This completes basic cluster configuration.
¹ RSF-1 uses broadcast packets to detect cluster nodes on
the local network. Broadcast packets are usually blocked from
traversing other networks, and therefore cluster node discovery is
usually limited to the local network only.