SSH Binding
Using SSH to bind two nodes together
A shared nothing cluster operates by creating incremental snapshots of data sets and then synchronising them between cluster nodes using ZFS send/receive over an ssh tunnel.
The ssh tunnel created and used by the synchronisation process needs to be passwordless and therefore the two nodes need to be ssh-bound. To configure ssh binding perform the following steps on each node:
- Create your ssh keys as the root user (press return to accept the defaults for all prompts):
    # ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /root/.ssh/id_rsa
    Your public key has been saved in /root/.ssh/id_rsa.pub
    The key fingerprint is:
    SHA256:2dGrTFvaGz8QJbVeVGS5sFv/deJRngPvSOr6v1SaMXc root@NodeA
    The key's randomart image is:
    +---[RSA 3072]----+
    | ...B|
    | ....= |
    | . .o+ o|
    | o ..= +.|
    | S o o+*=E|
    | o *.oX+*|
    | = =*oo=|
    | ..=o..|
    | .+oooo. |
    +----[SHA256]-----+
- Once ssh-keygen has been run, a public key is saved to /root/.ssh/id_rsa.pub. This public key now needs to be added to the file /root/.ssh/authorized_keys on the other node (if the authorized_keys file does not exist, simply create one).
- Manually ssh NodeA > NodeB, then NodeB > NodeA, and accept the prompt to add each machine to the list of known hosts:
    root@NodeA:~# ssh root@NodeB
    The authenticity of host 'NodeB (10.10.10.2)' can't be established.
    ED25519 key fingerprint is SHA256:EDmzS45TqKabZ53/35vXb4YyKTQuzJxNnbFuIwFj9UU.
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
    Warning: Permanently added 'NodeB,10.10.10.2' (ED25519) to the list of known hosts.
    Last login: Tue Sep 12 09:54:49 2023 from 10.10.10.1
    Oracle Solaris 11.4.42.111.0 Assembled December 2021
    root@NodeB:~#
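Adding the public key to authorized_keys can be scripted so that repeated runs do not create duplicate entries. The following is a minimal sketch rather than a supported tool: the function name add_authorized_key is hypothetical, and it assumes the other node's public key file has already been copied to the local node by some means.

```shell
# Append a public key to an authorized_keys file only if it is not already present.
# Usage: add_authorized_key <public-key-file> [authorized_keys-file]
# The function name and arguments are illustrative, not part of any product.
add_authorized_key() {
    pubkey_file=$1
    auth_file=${2:-/root/.ssh/authorized_keys}
    auth_dir=$(dirname "$auth_file")

    # Refuse to continue if the key file is missing or empty.
    [ -s "$pubkey_file" ] || return 1

    # Create the directory and file if needed, with permissions that
    # keep them private to the owner.
    mkdir -p "$auth_dir" && chmod 700 "$auth_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -qxF "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

For example, on NodeB, with NodeA's key saved to the hypothetical path /tmp/NodeA_id_rsa.pub, running `add_authorized_key /tmp/NodeA_id_rsa.pub` would update /root/.ssh/authorized_keys. The same would then be done on NodeA with NodeB's key.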
Once this process has been completed you should be able to ssh between nodes without being prompted for a password.
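Passwordless login can also be confirmed non-interactively. The sketch below relies on the standard ssh options BatchMode (which makes ssh fail instead of prompting) and ConnectTimeout. The function name check_passwordless and the SSH_CMD override are assumptions introduced here for illustration only.

```shell
# Return 0 if root can log in to the given host without any prompt.
# SSH_CMD is a hypothetical override so the helper can be dry-run
# with a stand-in command; by default the real ssh client is used.
check_passwordless() {
    host=$1
    # BatchMode=yes: never ask for a password or host key confirmation.
    # ConnectTimeout=5: give up after five seconds if the host is unreachable.
    ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true
}
```

Run `check_passwordless NodeB` on NodeA and `check_passwordless NodeA` on NodeB. A non-zero exit status suggests that either the key is not installed on the remote node or the host has not yet been added to the list of known hosts.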
SSH login between nodes taking a long time
If ssh is taking a long time, try running ssh -v to see any errors that may be causing the delay. A common issue is with GSS/Kerberos:
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_1000' not found
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_1000' not found
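To check quickly whether a slow login shows this GSS/Kerberos pattern, the debug output can be filtered. This is a small sketch: the function name count_gss_failures is illustrative, and it matches only the message text shown above.

```shell
# Read `ssh -v` debug output on standard input and print how many
# GSS failure messages it contains.
count_gss_failures() {
    grep -c 'Unspecified GSS failure'
}
```

Because ssh writes its debug messages to standard error, the stream needs redirecting, for example `ssh -v root@NodeB true 2>&1 | count_gss_failures`. A count above zero points to the GSS/Kerberos issue.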
GSSAPI authentication can be disabled on the nodes by modifying /etc/ssh/ssh_config and setting all Host options that begin with GSS to no. For example:
Host *
# ForwardAgent no
# ForwardX11 no
# ForwardX11Trusted yes
# PasswordAuthentication yes
# HostbasedAuthentication no
GSSAPIAuthentication no
GSSAPIDelegateCredentials no
GSSAPIKeyExchange no
GSSAPITrustDNS no
# BatchMode no
# CheckHostIP yes
# AddressFamily any
# ConnectTimeout 0
# StrictHostKeyChecking ask
# IdentityFile ~/.ssh/id_rsa
# IdentityFile ~/.ssh/id_dsa
# IdentityFile ~/.ssh/id_ecdsa
# IdentityFile ~/.ssh/id_ed25519
# Port 22
# Ciphers aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc
# EscapeChar ~
# Tunnel no
# TunnelDevice any:any
# PermitLocalCommand no
# VisualHostKey no
# ProxyCommand ssh -q -W %h:%p gateway.example.com
# RekeyLimit 1G 1h
# UserKnownHostsFile ~/.ssh/known_hosts.d/%k
SendEnv LANG LC_*
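After editing the file, it can be useful to list which GSSAPI options are active. The helper below is a sketch under simple assumptions: the function name list_gssapi_options is hypothetical, and it only looks for uncommented lines whose keyword starts with GSSAPI, without interpreting Host or Match blocks.

```shell
# Print every uncommented GSSAPI* line from an ssh client config file.
# Lines beginning with '#' do not match, so commented-out options are skipped.
list_gssapi_options() {
    config_file=${1:-/etc/ssh/ssh_config}
    # -i: ssh_config keywords are case-insensitive.
    grep -i '^[[:space:]]*GSSAPI' "$config_file"
}
```

Each line printed by `list_gssapi_options` should end in no. To try the effect before changing the file, the option can be passed for a single connection with `ssh -o GSSAPIAuthentication=no root@NodeB`.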