I have a couple of openSUSE machines that communicate with each other over a WAN connection. There are regular ssh and rsync connections between them, which run without any problem when the network is OK. However, when a network problem arises, the ssh connections sometimes hang around indefinitely even though they have timeouts of 30 seconds specified. They also don’t always clear when the network comes back.
Is there anything I can do to ensure they clear other than having to kill them?
ServerAliveCountMax
Sets the number of server alive messages (see below) which may be sent without ssh(1) receiving any messages back from the server. If this
threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important
to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through
the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive
mechanism is valuable when the client or server depend on knowing when a connection has become inactive.
The default value is 3. If, for example, ServerAliveInterval (see below) is set to 15 and ServerAliveCountMax is left at the default, if
the server becomes unresponsive, ssh will disconnect after approximately 45 seconds. This option applies to protocol version 2 only.
ServerAliveInterval
Sets a timeout interval in seconds after which if no data has been received from the server, ssh(1) will send a message through the
encrypted channel to request a response from the server. The default is 0, indicating that these messages will not be sent to the server.
This option applies to protocol version 2 only.
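As a rough sketch of how the two options quoted above fit together, the snippet below builds the ssh options and works out the approximate disconnect time using the same 15 and 3 as the man page example. The variable names and the host `remote.example.com` are placeholders of mine, not anything from the setup described in the question, and the commands are only echoed, not run.

```shell
#!/bin/sh
# Values taken from the man page example; adjust for your own link.
interval=15      # ServerAliveInterval: seconds of silence before a probe is sent
count_max=3      # ServerAliveCountMax: unanswered probes tolerated before disconnect

# Approximate time before ssh gives up on an unresponsive server.
echo "ssh should give up after roughly $((interval * count_max)) seconds"

# Options for a one-off ssh call. The same two settings can also be placed
# in ~/.ssh/config under a Host block instead.
ssh_opts="-o ServerAliveInterval=${interval} -o ServerAliveCountMax=${count_max}"

# remote.example.com and the paths are hypothetical placeholders.
echo "ssh ${ssh_opts} user@remote.example.com"
echo "rsync -a -e \"ssh ${ssh_opts}\" /some/dir/ user@remote.example.com:/some/dir/"
```

Passing the options through rsync's `-e` is one way to have the rsync transfers pick up the same behaviour as the plain ssh sessions.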
As it happens, the network did have problems again yesterday, and it appears that using the ServerAlive options may have worked. It is difficult to know for sure, as it all happened when I wasn’t around to see exactly what was going on, but since it all settled down quietly, it suggests that the ssh connections did drop out.
If the SSH command restarted you should be able to tell by its start time
(ps aux output) if things are refreshed. Even if not, if you look at
the ports in use before/after a drop the chances of the high port used
(from the client to the server) being the same both times are pretty
stinkin’ small… so check those by running the following before/after,
maybe also grepping for the server IP (from the client side) or the client
IP (from the server side):
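The exact command from this reply is not preserved here. As a stand-in, here is a minimal sketch of one way to take such a before/after snapshot, assuming `ps` from procps and either `ss` or `netstat` are installed. The function name `snapshot` and the peer address `192.0.2.10` are hypothetical placeholders.

```shell
#!/bin/sh
# PEER is an assumed placeholder (documentation address range); pass the
# server IP on the client side, or the client IP on the server side.
PEER="${1:-192.0.2.10}"

snapshot() {
    date
    # Start time (lstart) and elapsed time (etime) of running ssh processes;
    # a restarted ssh shows a newer start time.
    ps -eo pid,lstart,etime,args | grep '[s]sh ' || echo "no ssh processes found"
    # TCP connections involving the peer; after a reconnect the client's
    # high (ephemeral) port will almost certainly be different.
    if command -v ss >/dev/null 2>&1; then
        ss -tn | grep -F "$PEER" || echo "no connections involving $PEER"
    else
        netstat -tn | grep -F "$PEER" || echo "no connections involving $PEER"
    fi
}

snapshot
```

Run it once before and once after a network drop and compare the start times and the port numbers.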