ssh hangs when there is a network issue

I have a couple of openSUSE machines that communicate with each other over a WAN connection. Regular ssh and rsync connections between them run without any problem when the network is OK. However, when a network problem arises, the ssh connections sometimes hang around indefinitely even though they have timeouts of 30 seconds specified. They also don’t always clear when the network comes back.

Is there anything I can do to ensure they clear other than having to kill them?

> connections hang around indefinitely even though they have timeouts of
> 30 seconds specified.

How/Where did you set this timeout?

Good luck.

This is from man ssh_config. I cut out the part I think should help you.

     ServerAliveCountMax
             Sets the number of server alive messages (see below) which may be sent without ssh(1) receiving any messages back from the server.  If this
             threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session.  It is
             important to note that the use of server alive messages is very different from TCPKeepAlive (below).  The server alive messages are sent
             through the encrypted channel and therefore will not be spoofable.  The TCP keepalive option enabled by TCPKeepAlive is spoofable.  The
             server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.


             The default value is 3.  If, for example, ServerAliveInterval (see below) is set to 15 and ServerAliveCountMax is left at the default, if
             the server becomes unresponsive, ssh will disconnect after approximately 45 seconds.  This option applies to protocol version 2 only.


     ServerAliveInterval
             Sets a timeout interval in seconds after which if no data has been received from the server, ssh(1) will send a message through the
             encrypted channel to request a response from the server.  The default is 0, indicating that these messages will not be sent to the server.
             This option applies to protocol version 2 only.

I set the timeout with the ConnectTimeout option.

Thanks for the tip on the ServerAlive settings, it looks as though this may be what I need to add to the commands.
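
For anyone finding this later, here is a rough sketch of what adding the ServerAlive settings to the commands might look like. ConnectTimeout only bounds connection setup (the TCP connect and initial handshake), which is probably why the 30 second timeout never fired on sessions that were already established. The host name, the paths, the remote `uptime` command and the rsync `-az` flags are placeholders chosen for illustration, not taken from the original setup.

```shell
#!/bin/sh
# Hypothetical remote host; replace with your own.
REMOTE="user@remote.example.com"

INTERVAL=15   # ServerAliveInterval: seconds of silence before a probe is sent
COUNT=3       # ServerAliveCountMax: unanswered probes tolerated (default is 3)

# ConnectTimeout covers only connection setup; the ServerAlive options
# cover the established session.
SSH_OPTS="-o ConnectTimeout=30 -o ServerAliveInterval=$INTERVAL -o ServerAliveCountMax=$COUNT"

# Approximate time before ssh gives up on an unresponsive server.
echo "dead link detected after about $((INTERVAL * COUNT)) seconds"

# The commands are only echoed so this sketch has no side effects;
# drop the leading 'echo' to actually run them.
echo ssh $SSH_OPTS "$REMOTE" uptime
echo rsync -az -e "ssh $SSH_OPTS" /local/dir/ "$REMOTE:/remote/dir/"
```

The same three options could instead go into a Host block in ~/.ssh/config, so that every ssh and rsync call to that host picks them up without changing the commands.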

Unfortunately, because the problem doesn’t arise very often, it might be some time before I can report progress.

As it happens, the network had problems again yesterday, and it appears that using ServerAlive may have worked. It is difficult to know for sure, as it all happened when I wasn’t around to see exactly what was going on, but since everything settled down quietly it suggests that the ssh connections did drop out.

Thanks for the tip.

If the SSH command restarted you should be able to tell by its start time
(ps aux output) if things are refreshed. Even if not, if you look at
the ports in use before/after a drop the chances of the high port used
(from the client to the server) being the same both times are pretty
stinkin’ small… so check those by running the following before/after,
maybe also grepping for the server IP (from the client side) or the client
IP (from the server side):

Code:

/usr/sbin/ss -planeto | grep :22 | grep ESTAB
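
To make the before/after comparison less of an eyeball job, something along these lines might help. It is only a sketch: the `ssh_conns` helper and the /tmp file names are made up here, and it assumes the usual `ss -tn` column layout (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port), so check that against your own output first.

```shell
# Print "local -> peer" for established TCP connections whose peer port is 22.
# Reads `ss -tn` style output on stdin, so it also works on saved output.
# Assumes column 1 is the state, column 4 the local and column 5 the peer address.
ssh_conns() {
    awk '$1 == "ESTAB" && $5 ~ /:22$/ { print $4, "->", $5 }'
}

# Usage (run on the client): snapshot before and after the outage, then compare.
#   ss -tn | ssh_conns > /tmp/ssh_before
#   ... network drop and recovery ...
#   ss -tn | ssh_conns > /tmp/ssh_after
#   diff /tmp/ssh_before /tmp/ssh_after
```

If the client-side high port differs between the two snapshots, the old connection was dropped and a new one opened.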

Good luck.