I have just configured the client to go to sleep (suspend to RAM) after 10 minutes of inactivity. Sometimes sshfs still works after resuming, other times it doesn’t. I think it is related to the time that passes before resuming from sleep, but I can’t find where this timeout is set.
On the server, the sshd_config is:
# $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $
# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.
# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin
# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.
#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key
# Ciphers and keying
#RekeyLimit default none
# Logging
#SyslogFacility AUTH
#LogLevel INFO
# Authentication:
#LoginGraceTime 2m
#PermitRootLogin no
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10
#PubkeyAuthentication yes
# The default is to check both .ssh/authorized_keys and .ssh/authorized_keys2
# but this is overridden so installations will only check .ssh/authorized_keys
AuthorizedKeysFile .ssh/authorized_keys
#AuthorizedPrincipalsFile none
#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody
# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes
# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no
# Change to no to disable s/key passwords
#ChallengeResponseAuthentication yes
# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no
# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no
# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes
#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
#PrintMotd yes
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none
# no default banner path
#Banner none
# override default of no subsystems
Subsystem sftp /usr/lib/ssh/sftp-server
# This enables accepting locale enviroment variables LC_* LANG, see sshd_config(5).
AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
**AlphaTauri:~ #**
ServerAliveInterval: number of seconds that the client will wait before sending a null packet to the server (to keep the connection alive). ClientAliveInterval: number of seconds that the server will wait before sending a null packet to the client (to keep the connection alive).
Setting a value of 0 (the default) will disable these features so your connection could drop if it is idle for too long.
… and it seems that is what is happening, but what is “too long”? I can’t find it.
The issue is suspending the machine in the first place. TCP isn’t designed for that, and many standards-compliant devices in between simply drop the connection. That’s what the timeout is for: to detect a “broken” connection and free up system resources. Once the client is suspended, it can no longer respond to keepalive requests from the server. The server then considers the client unresponsive (which is true, since a suspended or hibernated client really is unresponsive) and resets the broken connection. When the client wakes up, it restores its state as if it had never been suspended and tries to send data over a connection it believes is still open. As the server has already closed it, the client gets a RST (reset) and either “fails” or tries to re-establish the connection.
I’m not sure whether or how this can be done, but a “proper” way would be to disconnect before going to sleep and reconnect after waking up. There are several analogies, a simple phone call for example: no one would keep a call going while sleeping, but would rather hang up before going to sleep and redial after waking up. Otherwise the other side will most likely hang up in the meantime, and that’s exactly what’s happening here.
TCP isn’t a magic black box that only has to be connected once and can then be used until someone actively disconnects. The connection has to be checked regularly to see whether it still works. If it doesn’t, it’s assumed broken and reset.
If you want your connection to stay alive, just set your power settings so the system doesn’t suspend or hibernate. Otherwise, try to implement a proper disconnect before suspension and a reconnect after waking up.
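One way the "disconnect before suspension" idea could be sketched is a systemd sleep hook. systemd-sleep runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument. The file name, the mount point /mnt/remote and the DRY_RUN switch below are assumptions made up for illustration, and on FUSE 3 systems the tool may be called fusermount3 instead.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /usr/lib/systemd/system-sleep/sshfs-unmount
# (must be executable). systemd-sleep passes $1 = pre|post.

# Assumption: where the sshfs share is mounted.
MOUNTPOINT="${MOUNTPOINT:-/mnt/remote}"

sleep_hook() {
    case "$1" in
        pre)
            # Lazy unmount, so a stalled connection cannot block suspend.
            if [ "${DRY_RUN:-0}" = 1 ]; then
                echo "fusermount -u -z $MOUNTPOINT"
            else
                fusermount -u -z "$MOUNTPOINT"
            fi
            ;;
        *)
            # Nothing on resume: remount by hand or let an automounter do it.
            return 0
            ;;
    esac
}

sleep_hook "${1:-}"
```

Running it with DRY_RUN=1 only prints the unmount command, which is a safe way to check the logic before installing the hook.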
I understand what you’re saying, but I was just testing it, expecting the connection to be broken after sleep/resume, and that’s not the case: I sleep/resume and the connection is still alive and works fine. The connection dies only if the client sleeps for a long time; I’m not sure how long, but I think at least 2 hours. So the question is: where can I configure the timeout length? Setting it to 24h, for instance, would work for me.
ClientAliveCountMax
Sets the number of client alive messages which may be sent without sshd(8) receiving any messages back from
the client. If this threshold is reached while client alive messages are being sent, sshd will disconnect
the client, terminating the session. It is important to note that the use of client alive messages is very
different from TCPKeepAlive. The client alive messages are sent through the encrypted channel and therefore
will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The client alive
mechanism is valuable when the client or server depend on knowing when a connection has become unresponsive.
The default value is 3. If ClientAliveInterval is set to 15, and ClientAliveCountMax is left at the
default, unresponsive SSH clients will be disconnected after approximately 45 seconds. Setting a zero
ClientAliveCountMax disables connection termination.
ClientAliveInterval
Sets a timeout interval in seconds after which if no data has been received from the client, sshd(8) will
send a message through the encrypted channel to request a response from the client. The default is 0,
indicating that these messages will not be sent to the client.
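The arithmetic in the man page excerpt can be written out as a small sketch. The function name client_alive_timeout is made up for illustration; it only multiplies the two settings and treats a zero in either one as "disabled", following the text quoted above.

```shell
#!/bin/sh
# Seconds until sshd drops an unresponsive client:
# ClientAliveInterval * ClientAliveCountMax.
client_alive_timeout() {
    interval="$1"
    count_max="$2"
    if [ "$interval" -eq 0 ] || [ "$count_max" -eq 0 ]; then
        # Interval 0: no client alive messages are sent.
        # CountMax 0: per the excerpt above, termination is disabled.
        echo "disabled"
    else
        echo $((interval * count_max))
    fi
}

client_alive_timeout 15 3    # the man page example
client_alive_timeout 0 3     # the defaults from the sshd_config above
```

With the defaults from the posted sshd_config this mechanism is off, so it is probably not what closes the connection here. The values the running server actually uses can be listed as root with `sshd -T | grep -i clientalive`.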
If the client was suspended while the server was transmitting data, the server will close the connection after its TCP retransmission attempts run out (15 by default on Linux).
There is a TCP-level timeout that is enabled by default (TCPKeepAlive). This is independent of application-level keepalives. With default Linux settings it amounts to slightly more than 2 hours.
If the client is behind NAT, the connection tracking timeout is likely much shorter than anything else, and the connection will be dropped.
This really must be supported in the application itself, i.e. sshfs must be able to transparently establish a new connection and continue.
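sshfs does have a reconnect option that aims at this. A minimal sketch, assuming placeholder values for the remote path and mount point and non-interactive (key-based) authentication; whether it survives a long suspend in a given setup would need testing:

```shell
#!/bin/sh
# Assumptions: REMOTE and MOUNTPOINT are placeholders for illustration.
REMOTE="user@server:/data"
MOUNTPOINT="/mnt/remote"

# reconnect: sshfs re-establishes the ssh session when it drops.
# ServerAliveInterval/ServerAliveCountMax: the client notices a dead
# link after roughly 15 * 3 = 45 seconds instead of hanging.
SSHFS_OPTS="reconnect,ServerAliveInterval=15,ServerAliveCountMax=3"

# Print the mount command for review instead of running it.
sshfs_cmd() {
    echo "sshfs $REMOTE $MOUNTPOINT -o $SSHFS_OPTS"
}

sshfs_cmd
```

The printed line can be run as-is once the placeholders are replaced.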
While they can be set on the fly with the sysctl command (see man sysctl for more info), for a permanent configuration add the required parameter(s) to /etc/sysctl.conf, or create something like /etc/sysctl.d/95-custom.conf and add them there. (Take care to understand what they do first, as there may be unintended impacts on the system.)
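Assuming the parameters meant here are the kernel's TCP keepalive settings (net.ipv4.tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes), the "slightly more than 2 hours" figure can be reproduced with a short sketch. The function name keepalive_timeout is made up for illustration.

```shell
#!/bin/sh
# Idle time before the kernel gives up on a silent peer:
# tcp_keepalive_time + tcp_keepalive_probes * tcp_keepalive_intvl
keepalive_timeout() {
    # Arguments: time intvl probes
    echo $(( $1 + $3 * $2 ))
}

# Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
keepalive_timeout 7200 75 9    # prints 7875, i.e. 2 h 11 min 15 s

# Read the live values (Linux only; falls back to the defaults elsewhere).
t=$(cat /proc/sys/net/ipv4/tcp_keepalive_time 2>/dev/null || echo 7200)
i=$(cat /proc/sys/net/ipv4/tcp_keepalive_intvl 2>/dev/null || echo 75)
p=$(cat /proc/sys/net/ipv4/tcp_keepalive_probes 2>/dev/null || echo 9)
echo "current keepalive timeout: $(keepalive_timeout "$t" "$i" "$p") s"
```

As an untested example, `sysctl -w net.ipv4.tcp_keepalive_time=86400` would raise the idle period to 24 hours on the machine where it is run. It affects every socket on that machine that uses SO_KEEPALIVE, not only ssh.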
When NFS is a possibility, I do not understand why you considered something else.
And if you are going to use NFS, I would recommend using the systemd automount feature in the fstab entries on your client. This will mount when needed and unmount after a timeout when no longer needed. That alone would already avoid many broken connections.
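A sketch of what such an fstab entry could look like, printed by a small helper so it can be reviewed before being added to /etc/fstab. The helper name, the server, the export path, the mount point and the 600 second idle timeout are all assumptions for illustration; the x-systemd.* options are the ones documented in systemd.mount(5).

```shell
#!/bin/sh
# Print a candidate /etc/fstab line for an NFS share mounted on demand.
# noauto: do not mount at boot; x-systemd.automount: mount on first access;
# x-systemd.idle-timeout: unmount after that many idle seconds.
fstab_automount_entry() {
    # Arguments: source mountpoint idle_seconds
    printf '%s %s nfs noauto,x-systemd.automount,x-systemd.idle-timeout=%s 0 0\n' "$1" "$2" "$3"
}

# Hypothetical server, export and mount point.
fstab_automount_entry server:/export /mnt/nfs 600
```

After adding such a line, `systemctl daemon-reload` makes systemd regenerate its mount units from fstab.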
Well, I configured it so that I could use it both from outside home and from inside, with the same method. In fact, when I use it from outside I only use it for a few minutes, while at home I have it mounted all day, so I think you’re right and it will be better to use different methods.
The kernel defines when TCP keepalive probes start and how many times they are retried. Whether they are used at all is controlled by the per-socket option SO_KEEPALIVE, which is set by TCPKeepAlive.
To avoid any timeout, one needs at least to disable TCPKeepAlive (on both sides), ClientAliveInterval, ServerAliveInterval and the firewall on both sides. If there is any NAT in between, the connection will likely time out anyway.
It’s not for me to judge, but tinkering with timeouts meant to detect a fault just because you seem to prefer suspension/hibernation over a clean shutdown sounds quite wrong to me.