Can't copy large files from Win10 to Samba share

Hi,

I have a remote Leap 15.2 machine at the end of a wireguard VPN. I am trying to copy a 0.5GB file to it from a Win10 box and it fails with the useful (or rather ‘not useful’) Windows error message after 1 or 2 percent is copied:

“0X8007003B: an unexpected network error has occurred”

The remote filesystem is ext4, so I don’t think the actual size is an issue.

On this remote Leap 15.2 machine I can run a Windows 7 VM (using KVM/QEMU) and can copy to a share on that VM with no issues (the VPN connection allows me to see shares on VMs on the remote machine as long as I address them by IP address or a resolvable name). So it kind of feels like a config issue with Samba on the Leap 15.2 box, maybe a timeout or similar.

Samba version is:

Version 4.11.14-git.247.8c858f7ee14lp152.3.19.1-SUSE-oS15.0-x86_64

And the top of the smb.conf file is as follows:

# smb.conf is the main Samba configuration file. You find a full commented
# version at /usr/share/doc/packages/samba/examples/smb.conf.SUSE if the
# samba-doc package is installed.
[global]
    workgroup = BASE
    passdb backend = tdbsam
    printing = cups
    printcap name = cups
    printcap cache time = 750
    cups options = raw
    map to guest = Bad User
    logon path = \\%L\profiles\.msprofile
    logon home = \\%L\%U\.9xprofile
    logon drive = P:
    usershare allow guests = No
    ldap admin dn = 
    wins support = Yes

[Maildrop]
    comment = Maildrop
    inherit acls = Yes
    path = /Maildrop
    read only = No




Just wondered if anyone else had come across this and a fix.

Do you get anything in the journal on the Leap machine? As root, run:

journalctl -f

And now copy that file.

Just to add a bit more info.

I can copy large files from the remote Leap15.2 machine to the Windows 10 machine with no problems. It’s the copy of large files to the Leap machine that is causing errors.

In all cases the transfers were initiated from the Windows 10 machine.

This is what I got before the file copy started and then nothing more when the copy failed:

skylab:~ # journalctl -f
-- Logs begin at Fri 2021-07-30 17:07:13 BST. --
Jul 30 17:09:29 skylab nmbd[2199]: [2021/07/30 17:09:29.243555,  0] ../../source3/nmbd/nmbd_become_lmb.c:397(become_local_master_stage2)
Jul 30 17:09:29 skylab nmbd[2199]:   *****
Jul 30 17:09:29 skylab nmbd[2199]: 
Jul 30 17:09:29 skylab nmbd[2199]:   Samba name server SKYLAB is now a local master browser for workgroup BASE on subnet 192.168.27.2
Jul 30 17:09:29 skylab nmbd[2199]: 
Jul 30 17:09:29 skylab nmbd[2199]:   *****
Jul 30 17:09:29 skylab nmbd[2199]: [2021/07/30 17:09:29.243917,  0] ../../source3/nmbd/nmbd_browsesync.c:354(find_domain_master_name_query_fail)
Jul 30 17:09:29 skylab nmbd[2199]:   find_domain_master_name_query_fail:
Jul 30 17:09:29 skylab nmbd[2199]:   Unable to find the Domain Master Browser name BASE<1b> for the workgroup BASE.
Jul 30 17:09:29 skylab nmbd[2199]:   Unable to sync browse lists in this workgroup.

A few minutes later this came up:

Jul 30 17:13:43 skylab nmbd[2199]: [2021/07/30 17:13:43.476514,  0] ../../source3/nmbd/nmbd_become_lmb.c:397(become_local_master_stage2)
Jul 30 17:13:43 skylab nmbd[2199]:   *****
Jul 30 17:13:43 skylab nmbd[2199]: 
Jul 30 17:13:43 skylab nmbd[2199]:   Samba name server SKYLAB is now a local master browser for workgroup BASE on subnet 172.17.0.1
Jul 30 17:13:43 skylab nmbd[2199]: 
Jul 30 17:13:43 skylab nmbd[2199]:   *****
Jul 30 17:13:43 skylab nmbd[2199]: [2021/07/30 17:13:43.476799,  0] ../../source3/nmbd/nmbd_browsesync.c:354(find_domain_master_name_query_fail)
Jul 30 17:13:43 skylab nmbd[2199]:   find_domain_master_name_query_fail:
Jul 30 17:13:43 skylab nmbd[2199]:   Unable to find the Domain Master Browser name BASE<1b> for the workgroup BASE.
Jul 30 17:13:43 skylab nmbd[2199]:   Unable to sync browse lists in this workgroup.
Jul 30 17:13:43 skylab nmbd[2199]: [2021/07/30 17:13:43.476857,  0] ../../source3/nmbd/nmbd_browsesync.c:354(find_domain_master_name_query_fail)
Jul 30 17:13:43 skylab nmbd[2199]:   find_domain_master_name_query_fail:
Jul 30 17:13:43 skylab nmbd[2199]:   Unable to find the Domain Master Browser name BASE<1b> for the workgroup BASE.
Jul 30 17:13:43 skylab nmbd[2199]:   Unable to sync browse lists in this workgroup

This all appears to be wins nameserver related. For reference:

192.168.27.0/24 is the remote network that the Leap box sits on.
172.17.0.1 is the virtual interface for a container so nothing to do with this I hope.
The wireguard vpn interface is masqueraded to allow access to all of the remote network so I suspect nmbd is not attempting to register itself on that interface.

FWIW, a similar thread I recall…
https://www.linuxquestions.org/questions/linux-networking-3/problem-with-samba-large-files-copy-win10-to-ubuntu-4175677436/

There are some links in post#2 that may be of interest. That’s about all I can offer.

In particular, this may be of interest (focussing on the Windows client)…
https://answers.microsoft.com/en-us/windows/forum/all/large-file-transfer-failure-error-0x8007003b/1946f3b3-779c-46f4-8329-2f7154ace974

Edit: Some users have found this behaviour occurring due to Windows firewalls (or similar third-party software installed).

@deano_ferrari Thanks for the links. Sadly, neither the firewall suggestions nor changing the SMB timeout helped.

One strange thing I have seen though, and maybe a clue. I can copy from another Linux box to the remote Leap 15.2 box, and when I do that and monitor the file on the remote machine from my Win10 box, I can see the file size increasing in real time. When I do the copy from Windows, the file that is created immediately shows the final size. I don’t know what to make of it, just wondering if creating the file at its full size up front causes something to time out.

Is using WinSCP a viable option for you?

It might also be worth examining the Windows Event Viewer for any potential network-related issues perhaps…
https://docs.microsoft.com/en-us/host-integration-server/core/windows-event-viewer1

Nothing in event viewer that I can see. Also nothing useful in the logs.

Interestingly, when the transfer stalls at about 1 or 2% but before it times out, any other network traffic to the remote box (like an ssh session) is also blocked. So it feels like an underlying network issue as opposed to an SMB-related issue. Not sure how to test that, any suggestions?

I would check whether increasing the Samba log level gives more relevant details.
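As a rough sketch of what that could look like, the fragment below raises the log level in the [global] section of smb.conf. The value 3 and the log size are only assumed starting points, not recommendations from this thread; smbd has to re-read its configuration before the change takes effect, and on openSUSE the logs usually end up under /var/log/samba/.

```
[global]
    # Assumed value for debugging; higher numbers are noisier.
    log level = 3
    # Maximum size of a log file in KiB before it is rotated.
    max log size = 10000
```

Remember to set the level back afterwards, since verbose logging can fill the disk quickly during a large transfer.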

If you want to do a deep-dive there is Wireshark.

Capture the traffic at the receiving side and/or transmitting side and ignore everything apart from just before the stall happens.
It will not give an exact answer, but it can give pointers on where to look. It does require some insight into the protocols, though.

Managed to move the remote machine onto my local network (long cable!), which removed WireGuard from the configuration, and all was well. So it seems to be a WireGuard issue … but only affecting a Samba server and not a Win7 server :\

Searching “wireguard” and “samba” turns up a lot of similar reports.

Resorted to NFS to see if that would work any better but sadly not. Again the whole link gets blocked and then it times out.

MTU problem?

https://keremerkan.net/posts/wireguard-mtu-fixes/
Just the first search hit …
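For anyone following the MTU lead, the arithmetic behind the usual WireGuard MTU values can be sketched in a few lines of shell. `wg_mtu` is a made-up helper name for illustration. The overhead figures are the standard header sizes: 20 bytes for an outer IPv4 header (40 for IPv6), 8 bytes of UDP, and 32 bytes of WireGuard data-packet framing.

```shell
#!/bin/sh
# Sketch: largest tunnel MTU that fits inside a given link MTU.
# wg_mtu is a hypothetical helper, not part of any WireGuard tool.
wg_mtu() {
    link_mtu=$1   # MTU of the underlying link, e.g. 1500 for Ethernet
    outer=$2      # IP version carrying the tunnel: 4 or 6
    case "$outer" in
        4) overhead=60 ;;   # 20 (IPv4) + 8 (UDP) + 32 (WireGuard)
        6) overhead=80 ;;   # 40 (IPv6) + 8 (UDP) + 32 (WireGuard)
        *) echo "usage: wg_mtu LINK_MTU 4|6" >&2; return 1 ;;
    esac
    echo $(( link_mtu - overhead ))
}

wg_mtu 1500 4   # prints 1440
wg_mtu 1500 6   # prints 1420
wg_mtu 1492 4   # prints 1432 (e.g. a PPPoE link)
```

If the real path between the two ends has a smaller MTU than the tunnel assumes, full-size packets can get dropped, which would fit a transfer that starts and then stalls. With wg-quick the tunnel value can be set with an `MTU = ` line in the `[Interface]` section.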

That’s exactly where I am now …

Irrespective of the file size it stalls after ~7MB. Is that a magic number or just a random buffer size?
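One way to check the MTU theory from a Linux box on the near side of the tunnel is to send pings with "don't fragment" set and see which sizes get through. The script below is only a sketch: `icmp_payload`, `probe_mtu` and `find_path_mtu` are made-up helper names, the 1500 to 1200 range and the step of 10 are arbitrary, and the remote address has to be supplied by you. It assumes the iputils version of ping (`-M do`).

```shell
#!/bin/sh
# Sketch: find a packet size that crosses the tunnel unfragmented.

# An IPv4 ICMP echo has 20 bytes of IP header and 8 bytes of ICMP header,
# so a packet that fills an MTU of N carries a payload of N - 28 bytes.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Send one echo of the given total size to host $1 with DF set.
# Returns success only if a reply comes back.
probe_mtu() {
    ping -c 1 -W 2 -M do -s "$(icmp_payload "$2")" "$1" >/dev/null 2>&1
}

# Walk down from 1500 in steps of 10 and print the first size that passes.
find_path_mtu() {
    mtu=1500
    while [ "$mtu" -ge 1200 ]; do
        if probe_mtu "$1" "$mtu"; then
            echo "$mtu"
            return 0
        fi
        mtu=$(( mtu - 10 ))
    done
    return 1
}

# Usage (not run automatically; replace the placeholder address):
#   find_path_mtu <address-of-remote-box>
```

From the Windows side, `ping -f -l 1392 <address>` does the same kind of single probe (1392 is the payload for a 1420-byte packet). If sizes at or below the tunnel's configured MTU fail, lowering the WireGuard MTU on both ends would be the next thing to try.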