"Disappearing" filesystems

Hey there -

Dual-boot Lenovo type 20EV (business notebook with zero pizazz but a great keyboard) and Windows 10/Leap 15.1 (all partitions ext4) checking in with a weird one:

I didn’t go with btrfs because I’m not a masochist and (last I knew) ext2/3/4 just work, but now I’m beginning to wonder if I should swallow my ethics and go with ReiserFS?

When using the system - or sometimes when it is idle - ALL the files will just “go away.” If some process is running under X and tries to use the disk, I might get a last-gasp message that an operation could not be completed because the filesystem is read-only, then nothing.

If perchance a terminal is open, I get to see first hand such nuggets as:

  • if I type ls, I get “bash: /usr/bin/ls: No such file or directory”
  • if I type “echo *”, I get an * (I don’t like that AT ALL)
  • if I cd to a real directory, it cds and pwd works as expected
  • if I try to cd to a protected directory such as /root, it refuses
  • if I try to cd to a non-existent directory, it fails
  • “echo ‘blah’ > file” fails read only

No command/file is “there,” but the directory structure appears to be intact; however, “echo *” should show at least any subdirectories present wherever I am in the filesystem.

Nothing useful in /var/log; here’s a snippet of a borked messages, showing that things went south lickety split:

2019-11-13T06:36:36.891506-06:00 nomad systemd[1]: Reached target Multi-User System.
2019-11-13T06:36:36.892608-06:00 nomad systemd[1]: Reached target Graphical Interface.
2019-11-13T06:36:36.894712-06:00 nomad systemd[1]: Starting Update UTMP about System Runlevel Changes…
2019-11-13T06:36:36.904228-06:00 nomad cron[1711]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 5% if used.)
2019-11-13T06:36:36.908241-06:00 nomad cron[1711]: (CRON) INFO (running with inotify support)
2019-11-13T06:36:36.919884-06:00 nomad systemd[1]: Started Update UTMP about System Runlevel Changes.
2019-11-13T06:36:36.920989-06:00 nomad systemd[1]: Startup finished in 4.055s (kernel) + 6.582s (initrd) + 16.560s (userspace) = 27.197s.
2019-11-13T06:36:39.064205-06:00 nomad nscd: 1134 checking for monitored file `/etc/resolv.conf’: No such file or directory
nomad systemd[1]: systemd 234 running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 -IDN default-hierarchy=hybrid)
2019-11-13T07:31:38.318475-06:00 nomad systemd[1]: Detected architecture x86-64.
2019-11-13T07:31:38.318494-06:00 nomad systemd[1]: nss-lookup.target: Dependency Before=nss-lookup.target dropped
2019-11-13T07:31:38.317202-06:00 nomad kernel: [    0.000000] microcode: microcode updated early to revision 0xcc, date = 2019-04-01
2019-11-13T07:31:38.318505-06:00 nomad kernel: [    0.000000] Linux version 4.12.14-lp151.28.25-default (geeko@buildhost) (gcc version 7.4.1 20190905 [gcc-7-branch revision 275407] (SUSE Linux) ) #1 SMP Wed Oct 30 08:39:59 UTC 2019 (54d7657)
2019-11-13T07:31:38.318511-06:00 nomad kernel: [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.12.14-lp151.28.25-default root=UUID=15f58bef-a605-4638-8fab-3192e8c76671 splash=silent resume=/dev/disk/by-id/ata-MKNSSDRE240GB-3D_MK1704111003545DF-part6 mitigations=auto quiet
2019-11-13T07:31:38.318522-06:00 nomad kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: ‘x87 floating point registers’
2019-11-13T07:31:38.318498-06:00 nomad systemd[1]: Starting Flush Journal to Persistent Storage…
2019-11-13T07:31:38.318528-06:00 nomad kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: ‘SSE registers’
2019-11-13T07:31:38.318530-06:00 nomad kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: ‘AVX registers’

The long string of nulls is where things died; the file resumes following a restart. The successful whining about resolv.conf just before the bazillion ^@s implies that /var still worked for an instant after / went 'bye 'bye. However, /etc/resolv.conf points to a file under /var, so it must’ve failed to find the link while the file was still “there.”

This might happen a few seconds or a few hours into a session; it pretty much always happens. There’s no correlation I can find between what I’m doing and the crashes; CPU usage, a particular operation or what have you.

Windows 10 is as stable and reliable as an anvil, so I do not suspect a hardware problem.

Help?

Changing to ReiserFS is unlikely to solve your problem.

In my experience, when there is a hardware-related fault, the file system gets remounted as read-only. That might be what is happening.
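One way to check for the read-only remount described above is to look at the mount options in /proc/mounts. The sketch below is only an illustration: `mount_is_ro` is a made-up helper name, and it assumes /proc is still readable when the problem strikes. It deliberately uses nothing but shell builtins, so it does not depend on binaries under /usr/bin that may have "gone away."

```shell
#!/bin/sh
# Sketch: report whether a mount point is currently mounted read-only.
# mount_is_ro is a hypothetical helper name, not an existing tool.
# Usage: mount_is_ro MOUNTPOINT [MOUNTS_FILE]
#        MOUNTS_FILE defaults to /proc/mounts.
mount_is_ro() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    opts=""
    # Fields in /proc/mounts: device mountpoint fstype options dump pass.
    # A later entry shadows an earlier one, so keep the last match.
    while read -r _dev mp _type mopts _rest; do
        if [ "$mp" = "$mountpoint" ]; then
            opts=$mopts
        fi
    done < "$mounts_file"
    # Wrap in commas so "ro" only matches as a whole option,
    # not as part of something like errors=remount-ro.
    case ",$opts," in
        *,ro,*) return 0 ;;   # mounted read-only
        *)      return 1 ;;   # read-write, or mount point not listed
    esac
}
```

Defined in a terminal ahead of time, it could be used as `mount_is_ro / && echo "root is read-only"` once the symptoms appear.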

Perhaps your disk is spinning down after being idle for a while, and is giving errors when attempting to spin up again. (I’m just guessing there).

Nrickert is right that you are describing symptoms of a root filesystem that is read-only. Your 4.12.14-lp151.28.25 isn’t the newest kernel. Maybe updating would help.

Looking through the kernel messages with »dmesg« or »journalctl« (which also incorporates kernel messages) could be worthwhile here. For a list of suspicious lines, here’s what I like to do:

rig:~/Desktop ▶ journalctl -b --no-hostname --output=short-precise | grep -Ei '\S*( no |warn|\Werr|fail|conflict|igno|repeat|n[^a-n p-z]t )\S*\s*\S*'
Nov 24 09:10:55.550805 kernel: No NUMA configuration found
Nov 24 09:10:55.551672 kernel: x2apic: IRQ remapping doesn't support X2APIC mode
Nov 24 09:10:55.551971 kernel: ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
Nov 24 09:10:55.552011 kernel: pmd_set_huge: Cannot satisfy [mem 0xf8000000-0xf8200000] with a huge-page mapping due to MTRR override.
Nov 24 09:10:55.552028 kernel: core: PMU erratum BJ122, BV98, HSD29 workaround disabled, HT off
Nov 24 09:10:55.552105 kernel: ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
Nov 24 09:10:55.552373 kernel: acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug PME]
Nov 24 09:10:55.557023 kernel: system 00:06: [mem 0xfee00000-0xfeefffff] could not be reserved
Nov 24 09:10:55.558297 kernel: GHES: HEST is not enabled!
Nov 24 09:10:55.558334 kernel: i8042: PNP: No PS/2 controller found.
Nov 24 09:10:55.558620 kernel: ima: No TPM chip found, activating TPM-bypass! (rc=-19)
Nov 24 09:10:55.558757 kernel: PM: Hibernation image not present or could not be loaded.
Nov 24 09:10:55.631669 kernel: ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
Nov 24 09:10:55.645980 systemd-udevd[243]: link_config: Cannot get device settings for lo : Operation not supported
Nov 24 09:10:55.656696 kernel: ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
Nov 24 09:10:55.681372 kernel: xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
Nov 24 09:10:55.997247 kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 24 09:10:55.997317 kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 24 09:10:56.000211 kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov 24 09:10:56.417868 systemd-udevd[440]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov 24 09:10:56.524123 systemd[1]: display-manager.service: PID file /var/run/displaymanager.pid not readable (yet?) after start: No such file or directory
Nov 24 09:10:56.534315 kdm[507]: plymouth is NOT running
Nov 24 09:10:56.600199 systemd-udevd[433]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov 24 09:10:57.169854 mtp-probe[534]: bus: 3, device: 4 was not an MTP device
Nov 24 09:10:57.169866 mtp-probe[533]: bus: 3, device: 3 was not an MTP device
Nov 24 09:10:57.569328 kernel: random: 7 urandom warning(s) missed due to ratelimiting
Nov 24 09:11:02.465882 kuiserver5[819]: Locale not supported by C library.
Nov 24 09:42:45.423528 kwalletd5[1136]: Locale not supported by C library.
Nov 24 09:42:45.563373 dbus-daemon[586]: [session uid=1000 pid=586] Activated service 'org.kde.kwalletd' failed: Failed to execute program org.kde.kwalletd: No such file or directory
Nov 24 10:51:56.229329 vlc[1541]: QObject::~QObject: Timers cannot be stopped from another thread
rig:~/Desktop ▶ _

(One of my whimsical side-hobbies around Linux is making it boot as quickly as possible with as little potential for problems as possible; maybe this helps you close in on the problem you have, mayor_snorkum, be it hardware- or software-related.)
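For the specific case of a filesystem dropping to read-only, a narrower filter than the general one above may be easier to read. This is a rough sketch, not a definitive list: `storage_errors` is a hypothetical helper name, and the patterns are assumptions based on messages the kernel commonly prints around ext4 and ATA trouble.

```shell
#!/bin/sh
# Sketch: keep only kernel log lines that commonly accompany a forced
# read-only remount. storage_errors is a hypothetical helper name, and
# the pattern list is an assumption, not an exhaustive catalogue.
# Reads log lines on stdin, writes matching lines to stdout.
storage_errors() {
    grep -Ei 'ext4-fs (error|warning)|remounting filesystem read-only|i/o error|ata[0-9]+(\.[0-9]+)?: (exception|failed command|hard resetting link)|medium error'
}

# Typical use (commented out so the snippet has no side effects):
#   dmesg | storage_errors
#   journalctl -k -b -1 --no-hostname | storage_errors   # previous boot
```

The `-b -1` form only shows something if the journal is kept on disk across reboots. If the root filesystem was already read-only when the errors occurred, the relevant messages may never have been written, so an empty result does not rule out a storage fault.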

Joking about ReiserFS; SSD, so no spin-down.

Will leave an exerciser running for a couple days, though.

Does your system settle down?

erlangen:~ # journalctl -b -u systemd-udev-settle.service
-- Logs begin at Wed 2019-11-20 07:20:21 CET, end at Thu 2019-11-28 09:00:31 CET. --
Nov 27 16:48:55 erlangen udevadm[541]: systemd-udev-settle.service is deprecated.
Nov 27 16:48:56 erlangen systemd[1]: Started udev Wait for Complete Device Initialization.
erlangen:~ #