I turned on my desktop and booted to primary install … I let it sit for 10 minutes, then opened a command line … executed the usual “zypper dup”. (about 130 something updates)
Sat back on the couch, and a few minutes later, I see red text on the console and thought “download error” (not unusual).
Actually, all files for update had downloaded and 3/4 of the way thru installation, the error showed:
“unable to install … read only filesystem (root)”
I rebooted back, then ran “zypper dup” again and it realized where it left off and continued to install remaining packages, no problem.
Any suggestions on next steps ?
What should I investigate ?
Take a look at the output from dmesg - I would guess that there was an I/O error with your storage device, and the system switched it to read-only as a result, but the dmesg output will give you more info if that’s the case.
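A minimal sketch of how the kernel log could be narrowed down to storage-related lines. The function name `fs_trouble` is made up for this example, and the search patterns are an assumption about what such messages typically look like, not a complete list:

```shell
# Keep only kernel log lines that usually accompany a forced read-only remount.
# The pattern list is illustrative; extend it for your filesystem and driver.
fs_trouble() {
    grep -iE 'forced readonly|remounting filesystem read-only|i/o error|btrfs (error|warning|critical)'
}
```

It reads from standard input, so it can be fed from either tool, for example `dmesg | fs_trouble` for the running boot or `journalctl -k -b -1 | fs_trouble` for the previous one.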
I don’t see anything “significant” in the logs.
First thing I did after booting was to scan the zypper.log file, so I could establish the time frame of the “zypper dup” that got stumped by an install issue.
So, somewhere in that 14-15 minute timeframe, I should find something in journalctl output.
Nothing jumps out at me … I did a search for “read-only” and only found one instance
(journalctl -o short-precise -k -b -2)
May 28 21:49:51.353461 daffy kernel: Freeing unused decrypted memory: 2036K
May 28 21:49:51.353467 daffy kernel: Freeing unused kernel image (initmem) memory: 4084K
May 28 21:49:51.353472 daffy kernel: Write protecting the kernel read-only data: 30720k
May 28 21:49:51.353478 daffy kernel: Freeing unused kernel image (rodata/data gap) memory: 1840K
May 28 21:49:51.353483 daffy kernel: Run /init as init process
May 28 21:49:51.353488 daffy kernel: with arguments:
May 28 21:49:51.353493 daffy kernel: /init
May 28 21:49:51.353499 daffy kernel: with environment:
May 28 21:49:51.353504 daffy kernel: HOME=/
May 28 21:49:51.353511 daffy kernel: TERM=linux
May 28 21:49:51.353516 daffy kernel: BOOT_IMAGE=/boot/vmlinuz-6.3.2-1-default
May 28 21:49:51.353521 daffy kernel: splash=silent
Then I did a search for “error” - again, nothing significant for that timeframe.
So back in the journalctl output, I looked for the timeframe of the zypper dup and here it is - as you can see, no entries around the 21:51 timeframe - AAMOF, it jumps from 21:50:32 to 21:53:36
May 28 21:50:22.163704 daffy kernel: Bluetooth: RFCOMM TTY layer initialized
May 28 21:50:22.163725 daffy kernel: Bluetooth: RFCOMM socket layer initialized
May 28 21:50:22.163737 daffy kernel: Bluetooth: RFCOMM ver 1.11
May 28 21:50:32.815753 daffy kernel: logitech-hidpp-device 0003:046D:1025.0008: HID++ 1.0 device connected.
May 28 21:53:36.451714 daffy kernel: BTRFS info (device nvme0n1p2): using crc32c (crc32c-intel) checksum algorithm
May 28 21:53:36.451789 daffy kernel: BTRFS info (device nvme0n1p2): disk space caching is enabled
May 28 21:53:36.451812 daffy kernel: BTRFS info (device nvme0n1p2): bdev /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 670, gen 0
May 28 21:53:36.455701 daffy kernel: BTRFS info (device nvme0n1p2): enabling ssd optimizations
May 28 21:53:36.455733 daffy kernel: BTRFS info (device nvme0n1p2): auto enabling async discard
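Spotting a jump like this by eye is tedious in a long log. As a rough sketch, the helper below prints every pair of consecutive lines that are more than a given number of seconds apart. The name `find_journal_gaps` is hypothetical, and it assumes the `journalctl -o short-precise` layout shown above, with the time in the third field and no midnight crossing in the input:

```shell
# Report gaps between consecutive journal lines.
# $1 = minimum gap in seconds (default 60). Reads the journal text on stdin.
find_journal_gaps() {
    awk -v gap="${1:-60}" '
    {
        # Field 3 looks like 21:50:32.815753; convert it to seconds since midnight.
        split($3, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1 && now - prev > gap)
            printf "gap of %d s between %s and %s\n", now - prev, prev_stamp, $3
        prev = now
        prev_stamp = $3
    }'
}
```

Usage would be something like `journalctl -o short-precise -k -b -2 | find_journal_gaps 60`.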
That’s really strange. While I’d have looked at dmesg, -k on journalctl should show the same messages, so that’s fine.
I’ve never seen this kind of behavior before - maybe someone else will have an idea since the logs aren’t showing anything. If the device is SMART-capable, you might run a diagnostic on the drive that holds the data and see if it’s reporting any errors. In my experience, the kernel only remounts a filesystem read-only when it needs to protect the data, and that’s usually when there’s a hardware issue going on.
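Besides SMART, the Btrfs mount message in the log excerpt already carries per-device error counters (“errs: wr 0, rd 0, flush 0, corrupt 670, gen 0”). A small sketch that pulls out only the non-zero ones follows. The function name `nonzero_btrfs_errs` is made up here, and the parsing assumes the exact message layout seen in that excerpt:

```shell
# Print every non-zero Btrfs error counter found in kernel log text on stdin.
nonzero_btrfs_errs() {
    awk '/BTRFS .*errs:/ {
        line = $0
        sub(/^.*bdev /, "", line)        # -> "/dev/... errs: wr 0, rd 0, ..."
        dev = line
        sub(/ errs:.*$/, "", dev)        # -> "/dev/..."
        sub(/^.* errs: /, "", line)      # -> "wr 0, rd 0, flush 0, ..."
        n = split(line, parts, ", ")
        for (i = 1; i <= n; i++) {
            split(parts[i], kv, " ")
            if (kv[2] + 0 > 0)
                print dev ": " kv[1] " = " kv[2]
        }
    }'
}
```

Run as `journalctl -k -b -2 | nonzero_btrfs_errs`. As far as I know these counters are cumulative, so a non-zero value may date from an older incident; `btrfs device stats /` shows the same counters for the mounted filesystem, and `smartctl -a` on the drive covers the SMART side.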
Once the root file system has been turned read-only, the journal can no longer be saved to disk, which would explain the gap in the log.
You may want to check for UNALLOCATED space as follows:
erlangen:~ # btrfs filesystem usage -T /
Overall:
Device size: 1.77TiB
Device allocated: 538.07GiB
Device unallocated: 1.25TiB
Device missing: 0.00B
Device slack: 0.00B
Used: 523.22GiB
Free (estimated): 1.26TiB (min: 649.68GiB)
Free (statfs, df): 1.26TiB
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Multiple profiles: no
Data Metadata System
Id Path single DUP DUP Unallocated Total Slack
-- -------------- --------- -------- -------- ----------- ------- -----
1 /dev/nvme0n1p2 530.01GiB 8.00GiB 64.00MiB 1.25TiB 1.77TiB -
-- -------------- --------- -------- -------- ----------- ------- -----
Total 530.01GiB 4.00GiB 32.00MiB 1.25TiB 1.77TiB 0.00B
Used
With some 1.25TiB unallocated space maintenance of infamous host erlangen is virtually hassle-free.
The backup system is fine too:
erlangen:~ # btrfs filesystem usage -T /mnt
Overall:
Device size: 48.83GiB
Device allocated: 37.07GiB
Device unallocated: 11.76GiB
Device missing: 0.00B
Device slack: 0.00B
Used: 27.96GiB
Free (estimated): 20.05GiB (min: 14.17GiB)
Free (statfs, df): 20.05GiB
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 74.75MiB (used: 0.00B)
Multiple profiles: no
Data Metadata System
Id Path single DUP DUP Unallocated Total Slack
-- -------------- -------- -------- -------- ----------- -------- -----
1 /dev/nvme0n1p3 34.01GiB 3.00GiB 64.00MiB 11.76GiB 48.83GiB -
-- -------------- -------- -------- -------- ----------- -------- -----
Total 34.01GiB 1.50GiB 32.00MiB 11.76GiB 48.83GiB 0.00B
Used 25.72GiB 1.12GiB 16.00KiB
erlangen:~ #
The Tumbleweed backup system sits on a 50 GB partition. Some 37 GiB of allocated space is much more than expected for a default Tumbleweed installation. When in trouble, always check for unallocated space first.
Do some stress testing. While I am typing this post on host erlangen, the command stress-ng --hdd 2 --iomix 4 --vm 6 --cpu 8 is running at foreground priority and produces these load averages:
erlangen:~ # w
07:03:36 up 27 min, 4 users, load average: 55.88, 54.27, 43.39
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
karl tty7 :0 06:36 26:55 26.86s 0.09s /usr/bin/startplasma-x11
karl pts/0 :0 06:36 26:40 0.00s 1.07s /usr/bin/kded5
karl pts/1 :0 06:36 25:52 6:26m 0.15s /bin/bash
karl pts/2 :0 06:37 0.00s 0.47s 0.14s /bin/bash
erlangen:~ #
Host erlangen stays fully responsive. Inexperienced users won’t even notice the high system load.
What are the load factors when running the above command on your system? Does it stay responsive?
What do you mean by “primary install” ?
Because if you mean that you started the installation snapshot i.e. the first one, it is completely normal that it is read-only, because you have to rollback to get full functionality.
That machine has 2 separate nvme drives, each drive has its own dedicated TW install.
The primary is the main/default that I use. The redundant gets updated (zypper dup) frequently, but its main goal is “in case there is a catastrophic failure of primary drive”, I can boot up and use the redundant TW (while casually recovering the primary).
Well, when I look at the “Enabled” repos view, they don’t show.
I do not see any Patterns selected for MicroOS … HOWEVER, if I click on a couple of different “MicroOS” patterns (openSUSE MicroOS and MicroOS KDE Plasma Desktop), I see packages installed
Is that incorrect ? I don’t ever recall selecting packages in MicroOS
Hi @karlmistelberger … thanks for the details … I guess I am not understanding “Unallocated”. I show the output here: my Conky and “df” show about 10 GB of space available, but “unallocated” shows 1 MiB ???
Am I in trouble here with my root partition (I have a separate /home)
I have been doing quick research and articles I read suggest using “btrfs balance” ??
====== primary drive ==
:~ # btrfs filesystem usage -T /
Overall:
Device size: 30.00GiB
Device allocated: 30.00GiB
Device unallocated: 1.00MiB
Device missing: 0.00B
Device slack: 0.00B
Used: 18.78GiB
Free (estimated): 10.39GiB (min: 10.39GiB)
Free (statfs, df): 10.39GiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 55.44MiB (used: 0.00B)
Multiple profiles: no
Data Metadata System
Id Path single single single Unallocated Total Slack
-- -------------- -------- --------- -------- ----------- -------- -----
1 /dev/nvme1n1p3 28.46GiB 1.51GiB 32.00MiB 1.00MiB 30.00GiB -
-- -------------- -------- --------- -------- ----------- -------- -----
Total 28.46GiB 1.51GiB 32.00MiB 1.00MiB 30.00GiB 0.00B
Used 18.07GiB 727.78MiB 16.00KiB
:~ #
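The two numbers measure different things: “unallocated” is space not yet assigned to any chunk, while df and Conky also count free space inside chunks that are already allocated. A rough sketch of that subtraction, using the “Total” and “Used” rows of the table above. The function name `data_slack_gib` is hypothetical, and it assumes both Data values are printed in GiB, as they are in this output:

```shell
# Read "btrfs filesystem usage -T" output on stdin and print how many GiB of
# Data chunks are allocated but not used (Data column of Total minus Used).
data_slack_gib() {
    awk '
    $1 == "Total" { alloc = $2 }   # table row, e.g. "Total 28.46GiB ..."
    $1 == "Used"  { used = $2 }    # table row; the overview line is "Used:"
    END {
        sub(/GiB$/, "", alloc)
        sub(/GiB$/, "", used)
        printf "%.2f\n", alloc - used
    }'
}
```

For example `btrfs filesystem usage -T / | data_slack_gib`. For the output above that is 28.46 minus 18.07, which matches the 10.39GiB shown as “Free (estimated)”.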
If you have at least a Btrfs system partition, the following timers must be enabled:
btrfs-balance.timer
btrfs-defrag.timer
btrfs-scrub.timer
btrfs-trim.timer
Assuming that you haven’t changed the systemd timers’ “OnCalendar” settings, the Btrfs housekeeping should have been executing.
You can check the time when the systemd Timer will expire with, for example, the following command:
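The command itself is not shown in this post. As an assumption about what was meant, `systemctl list-timers` is one way to get that information, sketched here for the four units listed above (the function name `show_btrfs_timers` is made up for this example):

```shell
# List the next and last run of the Btrfs maintenance timers named above.
btrfs_timers="btrfs-balance.timer btrfs-defrag.timer btrfs-scrub.timer btrfs-trim.timer"

show_btrfs_timers() {
    # --all also lists timers that are loaded but currently inactive.
    # $btrfs_timers is deliberately unquoted so it splits into four arguments.
    systemctl list-timers --all $btrfs_timers
}
```

Calling `show_btrfs_timers` prints one row per timer with its NEXT and LAST columns; a timer that is missing from the list is probably not installed or not enabled.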
Please be aware that, openSUSE Tumbleweed and openSUSE microOS are currently quite close to one another – on openSUSE Leap I don’t see anything related to openSUSE microOS in the YaST Software Management module …
A major difference between Tumbleweed and microOS is, the “Transactional (Atomic) Updates upon a read-only btrfs root filesystem” …
Plus, Containers …
Therefore, if, for whatever reason, you accidentally pulled anything related to microOS into your Tumbleweed system when you executed “zypper dist-upgrade”, then yes, a consequence may well be that you ended up with a read-only file system on the system partition.
@karlmistelberger has a point as you need to make your system adequate to that small partition. Balance won’t help much, as what matters is the free space. Balance just moves data around.
Do you have snapshots enabled?
$ sudo snapper list
If you do, you need to limit them, as they take some space.
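One way to limit them, as a sketch: lower the number-cleanup limits of the snapper config and run the cleanup once by hand. The config name `root` is the usual one for the system partition but is an assumption here, the function name `limit_snapshots` is made up, and the limit values in the usage line are only an illustration:

```shell
# Set how many numbered snapshots snapper keeps, then clean up immediately.
# $1 = NUMBER_LIMIT, $2 = NUMBER_LIMIT_IMPORTANT (a count or a range such as 2-5)
limit_snapshots() {
    snapper -c root set-config "NUMBER_LIMIT=$1" "NUMBER_LIMIT_IMPORTANT=$2" &&
    snapper -c root cleanup number
}
```

For example `sudo` a root shell and run `limit_snapshots 2-5 2-3`, then check the result with `snapper list`.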
btrfs balance packs data and releases allocated but unused 1 GiB chunks of disk space. Running btrfs balance start -dusage=99 / on host 6700k resulted in:
6700k:~ # btrfs filesystem usage -T /
Overall:
Device size: 59.57GiB
Device allocated: 33.05GiB
Device unallocated: 26.52GiB
Device missing: 0.00B
Device slack: 0.00B
Used: 30.66GiB
Free (estimated): 28.18GiB (min: 28.18GiB)
Free (statfs, df): 28.18GiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 82.25MiB (used: 0.00B)
Multiple profiles: no
Data Metadata System
Id Path single single single Unallocated Total Slack
-- --------- -------- -------- -------- ----------- -------- -----
1 /dev/sda8 31.01GiB 2.01GiB 32.00MiB 26.52GiB 59.57GiB -
-- --------- -------- -------- -------- ----------- -------- -----
Total 31.01GiB 2.01GiB 32.00MiB 26.52GiB 59.57GiB 0.00B
Used 29.35GiB 1.31GiB 16.00KiB
6700k:~ #
Used data space is 29.35GiB. After balancing, allocated data space is 31.01GiB, so 31.01GiB - 29.35GiB = 1.66GiB remains allocated but unused.
@aggie reports 28.46GiB - 18.07GiB = 10.39GiB allocated but unused space. This will be reduced by running btrfs balance start -dusage=99 /.
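With only 1.00MiB unallocated, a single pass with a high usage filter may fail for lack of room to relocate chunks. A commonly suggested alternative is to raise the filter in steps, so that empty and nearly empty chunks are freed first. The sketch below does that; the function name `stepwise_balance`, the `DRY_RUN` switch and the step values are all choices made for this example:

```shell
# Balance data chunks with an increasing usage filter.
# $1 = mount point (default /). Set DRY_RUN=1 to only print the commands.
stepwise_balance() {
    mnt="${1:-/}"
    for pct in 0 5 10 25 50; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "btrfs balance start -dusage=$pct $mnt"
        else
            # Stop at the first failing step instead of pushing on.
            btrfs balance start -dusage="$pct" "$mnt" || return 1
        fi
    done
}
```

Try `DRY_RUN=1 stepwise_balance /` first to see what would be run, then run it as root without `DRY_RUN` and compare `btrfs filesystem usage -T /` before and after.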