I posted about this over on the openSUSE Reddit, but I tried again today and am getting similar results. I’ll outline what has happened so far and then update this as I go from my phone, because I know once I reboot, the only way to get back into a working system is to go to Maintenance Mode and use snapper to undo the update.
1. Perform a zypper dup; nothing out of the ordinary.
2. Wait a few minutes, attempt to use sudo for something, password will fail.
3. Try opening YaST and it’ll complain user ‘root’ not found.
4. Reboot.
5. Startup will commence but fail at switch root (as far as I can tell).
To recover:
1. Boot into Maintenance Mode.
2. snapper list (get the numbers)
3. snapper -v undochange xx…xx
4. Reboot
5. The default option will no longer work, must go to advanced and select the previous kernel.
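The recovery steps above can be sketched as a small dry-run helper. It only prints the undochange command so it can be checked before running it from Maintenance Mode. The function name `build_undo_cmd` and the arguments PRE and POST are hypothetical; the real snapshot numbers come from `snapper list` and are not given in this post.

```shell
# Dry-run helper: prints the snapper command instead of executing it.
# $1 and $2 stand for the two snapshot numbers read from `snapper list`.
build_undo_cmd() {
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*)
                # Reject empty or non-numeric snapshot numbers.
                echo "usage: build_undo_cmd PRE POST (both numeric)" >&2
                return 1
                ;;
        esac
    done
    echo "snapper -v undochange $1..$2"
}
```

Calling `build_undo_cmd PRE POST` with the two numbers prints the command to review; nothing is changed on disk until that command is run by hand.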
System Information:
Memory: 8GB
Processor: Intel Core i7-3687U
Graphics: Intel Ivybridge Mobile
If there’s any other information I can provide, please let me know. I thank anybody in advance for your help.
Some details please.
What version of Tumbleweed are you running (/etc/os-release), and what version (or date) was your unsuccessful ‘zypper dup’? What versions of ‘ucode-intel’ and ‘kernel-firmware’ are installed, and what repositories do you have enabled? Also, what is the make and model of the machine?
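A minimal sketch of how the requested details could be collected in one go. The helper names `os_release_field` and `collect_info` are made up for illustration; the commands they wrap (`rpm -q`, `zypper lr -d -E`) are standard on openSUSE. The make and model of the machine still have to be supplied by hand.

```shell
# Print the value of KEY from an os-release style file
# (defaults to /etc/os-release when no file is given).
os_release_field() {
    key="$1"
    file="${2:-/etc/os-release}"
    # Lines look like KEY="value"; drop the key and the surrounding quotes.
    sed -n "s/^${key}=//p" "$file" | tr -d '"'
}

# Gather the information asked for above. Run this on the affected machine.
collect_info() {
    echo "Tumbleweed snapshot: $(os_release_field VERSION_ID)"
    rpm -q ucode-intel kernel-firmware
    zypper lr -d -E
}
```

Running `collect_info` once on the working snapshot and once after the failing update would give two reports that can be compared.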
No problem. Thanks for helping me dig in to what I’ve got going on here.
Repos:
~> zypper lr -d -E
Repository priorities are without effect. All enabled repositories share the same priority.
# | Alias | Name | Enabled | GPG Check | Refresh | Priority | Type | URI | Service
--+--------------------+-----------------------------+---------+-----------+---------+----------+--------+-------------------------------------------------------+--------
1 | Visual Studio Code | Visual Studio Code | Yes | (r ) Yes | No | 99 | rpm-md | https://packages.microsoft.com/yumrepos/vscode |
4 | repo-non-oss | openSUSE-Tumbleweed-Non-Oss | Yes | (r ) Yes | Yes | 99 | yast2 | http://download.opensuse.org/tumbleweed/repo/non-oss/ |
5 | repo-oss | openSUSE-Tumbleweed-Oss | Yes | (r ) Yes | Yes | 99 | yast2 | http://download.opensuse.org/tumbleweed/repo/oss/ |
7 | repo-update | openSUSE-Tumbleweed-Update | Yes | (r ) Yes | Yes | 99 | rpm-md | http://download.opensuse.org/update/tumbleweed/ |
OS Version:
This is trickier. I use TW as my daily driver, and since I can easily back out this change, I’ve rolled back via Snapper and am on my last known working state. Right now, my os-release reads as below, but the problem was reproducible on 20180127 and is still reproducible on 20180128:
If at any point you need me to return to full broken state to get information, just give me about 10-30 minutes (depending on if I’m at home or work; vastly different internet connections).
Do you maybe have installation of recommended packages disabled?
Though I don’t think that’s the problem, otherwise there would have been a few more reports of this problem (judging from the Mesa-dri “issue”).
Btw, “switching root” has nothing to do with “user root”, the former is when the system is switched to the installed / partition from the initrd.
If switching to root fails, somehow the “driver” for the / partition may be missing in the initrd.
So, any special partition setup, like LVM or the like?
Wouldn’t explain the missing “user root” though…
Grasping at straws, do you have a separate /boot partition that might be full?
But then you shouldn’t even be able to use snapshots/revert to a previous snapshot, unless that has changed recently…
Maybe your / partition is too full?
I believe I am possibly suffering from the same. After a recent update in Tumbleweed, the system boots to Grub, then loads the initramdisk, but just after the kernel spits out ‘switching root’ it hangs, then it tries to mount root and all other btrfs subvolumes but times out and offers rescue mode. In journalctl there is a line saying that initializing according to the udev database failed. Trying to manually mount the subvolumes hangs the system forever.
I can, however, boot a live disk and mount the subvolumes from there. They report no errors in the logs, scrub turns out fine and everything is fine. No hardware failure according to SMART. To me it looks like the automounting of the partition/subvolume just fails after Grub. And Grub, I believe, gets the location of root from the parameters that it holds in its own config or from the kernel when it is compiled in there. But I don’t know what happens after that. Is it that systemd automounts the devices with the help of udev?
I have not explicitly defined that in {zypp,zypper}.conf so it is using Zypper defaults, which I believe is to install recommended packages.
It is an encrypted installation (LVM+LUKS) with btrfs, set up through the Tumbleweed installer.
Yeah, that one is really weird to me. Never seen that happen before.
The /boot is separate but it isn’t close to being full, at least not looking at it through df.
~> df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 403M 96M 264M 27% /boot
Only 50% showing but that’s because df is only seeing 40GB even though I extended the LV to the rest of the disk and Partitioner shows the rest of the disk. I’m guessing there’s some btrfs magic I need to do to expand to the rest of the LV, though. Just haven’t gotten around to it.
~> df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/system-root 40G 25G 16G 62% /
~> sudo lvdisplay
[sudo] password for root:
--- Logical volume ---
LV Path /dev/system/root
LV Name root
VG Name system
LV UUID xcoUgu-CkZp-z1QB-HICy-0QSZ-BQMc-uFeXRZ
LV Write Access read/write
LV Creation host, time voyager, 2017-12-27 14:23:38 -0700
LV Status available
# open 1
LV Size 468.34 GiB
Current LE 119896
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 1024
Block device 254:1
Thanks for hopping on, wolfi. I’ve read enough on here to be worried when you throw out things like “grasping at straws” and the confusion on missing root bit. I just assume you’ve seen everything.
I did immediately after I posted. I tried to update my post but was past my 10 minute limit. Just a stupid step I forgot, that’s all. Now it’s all good.
~> df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/system-root 469G 27G 442G 6% /
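The post does not say which step was forgotten. One common way to grow a mounted btrfs filesystem into an already extended LV is `btrfs filesystem resize max <mountpoint>`; that this was the step used here is an assumption. The sketch below only decides whether a grow looks necessary. The function name `fs_needs_grow` and the 1 GiB slack are arbitrary choices for illustration.

```shell
# Succeed (exit 0) when the block device is more than 1 GiB larger than
# the filesystem on it. Both arguments are sizes in bytes.
fs_needs_grow() {
    fs_bytes="$1"
    dev_bytes="$2"
    slack=$((1024 * 1024 * 1024))   # 1 GiB tolerance, arbitrary
    [ $((dev_bytes - fs_bytes)) -gt "$slack" ]
}

# Possible use on the affected machine (GNU df, needs root for blockdev):
#   fs=$(df -B1 --output=size / | tail -n 1 | tr -d ' ')
#   dev=$(sudo blockdev --getsize64 /dev/mapper/system-root)
#   if fs_needs_grow "$fs" "$dev"; then
#       sudo btrfs filesystem resize max /
#   fi
```

With the sizes shown earlier (a filesystem of about 40G on an LV of 468.34 GiB) the check would succeed, which matches df reporting far less space than lvdisplay.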
Based on what pylkko saw in the journal (udev), I held back udev, which systemd needed, so I held back those, too. The packages I held back (since they were dependent on each other) were
Yeah it’s a really anal timeout since this is not a financial board. They could set it to 2 days without enabling submarine spammers to change posts months later.
And changing your post after it may have been read and answered to might create confusing threads. That is why, when you want to add additional information, you simply add a new post.
The five minutes is for those hasty ones that send off their posts before having checked them for typos. I assume it is clear that it is better to first think about how best to post your information before typing, then type, then check, then rethink, and then, when you are convinced that this is a useful and understandable post, click on the Post Message button.
Right.
I just asked because adding system users has been split out into several packages which need to be installed to get that user added to the system. Not installing recommended packages might cause some users to go “missing”, but OTOH the necessary packages probably are required anyway.
Not really relevant for root anyway though, I think.
It is an encrypted installation (LVM+LUKS) with btrfs, set up through the Tumbleweed installer.
Ok, so probably something goes wrong when mounting it.
Is it accessible when you break the boot in the initrd via the rd.break boot option? (on /sysroot)
The /boot is separate but it isn’t close to being full, at least not looking at it through df.
Ok.
Still I would try to recreate the initrd, with “sudo mkinitrd”.
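Before or after recreating the initrd, it may help to check that it contains what an LVM+LUKS+btrfs root needs. The sketch below reads a list of dracut module names and reports which expected ones are absent. The function name `missing_modules` is hypothetical, and the expected set (crypt, dm, lvm, btrfs) is an assumption for this kind of setup, not something confirmed in this thread.

```shell
# Read dracut module names on stdin (one per line) and print every
# expected module that does not appear in the list.
missing_modules() {
    list="$(cat)"
    for m in crypt dm lvm btrfs; do
        # -x matches whole lines only, so "dm" does not match "dmraid".
        printf '%s\n' "$list" | grep -qx "$m" || echo "$m"
    done
}

# On the affected system (needs root to read the initrd):
#   lsinitrd -m | missing_modules
```

No output means all four modules were found; any name printed points at something the initrd would need in order to unlock and mount the root volume.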
Thanks for hopping on, wolfi. I’ve read enough on here to be worried when you throw out things like “grasping at straws” and the confusion on missing root bit. I just assume you’ve seen everything.
Not really…
And I have to admit that I have nearly no experience with LVMs.
Doesn’t really sound like it would cause your problem, maybe it’s more related to updating udev or systemd per se.
Maybe the LVM somewhat stops working during the update (could explain the missing root user), and therefore not all changes can be written (e.g. the new initrd may be corrupted/truncated) or something.
That is just guessing though, maybe the best course of action would be a bug report (if it is reproducible).
I agree that it doesn’t really sound like it but I can’t argue with the results, either. I think what you mention below feels quite plausible.
This sounds/feels like it could be a thing. I’ve honestly never seen an update break root in this way in my time using linux.
I do have a bug report I filed but I filed it under Maintenance because the error was presenting during upgrade. If we’re talking udev/systemd, perhaps I should change it to Basesystem.
In the openSUSE Forums Terms and Conditions (that you should have read before you signed up as a forum member and that you can revisit by clicking on the link at the bottom of almost every page in these forums), it says:
Once you submit your post in these forums, you have 10 minutes to edit it. After 10 minutes, if you need to, you should post a reply with any corrective information. Why? Two reasons. First, the NNTP protocol doesn’t support editing of posts and so edits won’t transit our gateway unless done before the cron job runs the gateway to sync the messages. Second, editing a post after it has replies could invalidate those replies because the original information changed. Posting a follow up reply with additional/changed information allows any previous reply to stay in context.
And like wolfi323 already said, when you have questions about the forums, there is a sub-forum for that. So please for each subject a different thread in the correct sub-forum. Another way to keep threads understandable and efficient.