13.1 KDE Completely Broken After Update to 3.11.10-7

Your personal settings for all apps are in your home directory; keep that and you have most of it. If you run things like web servers or database servers, you may want to copy /etc, which is where most of those settings live, along with any databases you want to preserve. That is true in any case. Whether an update brings you back depends on whether any system configs got corrupted; if so, then a new install is called for.

IMO it is not worth the time and possible headaches to try to preserve your programs. It is dead easy to reinstall them, and if you forget some you probably did not need them anyhow. LOL

For sure, something fundamental is badly corrupted. I guess I was thinking the “update” vs. “clean install” might accomplish my goal while retaining as many customizations (fonts, desktop, etc.) as possible. Worst case, the system still won’t boot to a stable desktop and I’m just back to the full re-install, right?

I backed up the entire root (/) filesystem to a separate drive by booting Knoppix from the DVD last night, so all my files are saved, if they can be of some value afterwards.

Ok, so I decided to try the “update” vs. “install” path, assuming that I was already at worst case, so why not? Lo and behold, it seems to have worked! I will have to test more, but all my settings seem to have been preserved and the system is showing none of the strange behavior that I started this thread with.

As part of the update, I only selected core repos, so maybe something went bad in the last update from one of the other repos. Will continue to test and see…but glad to have my SUSE desktop back up and running.

Thanks again to all who provided insight and guidance!

my goal while retaining as many customizations (fonts, desktop, etc.)

All these things are personal settings and live in home. So if you ever do a new install, simply be sure you don't format the home partition and do mount it as /home; all your stuff will then be saved. A new install will format root (/), erasing any system-wide configs and any databases that may live on /, so if you need those you must back them up before a new install. An update simply writes the files to root but does not format it, so most system-wide configs are reused.
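
A minimal sketch of what such a backup could look like, run as root before the new install; the external-drive device and the database path are only examples, adjust them to your setup:

su -
mount /dev/sdc1 /mnt                              # external backup drive (example device name)
tar czf /mnt/etc-backup.tar.gz /etc               # system-wide configs
tar czf /mnt/db-backup.tar.gz /var/lib/mysql      # only if a database actually lives here
umount /mnt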

Ok, I narrowed down what is causing the problem, though I don’t know why. After doing a clean re-install of the root partition, I began adding back the things that were missing or changed as a result. Among these were some optimizations I made for the SSD, based on info found here: (https://sites.google.com/site/easylinuxtipsproject/ssd-in-opensuse). Once I made these changes, the system was completely broken again after a reboot. Amazing that the desktop comes up at all, considering how much won’t work at a fundamental level.

It turns out to be either (or both of): adding the "noatime" flag to the root partition line in /etc/fstab, and adding the fstrim -v commands for root and home to /etc/rc.d/boot.local. The only way I managed to fix it the second time around was to boot the system into Mint (the disc I had lying around), edit the fstab file to remove noatime, and delete the boot.local file (I could not figure out how to modify and replace it, due to permission issues, I guess - still a bit of a Linux novice…). After changing these files back, the system boots normally again, no problems.
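
For anyone hitting the same thing, the live-session route boils down to mounting the installed root partition and editing the files as root; the device name and editor here are assumptions, so check with lsblk first:

sudo -i                              # become root inside the Mint live session
lsblk                                # identify the installed root partition (assumed /dev/sdb2 below)
mount /dev/sdb2 /mnt
nano /mnt/etc/fstab                  # remove noatime from the offending line(s)
rm /mnt/etc/rc.d/boot.local          # or edit it instead of deleting it
umount /mnt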

Now, these changes had been in those files for a long time, and the system had been completely stable, so something else must have changed during a recent update that made those commands become catastrophic to openSUSE. Maybe someone knows what’s going on?

Also, if anyone can help me get the original boot.local file (which I have saved in my home directory) back into the /etc/rc.d folder, that would be helpful.
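
Regarding the boot.local question: copying it back only needs root privileges on the running system. A minimal sketch, assuming the saved copy sits at ~/boot.local; the ownership and mode shown are the usual openSUSE defaults, adjust if yours differed:

su -
cp /home/yourusername/boot.local /etc/rc.d/boot.local    # "yourusername" is a placeholder
chown root:root /etc/rc.d/boot.local
chmod 744 /etc/rc.d/boot.local                           # typical permissions for boot.local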

trim is only executed at boot if it is in /etc/rc.d/boot.local, so I don't see how that could conflict with noatime. Are you sure that you entered the option correctly?

Don’t tell us, show us: post the fstab file and the lines that did not work.

That is a relatively important optimization, and if it were broken there would be lots of other people seeing the problem. Normally, every time a file is accessed its access-time (atime) field is updated, which greatly increases the number of writes and thus reduces the overall life of the drive. noatime stops this, increasing the overall life of the SSD.
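
If you want to confirm that noatime actually took effect after a reboot, findmnt (part of util-linux) shows the live mount options; the exact option string will vary:

findmnt -no OPTIONS /
# expect something like: rw,noatime,acl,user_xattr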

Yes, the reason I went to the trouble of making those mods was to increase the life of the SSD. I have done similar modifications throughout Win7 for my SSDs.

All I can tell you is that the re-installed system was working perfectly normally right up until the moment I changed those two files, then went back to the broken state reported in the original post. After changing those files back to their original state (in one case, deleting it), the system boots perfectly normally again. I made no other changes. Perhaps I’m jumping to conclusions, but it seemed like a pretty smoking gun to me.

My fstab file looked like this after modding:

/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part1 swap                 swap       defaults              0 0
/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part2 /                    ext4       noatime,acl,user_xattr        1 1
/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part3 /home                ext4       noatime,defaults              1 2     1 2

The boot.local file is now deleted, but looked like this when causing the problem:

#! /bin/sh
#
# Copyright (c) 2002 SuSE Linux AG Nuernberg, Germany.  All rights reserved.
#
# Author: Werner Fink, 1996
#         Burchard Steinbild, 1996
#
# /etc/init.d/boot.local
#
# script with local commands to be executed from init on system startup
#
# Here you should add things, that should happen directly after booting
# before we're going to the first run level.
#
fstrim -v /
fstrim -v /home

Try running the fstrim command as root from a command line. fstrim is a one-time operation; it does not change how things are mounted, it just does a trim pass on the file system.

Re-reading some stuff: “All modern SSDs support TRIM.”

see http://en.wikipedia.org/wiki/Trim_(computing)

So TRIM is actually a function of the hardware, and fstrim just triggers the process. So I wonder if it is a hardware or firmware problem.
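
One way to check whether the drive itself advertises TRIM support is hdparm; the device name is an assumption, adjust it to yours:

su -
hdparm -I /dev/sdb | grep -i trim
# a drive with TRIM support reports a line like "Data Set Management TRIM supported"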

fstrim on the root partition works fine. On the home partition, it causes the problems observed in the OP, namely, desktop crashes and even basic commands cannot be carried out.


fstrim -v /
     /: 14.8GiB (.....bytes) trimmed

fstrim -v /home
     fstrim /home: FITRIM ioctl failed   Input/Output error

So, fstrim on the home partition takes a really long time and drives up the CPU for a while; then, after the error message, the screen goes black and the console window remains open, but no commands can be executed other than “ls”. The only way to recover is to press and hold the power button.

Again, this had all been working fine for many weeks. Just broke one day after an update.

Make the /home mount line in fstab the same as the root line, i.e. do not use “defaults”.

If that does not work, run fsck on the home partition.

How close to full is the drive? ext4 or BTRFS
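
It would also be worth looking at the kernel log right after a failed fstrim; if the drive or controller is choking on the TRIM command there is usually a clue there:

dmesg | tail -n 20        # look for ata/DATA SET MANAGEMENT or I/O errors logged around the failure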

Do you mean change the line for the home partition in fstab to match the parameters as in the line for the root? Can you tell me specifically how these lines should look? Should the “home” line have the “1 2” portion repeated as it does?


/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part1 swap       swap     defaults                               0 0 
/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part2 /          ext4       noatime,acl,user_xattr        1 1 
/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part3 /home     ext4       noatime,defaults                       1 2     1 2

If this is done, then you think issuing the fstrim command from the console will run correctly?

How close to full is the drive? ext4 or BTRFS

It is ext4 (see lines from fstab). Not close to full.


mxc@matrix:~> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb2        20G  4.9G   14G  27% /
devtmpfs        3.9G  8.0K  3.9G   1% /dev
tmpfs           3.9G   96K  3.9G   1% /dev/shm
tmpfs           3.9G  4.8M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs           3.9G  4.8M  3.9G   1% /var/lock
tmpfs           3.9G  4.8M  3.9G   1% /var/run
/dev/sdb3        89G  442M   87G   1% /home

Worth a shot

No. That’s invalid.
You should remove that additional “1 2”.

No idea if that could cause your problem though.

Do you mean change the line for the home partition in fstab to match the parameters as in the line for the root? Can you tell me specifically how these lines should look?

I suppose gogalthorp meant you should set the mount options for /home to the same as in your / line, i.e.

/dev/disk/by-id/ata-Corsair_Force_GT_11436508000010730093-part3 /home     ext4       noatime,acl,user_xattr                       1 2

But I don’t think changing the options will help, since acl,user_xattr are actually the defaults for ext4 AFAIK.
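
You can check what the filesystem itself has recorded as its default mount options with tune2fs; the device name is taken from your df output, so double-check it:

su -
tune2fs -l /dev/sdb3 | grep "Default mount options"
# typically prints: Default mount options:    user_xattr acl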

It’s worth a try though.

Yep worth a try

I did not notice the odd values at the end of the line; I think both could be 0s. The first has to do with the dump command, which seems to be obsolete now, and the second (which can be 2) controls whether fsck is run on the file system at boot.

My reasoning is that / works but /home does not; there has got to be a reason, maybe the difference in the mount lines, though I agree the options are probably the defaults - not 100% sure.

If it is not that, it may be that the file system is corrupted and needs fixing; ergo, run fsck on it. The purpose of TRIM is to release blocks that are no longer needed, and if the file system is messed up then maybe it gets confused and crashes.
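
From live media that would look roughly like this; the device names come from your df output, but double-check them with lsblk from the live session, since they can differ there:

sudo -i                        # from the Mint live session
lsblk                          # confirm which partition is the installed /home
fsck -f /dev/sdb3              # -f forces a full check even if the FS is flagged clean
fsck -f /dev/sdb2              # optionally check the root FS as well

Note that without -f, fsck on a filesystem already marked clean returns almost immediately without really checking anything.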

I changed the fstab mount lines to match parameters, namely “noatime,acl,user_xattr”. Also, I deleted the extra “1 2” on the home partition; root is “1 1” and home is “1 2”. Rebooted, tested a bunch of stuff - everything seems normal. Ran fstrim on both partitions - root goes fine and returns a value after a few seconds; home takes a long time before crashing the system and returning the previously reported error.

I tried to run fsck, but I’m not sure how to do that successfully. First, it tells me sdb is mounted, so I try to umount, but it says it is busy. Just my lack of Linux knowledge, but how do you run fsck? I tried running it on sdb2 and sdb3 by booting from a Linux Mint DVD, and it comes back very quickly saying “clean”, but I’m not sure I’m getting a full file system check that way. I certainly agree that something seems corrupted on the home partition. I ran the Corsair utility back on the Win7 side - all shows normal - but of course I can’t check anything to do with the file system from that side…I need to be in Linux.

I went back into Mint and ran fsck with the -f flag. On both partitions, it said there was something wrong with the block count and inode count, which it asked to correct (I said yes). Perhaps this is causing the problems with fstrim?

After doing a bit of research on this specific error, it appears that others have run into issues with the Corsair SSDs under Linux.
See the following (unfortunately, not available directly, but in Google’s cache):
http://webcache.googleusercontent.com/search?q=cache:wksgOoMomWgJ:https://chrisdown.name/2014/01/06/trim-on-corsair-force-cssd-f120gb2.html+&cd=2&hl=en&ct=clnk&gl=us

He says:

TRIM on the Corsair Force CSSD-F120GB2

I experienced a few problems with TRIM on this drive:

ext4’s “discard” option is (very) slow on inode deletes on this drive;
Using fstrim results in “FITRIM ioctl failed: Input/output error”, and a warning in the kernel message buffer that DATA SET MANAGEMENT failed (this seems to be known about, although the answers from support are not helpful).

I initially thought that this might be a problem with the drive, but SMART looks fine, and it seems to operate normally otherwise.

I ended up using wiper.sh in a cron job. On Arch Linux, that’s bundled with the hdparm package.

There is also some good discussion of the TRIM function on SSDs under Linux on this forum, though it is a few years old now:
http://forums.opensuse.org/showthread.php/454999-Using-a-SSD-Hard-Drive-with-openSUSE-and-the-TRIM-Command

The interesting thing is that this problem showed up ‘out of the blue’. The fstrim command was being executed without problem for many weeks, then suddenly became problematic. Not sure if that is due to wear on the drive or what. All SMART analysis of the drive shows that it is ok. Perhaps using the cron approach mentioned above will work for me?
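
If I go the cron route, I gather the job would look roughly like this; the wiper.sh path and the --commit flag are things I still need to verify (openSUSE may not package the script with hdparm, in which case it has to be fetched from the hdparm sources, and without --commit it apparently only does a dry run):

su -
crontab -e
# then add a line along these lines:
0 3 * * 0  /usr/local/sbin/wiper.sh --commit /home    # weekly, Sunday 03:00; path and flag unverified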

Sometimes it is good to check your SSD vendor’s support pages for new firmware updates - the only problem is that in most cases you need Windows to apply them.
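
To see which firmware revision the drive currently reports before hunting for an update, smartctl from smartmontools will tell you; the device name is an assumption:

su -
smartctl -i /dev/sdb        # compare the "Firmware Version" line against what the vendor offers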

I suspect that is what fixed the problem: you had a corrupted file system.

What caused it is anyone’s guess - maybe an improper shutdown or a power failure?