Filename corruption with non-standard characters (Cyrillic, Swedish) - leads to damaged files

Hi,

I’m using Leap 15 with KDE. Files with foreign characters in their names are displayed with ??? (see attachment). The problem is that I cannot rename or otherwise modify these files; even deleting them from Dolphin fails with an error saying the file does not exist. I had to use rm -rf to delete the folders and their contents. The only thing I can imagine might be connected is the issues I have with the baloo file indexer. Here’s an example of the issue.

https://i.imgur.com/e9IHuf7.png

Thanks.

Definitely not.

It’s rather a problem with the locale settings IMHO.

What do you get when you run “locale” in Konsole or xterm inside your KDE user session?


LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

Right, the “C” locale only supports 7-bit ASCII, but the filenames are encoded in UTF-8.
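To make the mismatch concrete, here is a small sketch showing that a UTF-8 encoded name contains bytes outside the 7-bit ASCII range, which is why an ASCII-only locale has nothing sensible to display for them. The filename "å.txt" is an illustrative assumption, not one of the files from the screenshot.

```shell
# "å" (U+00E5) encoded as UTF-8 is the two bytes 0xC3 0xA5 (octal 303 245).
# The name is built from octal escapes so the script itself stays pure ASCII.
name=$(printf '\303\245.txt')

# Total number of bytes in the name.
byte_count=$(printf '%s' "$name" | wc -c | tr -d ' ')

# Number of bytes left after deleting everything in the 7-bit range 0..127.
non_ascii=$(printf '%s' "$name" | LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' ')

echo "bytes in name: $byte_count"
echo "non-ASCII bytes: $non_ascii"
```

This should report 6 bytes in total, 2 of them non-ASCII: one visible character on screen, but two bytes that an ASCII-only locale cannot interpret.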

Have a look into systemsettings5->Regional Settings->Formats and choose a proper region, or do that in YaST for systemwide configuration (will be overridden by Plasma for the user though if you ever changed it in systemsettings5).

To me this looks very much like the locale for root, not for a normal user.

There’s absolutely no indication for or against this theory in the posted output.

It was set to Default; I don’t know how I managed that. The output now reads:

LANG=en_GB.US-ASCII
LC_CTYPE="en_GB.US-ASCII"
LC_NUMERIC="en_GB.US-ASCII"
LC_TIME="en_GB.US-ASCII"
LC_COLLATE="en_GB.US-ASCII"
LC_MONETARY="en_GB.US-ASCII"
LC_MESSAGES="en_GB.US-ASCII"
LC_PAPER="en_GB.US-ASCII"
LC_NAME="en_GB.US-ASCII"
LC_ADDRESS="en_GB.US-ASCII"
LC_TELEPHONE="en_GB.US-ASCII"
LC_MEASUREMENT="en_GB.US-ASCII"
LC_IDENTIFICATION="en_GB.US-ASCII"
LC_ALL=

The issue has been resolved. However, I get a message that:

LC_CTYPE
LC_MESSAGES
LC_ALL

Cannot be set to default locale, no such file or directory; is this a problem?
Thanks for your replies either way, the problem is solved. Feel free to close the thread upon reply.

When/where do you get that message?

It might indicate that the system-wide locale is set to an unknown one, causing a fallback to “C”.
That would explain why the problem occurred in the first place. (“Default” in systemsettings5 means use the system default)

What does the command “localectl” report?

Feel free to close the thread upon reply.

We don’t close threads here.

I get that message when I run locale, after I changed the regional settings.

localectl says

   System Locale: LANG=en_US.UTF-8
       VC Keymap: us
      X11 Layout: us

Ok, I can reproduce it:

wolfi@linux-lf90:~> LANG=en_GB.US-ASCII locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_GB.US-ASCII
LC_CTYPE="en_GB.US-ASCII"
LC_NUMERIC="en_GB.US-ASCII"
LC_TIME="en_GB.US-ASCII"
LC_COLLATE="en_GB.US-ASCII"
LC_MONETARY="en_GB.US-ASCII"
LC_MESSAGES="en_GB.US-ASCII"
LC_PAPER="en_GB.US-ASCII"
LC_NAME="en_GB.US-ASCII"
LC_ADDRESS="en_GB.US-ASCII"
LC_TELEPHONE="en_GB.US-ASCII"
LC_MEASUREMENT="en_GB.US-ASCII"
LC_IDENTIFICATION="en_GB.US-ASCII"
LC_ALL=

LANG=en_GB.UTF-8 works fine though, and should be preferred anyway.

wolfi@linux-lf90:~> LANG=en_GB.UTF-8 locale
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
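The “No such file or directory” messages suggest that no locale named en_GB.US-ASCII is installed. A rough way to check a name before putting it into $LANG is to look for it in the output of “locale -a”. The helper has_locale below is hypothetical, and the sample list only stands in for real “locale -a” output; glibc lists UTF-8 locales as e.g. “en_GB.utf8”, so both sides are normalized before comparing.

```shell
# Hypothetical helper: succeeds if locale name $1 appears in the list read
# from stdin (normally the output of `locale -a`).
has_locale() {
    # Lowercase and drop hyphens so "en_GB.UTF-8" matches "en_GB.utf8".
    want=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF -- "$want"
}

# Sample list standing in for `locale -a` output (illustrative assumption).
sample='C
POSIX
en_GB.utf8
en_US.utf8'

for name in en_GB.UTF-8 en_GB.US-ASCII; do
    if printf '%s\n' "$sample" | has_locale "$name"; then
        echo "$name: available"
    else
        echo "$name: missing"
    fi
done
```

On a real system the check would be “locale -a | has_locale en_GB.UTF-8”.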

I suppose systemsettings5 tries to preserve the US-ASCII/UTF-8 “setting” when you select a different region (and “C” uses US-ASCII…).

I’m currently not sure how to best circumvent this in the GUI, but you can edit ~/.config/plasma-localerc and ~/.config/plasma-locale-settings.sh directly with a text editor (replace US-ASCII with UTF-8 in both files).
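If you would rather not edit by hand, the same replacement can be sketched with sed. The helper name fix_codeset and the sample line in the scratch file are assumptions for illustration; the sketch keeps a .bak copy of each file it touches. Log out and back in afterwards so the session picks up the change.

```shell
# Hypothetical helper: replace the ".US-ASCII" codeset suffix with ".UTF-8"
# in one file, keeping a backup next to it as <file>.bak.
fix_codeset() {
    [ -f "$1" ] || return 0      # nothing to do if the file does not exist
    sed -i.bak 's/\.US-ASCII/.UTF-8/g' "$1"
}

# Demo on a scratch file (the content is an illustrative assumption).
demo=$(mktemp)
printf 'LANG=en_GB.US-ASCII\n' > "$demo"
fix_codeset "$demo"
result=$(cat "$demo")
echo "$result"
rm -f "$demo" "$demo.bak"

# On a real system one would run instead:
#   fix_codeset ~/.config/plasma-localerc
#   fix_codeset ~/.config/plasma-locale-settings.sh
```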

localectl says

   System Locale: LANG=en_US.UTF-8
       VC Keymap: us
      X11 Layout: us

Hm, that actually looks fine. But then, the settings may be invalid and maybe localectl just displays “something nice” in that case.

What’s actually contained in /etc/locale.conf ?
And what does “locale” say when you log in (as user) in text mode?
(press e.g. Ctrl+Alt+F1 to switch to a text mode console, you can switch back to graphics mode with Ctrl+Alt+F7)

Switched to en_GB.UTF-8, which makes little difference to me, and the error messages have disappeared.

It says LANG=en_US.UTF-8, I guess I can switch to this if it changes anything, but as it stands this works as well.

Running it in a different tty gave no errors before I switched to **en_GB.UTF-8**.

No, en_GB.UTF-8 should be fine too; the .US-ASCII caused the error messages (and UTF-8 has been preferred for years, even over a decade).

I’m just trying to find out why it originally used “C”, which must come from somewhere…

Running it in a different tty gave no errors before I switched to **en_GB.UTF-8**.

Changing Plasma’s settings shouldn’t have any effect on a text mode login.

That’s actually why I asked for locale’s output in text mode, to maybe see if the “problem” (using “C” as locale) is specific to Plasma or on some lower level.

If you are satisfied with having your problem solved, we can of course just as well stop here… :wink:

Seems I was slightly wrong previously anyway: “Default” in Plasma’s settings (which is actually “Default (C)”) does not refer to the system default (this would rather be “No Change”), but it does explicitly set “C” as locale ($LANG).

So the likely cause was just a wrong Plasma setting (for whatever reason).

I just tried a fresh user account here as a test (with LANG=de_AT.UTF-8 and LANG=en_US.UTF-8 in /etc/locale.conf) and didn’t have the problem.
However, the Plasma default was de_AT.UTF-8 in both cases. That’s what I have set in /etc/sysconfig/language (this is an upgraded system). I thought that file wasn’t used anymore (in favor of /etc/locale.conf), but as I have now seen, the “startup scripts” (/etc/profile.d/lang.sh in particular) do still evaluate it and set $LANG accordingly on login.

Maybe your wrong default comes from there?

grep RC_LANG /etc/sysconfig/language

The settings in /etc/locale.conf do seem to get used (as system default) if there are no explicit language settings in /etc/sysconfig/language though.
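The precedence I observed can be sketched roughly as follows. This is not the actual logic of /etc/profile.d/lang.sh, only a simplified model of what I saw: RC_LANG from the sysconfig file wins if it is non-empty, otherwise LANG from locale.conf is used. The helper name effective_lang and the final fallback to “C” are assumptions, and the demo runs on scratch files rather than the real ones.

```shell
# Hypothetical helper: print the locale a login would end up with, given a
# sysconfig-style file ($1) and a locale.conf-style file ($2).
effective_lang() {
    value=""
    if [ -r "$1" ]; then
        # Take the RC_LANG=... line and strip the surrounding quotes.
        value=$(sed -n 's/^RC_LANG=//p' "$1" | tr -d '"' | tail -n 1)
    fi
    if [ -z "$value" ] && [ -r "$2" ]; then
        value=$(sed -n 's/^LANG=//p' "$2" | tr -d '"' | tail -n 1)
    fi
    # Assumed fallback when neither file provides a value.
    printf '%s\n' "${value:-C}"
}

# Demo on scratch files mirroring the two cases from my test.
tmp=$(mktemp -d)
printf 'LANG=en_US.UTF-8\n' > "$tmp/locale.conf"

printf 'RC_LANG="de_AT.UTF-8"\n' > "$tmp/language"
with_rc_lang=$(effective_lang "$tmp/language" "$tmp/locale.conf")

printf 'RC_LANG=""\n' > "$tmp/language"
without_rc_lang=$(effective_lang "$tmp/language" "$tmp/locale.conf")

echo "RC_LANG set:   $with_rc_lang"
echo "RC_LANG empty: $without_rc_lang"
rm -rf "$tmp"
```

With these inputs the first case yields de_AT.UTF-8 and the second falls through to en_US.UTF-8.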

To be honest, I distro-hopped a few times before settling on Leap 15 and did a lot of quick installations, so I might have inadvertently set that locale as default during install and not noticed until using non-standard characters revealed the problem. As far as I’m concerned the issue is solved, and I wouldn’t want to waste anyone’s time anymore :slight_smile: although I learned a few things in the process. Thanks a whole lot for the quick and helpful replies.

Ok.

It may still be a good idea to check the /etc/sysconfig/language settings (or the output of locale when you log in in text mode).
Although it doesn’t really make a difference as long as you don’t create other users or use other desktops (which won’t respect Plasma’s settings).

Oh, and just to be sure: you are not logged in as root, are you?
Root’s default locale ($LANG) is indeed set to “C” on login by the system, regardless of the actual configured system locale. (that’s configurable though, also in /etc/sysconfig/language)