Nautilus not displaying Unicode characters

Hey all,
I’m using openSUSE 11.2 with GNOME dual-booted with Windows 7, been installed from scratch for like a week. The bottom line is: Nautilus displays a series of matrices, "x"s and other symbols instead of characters in Hebrew.

Screenshot:
http://img11.yfrog.com/img11/4959/screenshotkg.png

Now, it worked fine at the beginning but once I started installing updates it went. I installed a whole bunch of updates and programs so I don’t know what changed it.
The weird part is (as you can see in the screenshot) that the shortcut to the left of a Hebrew-named folder shows up correctly only the first time Nautilus opens after starting. So as soon as I closed the Nautilus window after taking the screenshot and reopened it, it also displayed like the others.
The screenshot is of my ntfs Windows drive, however the problem occurs in my home folder as well.

Here’s my fstab anyway:

/dev/disk/by-id/ata-ST9160821AS_5MA727CM-part5 swap                 swap       defaults              0 0
/dev/disk/by-id/ata-ST9160821AS_5MA727CM-part6 /                    ext4       acl,user_xattr        1 1
/dev/disk/by-id/ata-ST9160821AS_5MA727CM-part7 /home                ext4       acl,user_xattr        1 2
/dev/disk/by-id/ata-ST9160821AS_5MA727CM-part2 /windows/C           ntfs-3g    defaults,locale=en_US.UTF-8 0 0
proc                 /proc                proc       defaults              0 0
sysfs                /sys                 sysfs      noauto                0 0
debugfs              /sys/kernel/debug    debugfs    noauto                0 0
usbfs                /proc/bus/usb        usbfs      noauto                0 0
devpts               /dev/pts             devpts     mode=0620,gid=5       0 0

Any help would be extremely appreciated. This is driving me bonkers already!

I suppose that in general it will display Unicode, because even ASCII is a subset of Unicode (look at the .pdf, they are also Unicode). It fails to display certain characters and, as I see it, that is because it does not have a font covering that range of characters. Could you check whether you have a font for those characters installed?

After reading your post again (maybe should have given it more attention first time :shame: ) it looks as if there is at least a font (as you show left). But those rectangles are normally shown when no glyph is available (and they show the Unicode value in hex).

What happens when you change the view from the fancy one to a more formal one in columns?

Thanks for the speedy response.
The problem isn’t with the view format, since it appears the same way using Icon, List or Compact view. I just reread my post and I apparently forgot to mention that trying to rename any file to a Hebrew name, such as שלום.jpeg, results in the following error message:
“The item could not be renamed.
The name “שלום.jpeg” is not valid. Please use a different name.”
Notice the word שלום appears correctly in the error message.
Any other suggestions, please?

Hm, it is all as foggy to me as for you I am afraid :frowning: .

The only thing I do not trust is the involvement of a non-Linux file system in this. Are you sure all this also happens on a pure and genuine Linux file system like ext4, ext3, ext2 or even Reiserfs?

And going back to the basics. What about leaving out the silly file manager and doing a good old

ls -l

from a terminal emulator?

Unfortunately, in terminal I also see just a series of question marks and x’s instead of the filename so maybe it isn’t a problem with Nautilus after all.
Hebrew is, however, added as a secondary language in YaST/Language (with en_US being my primary). Am I missing some settings somewhere?

First I want to know whether all these tests were done on a Linux file system or not. I can not comment on any usage of non-Linux file systems.

The fact that you have a language installed does not mean much to our problem imho. I know the whole is not as simple as one thinks at first. Maybe not even for me. To help both of us here some thoughts about the subject:

. There is a difference between ‘language’ and ‘characters/script/alphabet’. I e.g. have Dutch installed as a language. This means that I (hope to) see messages, help windows, titles in buttons, in short all text on my desktop, in Dutch.

. The characters used in these Dutch texts are stored on the system as numbers. Now the characters used for Dutch are in general the same as those used in English, and their numbers are defined in ASCII, or by extension in ISO-8859-1, which again is a subset of Unicode.

. When these characters are displayed on a screen (or paper) there must be glyphs connected to each of those numbers; those glyphs are gathered into sets we call fonts for short. When there is no font, the number may be on disk, but it can not be displayed correctly.

. How do we get those numbers into the system? That is often done using a keyboard and as you are aware of (I suppose you are able to enter Hebrew via a keyboard) there are different solutions for that: special keyboards, special key-bindings.

. Storage. How do we store the numbers? ASCII is simple: nowadays an ASCII character (0 - 127) is stored in a byte and the first bit is unused. The same holds when ISO-8859-1 characters are stored as ISO-8859-1 (this sounds strange, but the different ISO-8859-x standards do not yet separate the character table from the encoding), except that the whole byte (0 - 255) is used.
Unicode can not be stored in one byte for most of its characters. Thus an encoding is needed, and UTF-8 is the one we are talking about. When you encode in UTF-8, even the second half of the ISO-8859-1 set (128 - 255) is encoded in two bytes, so there is already a difference between an ë in ISO-8859-1 encoding (235) and in UTF-8 (195:171).

. Nowadays Linux is very much Unicode/UTF-8. That means not only that editors, etc. generate it as you type, but also that the file systems store the names of directories/files in it. That does not rule out file systems that still use a one-byte-per-character encoding and thus accommodate at most the ISO-8859-x sets.
I remember someone having an old Linux file system with this problem, and I have no idea which MS file systems do or do not support Unicode/UTF-8, or from when on.
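The storage point above can be checked with a few lines of Python. This is only a sketch to illustrate the byte values; `byte_values` is a helper name made up for this example, and the Hebrew word is the one used elsewhere in this thread.

```python
def byte_values(text, encoding):
    """Return the stored bytes of `text` in `encoding` as decimal numbers."""
    return list(text.encode(encoding))


# One byte in ISO-8859-1, two bytes in UTF-8 for the same character.
print(byte_values("ë", "iso-8859-1"))  # [235]
print(byte_values("ë", "utf-8"))       # [195, 171]

# Each Hebrew letter takes two bytes in UTF-8, so four letters give eight bytes.
print(len(byte_values("שלום", "utf-8")))  # 8
```

Hebrew has no place in ISO-8859-1 at all, so asking Python for `"שלום".encode("iso-8859-1")` raises a `UnicodeEncodeError` instead of returning bytes.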

The above is of course a lot of theory, but it does not explain why you sometimes see file names displayed correctly and sometimes not, by the same program in the same place (if I recapitulate your experience correctly).
The fact that ls shows strange characters would suggest to me that there are Unicode/UTF-8 file names, but that ls is not aware of it (it interprets the individual bytes). Which would lead me to the idea that the file system is advertised (at mount time) as being NOT Unicode/UTF-8, while in fact it is. That would be strange with a Linux fs (but I could understand it with a non-Linux fs, where you have to arrange this by mounting it correctly).
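What "interpreting the individual bytes" looks like can be simulated in Python. This is a rough sketch, not a reproduction of what ls or Nautilus do internally: it takes the UTF-8 bytes of the file name from this thread and decodes them with the wrong character set.

```python
# The bytes a UTF-8 aware program would write to disk for this file name.
name_on_disk = "שלום.jpeg".encode("utf-8")

# Read back as ISO-8859-1: every byte becomes one (wrong) character.
# The first byte of each Hebrew letter is 0xD7, which is the multiplication
# sign in ISO-8859-1 and looks a lot like an "x".
as_latin1 = name_on_disk.decode("iso-8859-1")

# Read back as plain ASCII: every byte above 127 is unrepresentable and
# gets replaced, much like the question marks a terminal may show.
as_ascii = name_on_disk.decode("ascii", errors="replace")

# Read back as UTF-8: the original name survives.
as_utf8 = name_on_disk.decode("utf-8")

print(repr(as_latin1))
print(repr(as_ascii))
print(as_utf8)
```

If the symbols in the screenshots are of this kind, the names on disk are probably fine and only the interpretation is off, but that is an assumption until the actual bytes have been checked.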

Thanks for the lengthy reply.
I copied the contents of the folder shown in the first screenshot to a new directory on my Linux drive just to make 100% sure.
Here’s a new screenshot of the new folder:

http://img406.imageshack.us/img406/8238/screenshotby.png

As you can see, the same problem occurs here (also when checking in terminal with “ls -l” as you suggested earlier). I also recreated the error message I spoke about earlier and, as you can see in the shortcut frame to the left, the previously Hebrew shortcut has now become matrices.
Thanks again for taking the time to help me.

I just did something that must be similar to what you did. I made a file with Devanagari characters as its file name. It shows as

henk@boven:~/vreemd> ls -l
totaal 0
-rw-r--r-- 1 henk wij 0 mei  7 12:08 धआ
henk@boven:~/vreemd>

And when I use the file manager (Dolphin in KDE4) it shows also correct.
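To separate what is stored on disk from how it is displayed, one could look at the raw bytes of the file names. The following is a minimal sketch under the assumption that Python 3 is available; `is_valid_utf8` and `report_names` are names invented for this example.

```python
import os


def is_valid_utf8(raw_name):
    """Return True if the raw bytes of a file name decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def report_names(directory="."):
    """Print every entry of `directory` as raw bytes plus a UTF-8 verdict."""
    # Passing a bytes path makes os.listdir return the names as bytes,
    # untouched by any locale setting.
    for raw_name in os.listdir(os.fsencode(directory)):
        verdict = "valid UTF-8" if is_valid_utf8(raw_name) else "NOT valid UTF-8"
        print(raw_name, verdict)


if __name__ == "__main__":
    report_names(".")
```

If all names come out as valid UTF-8 while ls and the file manager still show garbage, the file system is storing the names correctly and the problem sits in how they are interpreted.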

Did you fix it somehow?

Another thought: Does my problem have anything to do with the System/Environment/Language settings in /etc/sysconfig at all? I’m kinda shooting blind in the dark here. I have no clue what to try anymore.
Can anybody else maybe think of something?

rel dude wrote:
> Did you fix it somehow?

maybe when Henk reads your post he will have a different answer, but i
think his post means:

the problem you see does not exist on a properly setup and maintained
system…

that is, there is something about your system which does not match his
fully updated system…

HOWEVER, i note that you are using Nautilus in Gnome and Henk is using
Dolphin in KDE4…so, that alone might be the difference…

HOWEVER-2: i guess that not everyone in the world using Nautilus in
Gnome is having this problem, so i guess there is a setup problem in
your machine…

the key to your current situation is held in this line you wrote “Now,
it worked fine at the beginning but once I started installing updates
it went. I installed a whole bunch of updates and programs so I
don’t know what changed it.” which means to me that somehow something
or someone (you?) did something to mess it up…

the way to fix it then is to (somehow) return to the machine state
when it was “working fine”…you can do that (maybe) by undoing
everything you did to kill it…

one thing you can try is to use YaST to make a new user, lets say add
the user named Test…then log out of your account and into the
account of Test and see if it works correctly there…if so, you then
know the problem lies somewhere in YOUR HOME…that is something
inside one of the hidden setup files in /home/[you] is wrong, while
inside /home/Test it is good…

with lots of patience and using a program like diff you can find the
error…but take LOTS of patience…
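The comparison step can also be done with Python's difflib instead of the diff program. This is just a sketch: `config_diff` is a made-up helper, the two sample texts are invented placeholders, and reading the real files out of the two home directories is left out.

```python
import difflib


def config_diff(text_a, text_b, name_a="a", name_b="b"):
    """Return the unified diff between two config file contents as a list of lines."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical contents of the same hidden file in two home directories.
mine = "[Section]\nOption=one\nOther=same\n"
test_user = "[Section]\nOption=two\nOther=same\n"

for line in config_diff(mine, test_user, "mine", "test"):
    print(line)
```

Lines starting with "-" exist only in the first text and lines starting with "+" only in the second, so identical files give an empty list.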


DenverD (Linux Counter 282315)
CAVEAT: http://is.gd/bpoMD
posted via NNTP w/TBird 2.0.0.23 | KDE 3.5.7 | openSUSE 10.3
2.6.22.19-0.4-default SMP i686
AMD Athlon 1 GB RAM | GeForce FX 5500 | ASRock K8Upgrade-760GX |
CMedia 9761 AC’97 Audio

As DenverD already guessed, I have not fixed anything. I only said that I do not have the problem using KDE 4 and Dolphin and, what is more, I showed that the CLI (the ls command in this case) works as I would expect.

And as said before, it has nothing to do with language settings. It has to do with the capability of the individual file system to store file names in UTF-8 and the ability of any software that interprets those strings to do so as UTF-8.
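Whether a program treats strings as UTF-8 is usually decided by the locale environment it runs in. On POSIX systems LC_ALL takes precedence over LC_CTYPE, which takes precedence over LANG, when it comes to character handling. A small sketch to inspect that (the two function names are invented for this example, and the codeset check is a simplification):

```python
import os


def effective_ctype_locale(env):
    """Return the locale name that decides character handling.

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    Empty values count as unset; "C" is the default when nothing is set.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def uses_utf8(locale_name):
    """Return True if the locale name carries an explicit UTF-8 codeset."""
    # "en_US.UTF-8@euro" -> codeset "UTF-8"; "en_US" -> no codeset at all.
    _, _, codeset = locale_name.partition(".")
    codeset = codeset.split("@", 1)[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    current = effective_ctype_locale(os.environ)
    print(current, "-> UTF-8" if uses_utf8(current) else "-> not UTF-8")
```

Running this once from a terminal inside the affected session and once inside a session that works would show whether the two sessions differ in this respect.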

You said you installed a bunch of updates, and DenverD suspects with reason that the problem appeared after this (as you stated yourself). I think it is indeed time to check for yourself what that “bunch of updates” was. Especially from which repos, and whether you stick to the four (4) standard repos or have a whole bunch of others enabled (there are several threads about terribly messy repo management running at this very moment).

Thank you both for your time and effort. It looks like I solved the problem.
After creating a new “test” user, as DenverD suggested, I saw that Nautilus was showing the filenames perfectly and I could edit them as well. So after comparing the two home folders I narrowed it down to a difference in the .dmrc file.

The test user’s .dmrc appeared as follows:


[Desktop]
Session=gnome
Language=C

While mine was:

[Desktop]
Session=gnome
Language=en_US
Layout=us

I noticed that in my Language field there was no mention of UTF-8 so I tried adding it so that it looked like this:

[Desktop]
Session=gnome
Language=en_US.UTF-8
Layout=us

That did the trick! It now works fine. Logging out and back in just to make sure proved that it solved the problem.
Thank you both again for your help. Couldn’t have done it without you.
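For anyone who wants to check their own file, here is a small sketch that reads the Language entry from .dmrc text with Python's configparser. The function names are made up for this example. How the display manager actually applies the value is not something the sketch knows about; it only reports whether the entry carries an explicit UTF-8 codeset.

```python
import configparser


def dmrc_language(dmrc_text):
    """Return the Language value from the [Desktop] section, or None if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(dmrc_text)
    return parser.get("Desktop", "Language", fallback=None)


def has_explicit_utf8(language):
    """Return True if a Language value such as 'en_US.UTF-8' names UTF-8."""
    if language is None:
        return False
    return language.lower().endswith((".utf-8", ".utf8"))


# The two versions of the file quoted in this post.
BEFORE = "[Desktop]\nSession=gnome\nLanguage=en_US\nLayout=us\n"
AFTER = "[Desktop]\nSession=gnome\nLanguage=en_US.UTF-8\nLayout=us\n"

for label, text in (("before", BEFORE), ("after", AFTER)):
    language = dmrc_language(text)
    print(label, language, has_explicit_utf8(language))
```

To run it against a real file, the text could be read with `open(os.path.expanduser("~/.dmrc")).read()` after importing `os`.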

You are welcome. It must raise your self-confidence that you managed to find it after only a hint :wink:

And I also learned about .dmrc (still do not know what it is for, but hope that somewhere in the back of my mind it comes up when needed).

BTW my .dmrc only contains

[Desktop]
Session=default

hcvv wrote:
> And I also learned about .dmrc (still do not know what it is for, but
> hope that somewhere in the back of my mind it comes up when needed).

i do not know what it is for either, but mine only contains

[Desktop]
Session=kde

