Create user with Unicode username

Hi,

I’m trying to create a Unicode username “вася”, but it seems openSUSE 12.3 does not allow that:

http://img-fotki.yandex.ru/get/5635/73339514.26/0_7fb75_4a9cc83d_orig

useradd also fails:

useradd: invalid user name ‘вася’

But Ubuntu seems to allow this:

useradd вася

id вася

uid=1001(вася) gid=1001(вася) groups=1001(вася)

Can anybody explain why openSUSE has this restriction?

Thanks

Hm, this is from the useradd man page:

The account name must begin with an alphabetic character and the rest of the string should be from the POSIX portable character class ([A-Za-z_][A-Za-z0-9_-.]*[A-Za-z0-9_-.$]).

Which says the same as the error message.
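To see why вася is rejected, the quoted pattern can be tried out directly. This is a minimal Python sketch, not how useradd itself is implemented. It assumes the pattern is meant with the hyphen as a literal character (so the hyphen is moved to the end of each bracket expression) and with an optional last character; the function name `is_valid_posix_name` is made up for illustration.

```python
import re

# Assumed reading of the man page pattern. The hyphen is placed last in each
# bracket expression so it is a literal, and the final class is optional.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*[A-Za-z0-9_.$-]?")

def is_valid_posix_name(name: str) -> bool:
    """Return True if the whole name matches the ASCII-only pattern."""
    return NAME_RE.fullmatch(name) is not None

if __name__ == "__main__":
    for candidate in ("vasya", "вася"):
        print(candidate, is_valid_posix_name(candidate))
```

Under these assumptions, any name containing a non-ASCII letter fails at the very first character, which fits the "invalid user name" message.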

The man page of the passwd file says:

name
This is the user’s login name. It should not contain capital letters.

Which is quite different :frowning:

I do not have Ubuntu at hand here, but could you check what its man page says? And are those characters really stored in the password file as UTF-8 encoded Unicode?

On 03/19/2013 03:46 PM, tosiara wrote:
>
> Can anybody explain - why openSUSE has this restriction?

not all possible openSUSE system languages consider the kinds of letters
you are using as “legal” for file or directory names…what system
language are you using?

that is, what do you see if you put this into a user terminal:

env | grep LANG


dd
http://tinyurl.com/DD-Caveat

Ubuntu 12.10:

# useradd вася
# grep -n вася /etc/passwd
45:вася:x:1003:1003::/home/вася:/bin/sh

my openSUSE 12.3 has:

# env | grep LANG
LANG=en_US.UTF-8
LANGUAGE=

Sorry, some misunderstanding, but listing what Ubuntu says is no answer to my question of how it is stored internally in the /etc/passwd file. I guess that would involve an od of it.

Also, what does the man page say? (If it is in Russian, please tell us what you understand of it.)

Try changing CHARACTER_CLASS in /etc/login.defs. I would be interested in your experience, where it breaks :slight_smile:

On 03/19/2013 05:26 PM, tosiara wrote:
>
> LANG=en_US.UTF-8
>

you could try changing your system language to Russian and see if
that solves the problem…


dd
http://tinyurl.com/DD-Caveat

On 2013-03-19 17:26, hcvv wrote:
> how it is stored internaly in the /etc/passwd
> file

“file /etc/passwd” would probably say, but the obvious guess is “UTF”.


Cheers / Saludos,

Carlos E. R.
(from 12.1 x86_64 “Asparagus” at Telcontar)

> you could try changing your system language to Russian and see if that solves the problem

No, this didn’t help.

> Try changing CHARACTER_CLASS in /etc/login.defs. I would be interested in your experience, where it breaks :slight_smile:

That worked! I added Cyrillic symbols and was able to create a new user with “useradd”. YaST still refuses and complains about invalid characters.

> my question how it is stored internally in the /etc/passwd file

Here is an xxd dump from openSUSE after creating the user вася2:

grep вася2 /etc/passwd | xxd
0000000: d0b2 d0b0 d181 d18f 323a 783a 3130 3032  ........2:x:1002
0000010: 3a31 3030 3a3a 2f68 6f6d 652f d0b2 d0b0  :100::/home/....
0000020: d181 d18f 323a 2f62 696e 2f62 6173 680a  ....2:/bin/bash.
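The first nine bytes of that dump are the login name field. As a small cross-check (a Python sketch, not part of the original session), they can be decoded as UTF-8 and compared with the expected name:

```python
# First 9 bytes of the xxd dump above, i.e. everything before the first ':'.
raw = bytes.fromhex("d0b2d0b0d181d18f32")

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
name = raw.decode("utf-8")
print(name)

# Show the code point behind each character (two bytes per Cyrillic letter,
# one byte for the ASCII digit).
print([f"U+{ord(ch):04X}" for ch in name])
```

Each Cyrillic letter takes two bytes and the trailing digit one, so the name is stored as plain UTF-8.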

Here is a proof of concept:


$ mail 
No mail for вася2

$ telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 extsmtp.test.lan ESMTP
helo localhost
250 extsmtp.test.lan
mail from:<вася2@extsmtp>
250 2.1.0 Ok
rcpt to:<вася2@extsmtp>
250 2.1.5 Ok
data
354 End data with <CR><LF>.<CR><LF>
Subject: Тест
From: вася2 <вася2@extsmtp>
To: вася2

абвгдеёжз

.
250 2.0.0 Ok: queued as EDCA32351B
quit
221 2.0.0 Bye
Connection closed by foreign host.

$ mail
Heirloom mailx version 12.5 7/5/10.  Type ? for help.
"/var/spool/mail/вася2": 1 message 1 new
>N  1 вася2@extsmtp.test Wed Mar 20 10:11   16/555   Тест
? type
Message  1:
From вася2@extsmtp.test.lan  Wed Mar 20 10:11:10 2013
X-Original-To: вася2@extsmtp
Delivered-To: вася2@extsmtp.test.lan
Subject: Тест
From: вася2 <вася2@extsmtp.test.lan>
To: вася2@extsmtp.test.lan
Date: Wed, 20 Mar 2013 10:10:07 +0200 (EET)

абвгдеёжз

?

As Carlos suggested, the file tool will tell you whether it finds legal UTF-8 byte sequences in a file that it otherwise classifies as text (but always remember that file “only” makes intelligent guesses).

I copy/pasted your вася into a file:

henk@boven:~/test> cat rr
вася
henk@boven:~/test> file rr
rr: UTF-8 Unicode text
henk@boven:~/test>

I could have started to decode the hex you offered me, but unless you really want me to do that, I think this says it. Try

file /etc/passwd

Whether all software is able to handle UTF-8 correctly is a different matter. But as UTF-8 has been supported for some time already, I guess the majority of system software can handle it.

YaST apparently still does not allow this. I really wonder how people using openSUSE in countries with Cyrillic, Arabic, etc. scripts enter their users with YaST (or even useradd).

Looking around with Google, it seems that different distros have different ideas about the allowed characters. :frowning:

# file /etc/passwd
/etc/passwd: UTF-8 Unicode text

Fine, as expected.

The main thing to worry about now is whether you have laid a bomb somewhere that will go off unexpectedly in the future, when some software that uses the username turns out not to be UTF-8 ready. That is difficult to predict, but keep it in mind.

BTW, I wonder if this would be worth a bug report: cannot enter Cyrillic usernames through YaST and/or useradd. Simply to find out what the answer is. The answer could shed light on possible pitfalls.

https://bugzilla.novell.com/show_bug.cgi?id=810482

What I miss there is what you changed in /etc/login.defs.

I do not know what exactly you did, but does it allow all Unicode characters (except the few that are not allowed because they would create havoc when used in a username, like : and @), or did you only solve it for Cyrillic?

I only did it for my specific test-case:

CHARACTER_CLASS    [ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_в]асяABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.-]*[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.$-]\?

Thanks, but I meant that you could add something to the bug report. You could say:

I did the following for Cyrillic, but I do not know how to do this for all alphabetics in Unicode.

They could then decide whether this is the right way to solve it or whether something completely different is required.
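For illustration only, the "all alphabetics in Unicode" idea can be sketched in Python, where `str.isalpha()` and `str.isalnum()` already know about letters and digits in every script. This is not what useradd or CHARACTER_CLASS does; the function name `is_acceptable_name` and the exact list of forbidden characters are assumptions made for this sketch.

```python
# Characters that would break /etc/passwd fields, paths or mail addresses
# (an assumed, illustrative list; not taken from any tool).
FORBIDDEN = set(":@/")
# Punctuation that the ASCII-only pattern already allows after the first character.
EXTRA = set("_.-")

def is_acceptable_name(name: str) -> bool:
    """Accept names that start with any Unicode letter (or '_') and continue
    with Unicode letters, digits or the EXTRA punctuation."""
    if not name:
        return False
    if any(ch in FORBIDDEN or ch.isspace() for ch in name):
        return False
    first, rest = name[0], name[1:]
    if not (first.isalpha() or first == "_"):
        return False
    return all(ch.isalnum() or ch in EXTRA for ch in rest)

if __name__ == "__main__":
    for candidate in ("vasya", "вася2", "2вася", "ва:ся"):
        print(candidate, is_acceptable_name(candidate))
```

A rule like this accepts Cyrillic, Arabic or Chinese names alike, but it says nothing about whether the rest of the system copes with them.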

But that is only my idea. The important thing is that the bug report exists. Thanks.
I really like the subject.

One more update:


$ mail -s абвгд вася2@extsmtp 
вася2@extsmtp contains invalid character '\320'
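The '\320' in that message appears to be an octal escape for a single byte. A short Python sketch (an illustration, not taken from mailx) shows that it is the first byte of the UTF-8 encoding of в, which lies outside 7-bit ASCII:

```python
# First byte of the UTF-8 encoding of the first letter of вася2.
first_byte = "в".encode("utf-8")[0]

# Print the byte in octal and in hex to compare with the error message
# and with the xxd dump.
print(oct(first_byte), hex(first_byte))

# True means the byte is outside the 7-bit ASCII range.
print(first_byte > 0x7F)
```

So the complaint seems to be about the first non-ASCII byte of the address rather than about the character as a whole.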

:frowning:

But mutt works fine!


From root@extsmtp.test.lan  Wed Mar 20 16:03:48 2013
X-Original-To: вася2@extsmtp
Delivered-To: вася2@extsmtp.test.lan
Date: Wed, 20 Mar 2013 16:03:48 +0200
From: root <root@ptr-extsmtp.localdomain>
To: вася2@extsmtp.test.lan
Subject: абвгд
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.5.21 (2010-09-15)

test

:wink:

Well … e-mail addresses have been ASCII-only from day one. It will take years before RFC 6531 is universally implemented (if ever).

What will you do with an e-mail address given to you in Chinese? Are you even able to enter it on a Russian keyboard?

I agree, using Unicode in a global network is a bad idea (there are characters that look similar but have different codes, like ‘c’ and ‘с’). But I was only looking for an in-company solution, and I’m happy that my favorite Linux offers me the choice!
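The look-alike problem is easy to demonstrate. A minimal Python sketch using the standard `unicodedata` module prints the code point, official name and UTF-8 bytes of the two letters mentioned above:

```python
import unicodedata

latin_c = "\u0063"      # the Latin letter
cyrillic_es = "\u0441"  # the Cyrillic letter that looks the same

for ch in (latin_c, cyrillic_es):
    # Code point, Unicode character name and UTF-8 bytes in hex.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex())

# The two strings compare as different even though they render alike.
print(latin_c == cyrillic_es)
```

Two usernames that differ only in such a letter would be distinct accounts while looking identical on screen.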