UTF-8 Encoding Change

To make forum life easier for everyone going forward, we have changed all forum languages to a uniform UTF-8 encoding. The default language is English UTF-8. We did not, however, convert the existing database (i.e. existing posts).

When reading old messages in UTF-8 format, some characters will not translate and you will see strange characters such as a diamond with a question mark. This will affect some languages more than others. If you want to read older messages that aren’t readable in UTF-8 format, go to the lower right of any forum page and temporarily change your language to a non-UTF-8 language. Please don’t post in non-UTF-8 languages, as these posts will not be readable when the non-UTF-8 languages are eventually removed.
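The diamond with a question mark is the Unicode replacement character (U+FFFD). As a minimal sketch of why it appears, assume an old post was stored as ISO-8859-1 bytes. The sample word and the `read_as` helper below are illustrative only and are not taken from the forum software:

```python
# Hypothetical sample: an old post stored in ISO-8859-1 (one byte per character).
OLD_POST_BYTES = "señor".encode("iso-8859-1")  # b'se\xf1or'


def read_as(stored: bytes, charset: str) -> str:
    """Decode stored bytes, substituting U+FFFD for anything undecodable."""
    return stored.decode(charset, errors="replace")


# Read as UTF-8: the lone byte 0xF1 is not a valid UTF-8 sequence,
# so it is replaced by U+FFFD.
as_utf8 = read_as(OLD_POST_BYTES, "utf-8")

# Read in the encoding it was written in: the text comes back intact.
as_latin1 = read_as(OLD_POST_BYTES, "iso-8859-1")
```

Temporarily switching the forum language to a non-UTF-8 one corresponds to the second call: the stored bytes are unchanged, only the way they are interpreted differs.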

If you want to set your default language to something other than English, you can do this via the SETTINGS tab at the top of the forums. Go to SETTINGS / MY SETTINGS / MY ACCOUNT / GENERAL SETTINGS / FORUM LANGUAGE.

Our goal is to get everyone posting in UTF-8 format so that our database will soon be mostly UTF-8 and encoding issues will slowly fade.

When you change your language to a UTF-8 one, even the home page of these forums shows readable Greek, Magyar, etc.

HURRAY!

The difference between the various UTF-8 languages is the translations of the buttons, links, text, etc. for the forums. Each language has its own translations. However, as you said, FINALLY, you can read any forum in any UTF-8 language. Because we didn’t convert our entire database, it’s not perfect, but it’s much better than what we had.

تجربة اللغة العربيه
اوبن سوز

The two lines above are written in Arabic just to test the retrieval of Arabic characters. I find no need to change anything; I think the database itself is UTF Unicode…
Great work.

Anything posted from today on will be UTF-8, such as your Arabic above. Messages posted prior to today may have missing characters when read in UTF-8, as the encoding of those messages is still what it was when they were posted. Over time, as people post, the UTF-8 messages will grow in the database until the most valuable data is in UTF-8.

Hooray, a long awaited change. Well done and thanks!

On 2011-05-09 22:06, mojtabanow wrote:
>
> تجربة اللغة العربيه
> اوبن سوز
> the two line upove written in arabic just to test the retrival of the
> arabic characters, i fing no need to change any thing, i think the
> database itself is UTF uniocode…
> great work

The NNTP side needs fixing; the above is displayed as garbage:

Content-Type: text/plain; charset=ISO-8859-1


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)
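
The garbage described above is what a reader gets when UTF-8 bytes are interpreted under a `charset=ISO-8859-1` label. A minimal sketch, assuming the post body reaches the client as unmodified UTF-8 bytes (the variable names are illustrative):

```python
ARABIC_TEST = "تجربة اللغة العربيه"

# Bytes as they travel to the client when the post is encoded in UTF-8.
wire_bytes = ARABIC_TEST.encode("utf-8")

# What a client displays if it trusts a charset=ISO-8859-1 header:
# every single byte is shown as its own Western European character.
garbled = wire_bytes.decode("iso-8859-1")

# ISO-8859-1 maps each byte to exactly one character, so nothing is lost.
# Re-encoding the garbage and decoding it as UTF-8 recovers the text.
recovered = garbled.encode("iso-8859-1").decode("utf-8")
```

The text itself is undamaged in transit; only the label is wrong, which is why manually switching the client to UTF-8 makes the message readable.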

Try changing your default language to English UTF-8 and all should be revealed. ;=)

Thanks a lot, Kim & all others involved!!! Much better, a relief for the eye.

On 05/10/2011 03:36 AM, sleuth97 wrote:
>
> Try changing your default language to English UTF-8 and all should be
> revealed. ;=)

have you actually tried that on the “nntp side” ?

i have, and “all should be revealed” is not in Thunderbird…which news
client do you use which reveals all??


CAVEAT: http://is.gd/bpoMD
[openSUSE11.3 + KDE4.5.5 + Firefox3.6.17 + Thunderbird3.1.10 via NNTP]
HACK Everything → http://www.youtube.com/watch?v=j5b4CCe9pS8&NR=1

Carlos E. R. wrote:
> On 2011-05-09 22:06, mojtabanow wrote:
>> تجربة اللغة العربيه
>> اوبن سوز
>> the two line upove written in arabic just to test the retrival of the
>> arabic characters, i fing no need to change any thing, i think the
>> database itself is UTF uniocode…
>> great work
>
> The nntp side needs fixing, the above is displayed as garbage:
>
> Content-Type: text/plain; charset=ISO-8859-1

+1 Please fix NNTP. The messages are being sent with the wrong content
type as Carlos says.

NNTP is not broken. If your NNTP client is set to UTF-8, you will be able to read all characters posted in UTF-8. The default setting for most newsreaders is Western European (ISO-8859-1), which is incompatible with UTF-8 for many characters.

Hey mojtabanow,

>تجربة اللغة العربيه
>اوبن سوز
>the two line upove written in arabic just to test the retrival of the
>arabic characters

Read and quoted and posting this reply from my NNTP client to show that
it works.


Kim - 5/10/2011 6:59:47 AM

One caveat: the web forum software’s default encoding is still ISO-8859-1, so it will send that as the encoding string to NNTP clients. If your client reads that and changes to Western European encoding, you’ll have to change it back to UTF-8 to read the message.

kgroneman wrote:
> NNTP is not broken. If your NNTP client is set to UTF-8, you will be
> able to read all characters posted in UTF-8. The default setting for
> most news readers is Western European which isn’t compatible with UTF-8
> for many characters with ISO encoding. In other words, if you’re not
> able to read the special characters in your newsreader, it’s your
> newsreader that is the problem.

I don’t think so. The NNTP feed was and perhaps still is broken. The
source of the message that Carlos and I commented on and which didn’t
display correctly:

Code:

Path: kozak.provo.novell.com!kortar.provo.novell.com!kovat.provo.novell.com.POSTED!18036d74!not-for-mail
From: mojtabanow <mojtabanow@no-mx.forums.opensuse.org>
Newsgroups: opensuse.org.news.announcements
Subject: Re: UTF-8 Encoding Change
Message-ID: <mojtabanow.4thkwd@no-mx.forums.opensuse.org>
Organization: forums-opensuse.provo.novell.com
User-Agent: vBulletin USENET gateway
X-Originating-IP: 196.1.230.107
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
References: <kgroneman.4thdbi@no-mx.forums.opensuse.org>
 <hcvv.4thfc9@no-mx.forums.opensuse.org>
 <kgroneman.4thfca@no-mx.forums.opensuse.org>
Lines: 15
Date: Mon, 09 May 2011 20:06:03 GMT
NNTP-Posting-Host: 137.65.225.252
X-Trace: kovat.provo.novell.com 1304971563 137.65.225.252 (Mon, 09 May 2011 14:06:03 MDT)
NNTP-Posting-Date: Mon, 09 May 2011 14:06:03 MDT
Xref: kortar.provo.novell.com opensuse.org.news.announcements:1165

تجربة اللغة العربيه
اوبن سوز
the two line upove written in arabic just to test the retrival of the
arabic characters, i fing no need to change any thing, i think the
database itself is UTF uniocode…
great work


mojtabanow

mojtabanow’s Profile: http://forums.opensuse.org/member.php?userid=58064
View this thread: http://forums.opensuse.org/showthread.php?t=459552

And the code of your message, which does display correctly:
Code:

Path: kozak.provo.novell.com!kortar.provo.novell.com!kovat.provo.novell.com.POSTED!53ab2750!not-for-mail
From: "kgroneman" <kgroneman@novell.com>
Subject: Re: UTF-8 Encoding Change
Newsgroups: opensuse.org.news.announcements
References: <kgroneman.4thdbi@no-mx.forums.opensuse.org>
 <hcvv.4thfc9@no-mx.forums.opensuse.org>
 <kgroneman.4thfca@no-mx.forums.opensuse.org>
 <mojtabanow.4thkwd@no-mx.forums.opensuse.org>
User-Agent: XanaNews/1.19.1.269
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Lines: 12
Message-ID: <VFayp.836$CL.707@kovat.provo.novell.com>
Date: Tue, 10 May 2011 13:00:37 GMT
NNTP-Posting-Host: 76.23.10.66
X-Trace: kovat.provo.novell.com 1305032437 76.23.10.66 (Tue, 10 May 2011 07:00:37 MDT)
NNTP-Posting-Date: Tue, 10 May 2011 07:00:37 MDT
Xref: kortar.provo.novell.com opensuse.org.news.announcements:1173

Hey mojtabanow,

>تجربة اللغة العربيه
>اوبن سوز
>the two line upove written in arabic just to test the retrival of the
>arabic characters

Read and quoted and posting this reply from my NNTP client to show that
it works.


Kim - 5/10/2011 6:59:47 AM

Note that the two messages were SENT with different content types,
which our newsreaders are doing their best to honour.
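
The difference between the two messages can be checked mechanically. A small sketch using Python's standard `email` package; the header excerpts are copied from the two dumps above, while the constant and function names are illustrative:

```python
from email import message_from_string
from typing import Optional

# Header excerpt from the message posted through the web forum gateway.
FORUM_HEADERS = (
    "User-Agent: vBulletin USENET gateway\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
)

# Header excerpt from the message posted directly with a newsreader.
NEWSREADER_HEADERS = (
    "User-Agent: XanaNews/1.19.1.269\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
)


def declared_charset(raw_headers: str) -> Optional[str]:
    """Return the lower-cased charset parameter of Content-Type, if any."""
    return message_from_string(raw_headers).get_content_charset()


forum_charset = declared_charset(FORUM_HEADERS)
newsreader_charset = declared_charset(NEWSREADER_HEADERS)
```

A newsreader that honours the headers will therefore decode the first message as ISO-8859-1 and the second as UTF-8, regardless of what the bodies actually contain.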

kgroneman wrote:
> One caviat. The web forum software default encoding is still ISO-8850-1
> so it will send that as an enoding string to NNTP clients. If your
> client reads that and changes to western European encoding, you’ll have
> to change it back to UTF-8 to read.

Err, so what you’re saying is that the web forum sends broken content to
the NNTP clients! Which is what Carlos and I and DenverD are saying!

In any case that’s an incomplete statement. Please check my previous
message. Sometimes UTF-8 content is sent with ISO-8859-1 content type
and sometimes with utf-8.

If you’re suggesting I should somehow instruct my news client to treat
just this forum in some special way to account for its brokenness, I
suggest rather that the forum software and/or NNTP gateway be set to send
the correct headers for the content type. Only it has any chance of
knowing what the correct value is.
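
As a rough sketch of what sending the correct headers could look like on the sending side, the following uses Python's standard `email` package to build an article whose headers are derived from the same call that encodes the body. This is not how the vBulletin gateway works internally; `build_post` and the address are hypothetical:

```python
from email.message import EmailMessage


def build_post(author: str, subject: str, body: str) -> EmailMessage:
    """Build an article whose headers match its UTF-8 body."""
    msg = EmailMessage()
    msg["From"] = author
    msg["Subject"] = subject
    # set_content() encodes the body and writes the matching Content-Type
    # and Content-Transfer-Encoding headers in the same step, so the label
    # cannot drift away from the actual encoding.
    msg.set_content(body, charset="utf-8", cte="8bit")
    return msg


post = build_post(
    "poster@example.invalid",  # placeholder address
    "Re: UTF-8 Encoding Change",
    "تجربة اللغة العربيه\nاوبن سوز\n",
)
```

The point of the design is that the component that encodes the text is also the one that writes the charset label, which is the only place where the correct value is known.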

On 2011-05-10 03:36, sleuth97 wrote:
>
> Try changing your default language to English UTF-8 and all should be
> revealed. ;=)

That is not possible in NNTP. You can do that for the current message in
Thunderbird, but it does not stick: you have to do it for every single
message, every single time.


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)
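
A client-side workaround that avoids the per-message override would be to test whether the body is well-formed UTF-8 before trusting the header. This is only a heuristic sketch, not a description of what Thunderbird or any other newsreader does; `decode_body` is a hypothetical helper:

```python
from typing import Tuple


def decode_body(body: bytes, declared: str) -> Tuple[str, str]:
    """Return (text, charset actually used) for a message body."""
    try:
        # Strict decoding raises if the bytes are not well-formed UTF-8.
        return body.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not UTF-8, so fall back to whatever the header claims.
        return body.decode(declared, errors="replace"), declared


# A UTF-8 body mislabelled as ISO-8859-1 is still decoded correctly.
mislabelled = decode_body("اوبن سوز".encode("utf-8"), "iso-8859-1")

# A genuine ISO-8859-1 body fails the UTF-8 check and uses the header.
genuine = decode_body("señor".encode("iso-8859-1"), "iso-8859-1")
```

The heuristic relies on real ISO-8859-1 text rarely forming valid UTF-8 byte sequences. It can still guess wrong, so correct headers from the server remain the proper fix.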

On 2011-05-10 15:01, kgroneman wrote:
>
> NNTP is not broken. If your NNTP client is set to UTF-8, you will be
> able to read all characters posted in UTF-8.

No, you are wrong. The server is sending the posts in UTF-8 encoding but
declaring them as ISO-8859-1. The clients display them as they must, as
ISO-8859-1, producing bad results.

It is the server that is the problem. I know. :-|


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

On 2011-05-10 15:00, kgroneman wrote:
> Hey mojtabanow,
>
>> تجربة اللغة العربيه
>> اوبن سوز
>> the two line upove written in arabic just to test the retrival of the
>> arabic characters
>
> Read and quoted and posting this reply from my NNTP client to show that
> it works.
>

Because your post contains:

Content-Type: text/plain; charset=utf-8


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

On 2011-05-10 22:20, Carlos E. R. wrote:
> On 2011-05-10 15:00, kgroneman wrote:
>> Hey mojtabanow,
>>
>>> تجربة اللغة العربيه
>>> اوبن سوز
>>> the two line upove written in arabic just to test the retrival of the
>>> arabic characters
>>
>> Read and quoted and posting this reply from my NNTP client to show that
>> it works.
>>
>
> Because your post contains:
>
> Content-Type: text/plain; charset=utf-8

And because it has been sent via NNTP, without passing through the gateway
or the web forum. It has been handled all the way by the NNTP server, so
that a reply (mine) to an NNTP post (yours) is sent in the same encoding.


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)