Just wondering what happens to UTF-8 characters in posts.

Euro sign €
a umlaut ä
e grave è
c cedilla ç
n tilde ñ

Admins feel free to delete this post anytime.

Edit: looks like the forum software does the right thing with European characters. I should try Asian characters sometime.

The current time from BBC’s Chinese site:

2008年07月17日 格林尼治标准时间03:18北京时间 11:18发表

> Edit: looks like the forum software does the right thing with European
> characters.

unfortunately, those do not make it to all forum members in a readable
format, yet…

because the nntp gateway just doesn’t know how to send’em…
a KNOWN problem that i am NOT :slight_smile: complaining about…just informing
ken_yapa that not all will see those characters the same way he saw them
in his test… here they looked more like &#2_4_1;_ without the _s
(which i put in so the web software wouldn’t read them out correctly
again :wink:


DenverD (Linux Counter 282315)
A Texan in Denmark

On Thu, 17 Jul 2008 15:49:31 GMT
DenverD <spam.trap@Texan.dk> wrote:

> > Edit: looks like the forum software does the right thing with
> > European characters.
>
> unfortunately, those do not make it to all forum members in a
> readable format, yet…
>
> because the nntp gateway just doesn’t know how to send’em…
> a KNOWN problem that i am NOT :slight_smile: complaining about…just informing
> ken_yapa that not all will see those characters the same way he saw
> them in his test… here they looked more like &#2_4_1;_ without
> the _s (which i put in so the web software wouldn’t read them out
> correctly again :wink:
>
Hi
If the text was a cut/paste then that’s why we see the said funny
characters.


Cheers Malcolm °¿° (Linux Counter #276890)
SLED 10 SP2 i586 Kernel 2.6.16.60-0.23-default
up 19:31, 2 users, load average: 1.20, 1.43, 0.88
GPU GeForce Go 6600 TE/6200 TE Version: 173.14.09

Actually the European diacritics were done with the Compose key and the Chinese characters with cut and paste, because I haven’t got CJK input yet. Too bad the NNTP gateway doesn’t do the right thing for you. But seeing as the characters are apparently encoded correctly as numeric character references, isn’t it an issue with the NNTP reading software? It’s quite a complex issue, though, and the text could still be displayed wrongly if the server assumes Latin-1 and the client UTF-8, or vice versa. You’d have to look at the charset parameter of the Content-Type header in the NNTP article headers, plus the exact entities sent for each diacritic.
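To make that check concrete, here is a minimal Python sketch. The article headers and body are made up for illustration (I have not seen what the gateway really sends), so treat `raw_article` and its charset value as assumptions. It reads the declared charset, decodes a numeric character reference, and shows what a Latin-1/UTF-8 mix-up does to ñ.

```python
import html
from email import message_from_string

# Hypothetical article as a newsreader might receive it; the real gateway's
# headers and body may well differ.
raw_article = (
    "From: someone@example.invalid\n"
    "Subject: charset test\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "n tilde &#241;\n"
)

msg = message_from_string(raw_article)

# The charset lives in a parameter of Content-Type; the parser lowercases it.
declared = msg.get_content_charset()

# A client that understands numeric character references turns &#241; into ñ.
body = msg.get_payload()
decoded = html.unescape(body)

# Going the other way: escape anything non-ASCII as a numeric reference.
escaped = "ñ".encode("ascii", "xmlcharrefreplace")

# Charset mismatch: UTF-8 bytes read as Latin-1 give two wrong characters.
misread = "ñ".encode("utf-8").decode("latin-1")

print(declared)
print(decoded.strip())
print(escaped)
print(misread)
```

If the newsreader shows the raw `&#...;` text, the reference itself arrived intact and the client simply isn't decoding it; if it shows two odd characters per diacritic, a charset mismatch is the more likely culprit.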

Not that I’m expecting a flood of users of diacritics, though it would be nice for European users, let alone CJK users, to be able to spell their names correctly. But I’m curious about how international the forum software is. Some of the software out there is in the dark ages. You often see smart quotes like these: “”, which are also non-ASCII characters, mess up web pages.

BTW, this is what happens when UTF-8 is not properly converted to ISO8859-1 when rendered by the forum software:

openSUSE Weekly News, Issue 31 - openSUSE Forums (http://tinyurl.com/5w6u9w)

That ’ is actually the right quote character ’. I don’t have any idea where it got mangled. And <deity> knows what you NNTP subscribers are seeing.
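For anyone curious how a right quote ends up mangled like that, here is a small Python sketch of the usual mechanism. It assumes the UTF-8 bytes were read as Windows-1252 or Latin-1 somewhere along the way; I don't know which (if either) the forum software actually did.

```python
# The right single quotation mark, U+2019.
right_quote = "\u2019"

# In UTF-8 it takes three bytes.
utf8_bytes = right_quote.encode("utf-8")

# Read as Windows-1252, each byte becomes its own visible character.
as_cp1252 = utf8_bytes.decode("cp1252")

# Read as ISO-8859-1, the first byte is still an a with a circumflex, but
# the other two land in the invisible C1 control range.
as_latin1 = utf8_bytes.decode("latin-1")

# If nothing was lost, the damage can be reversed by retracing the steps.
repaired = as_cp1252.encode("cp1252").decode("utf-8")

print(utf8_bytes)
print(as_cp1252)
print(repr(as_latin1))
print(repaired == right_quote)
```

Either way the first visible character is an a with a circumflex, so one quote turning into â plus junk is a fairly reliable sign of UTF-8 being read with a single-byte charset.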

On Fri, 18 Jul 2008 07:36:04 GMT
ken yap <ken_yap@no-mx.forums.opensuse.org> wrote:

>
> BTW, this is what happens when UTF-8 is not properly converted to
> ISO8859-1 when rendered by the forum software:
>
> ‘openSUSE Weekly News, Issue 31 - openSUSE Forums’
> (http://tinyurl.com/5w6u9w)
>
> That ’ is actually the right quote character ’. I don’t have any
> idea where it got mangled. And <deity> knows what you NNTP
> subscribers are seeing.
>
>
Hi
I see an a with a ^ above it (â) rather than what I assume should be a ’.


Cheers Malcolm °¿° (Linux Counter #276890)
SLED 10 SP2 i586 Kernel 2.6.16.60-0.23-default
up 1 day 15:03, 1 user, load average: 0.26, 0.27, 0.25
GPU GeForce Go 6600 TE/6200 TE Version: 173.14.09

> And <deity> knows

:slight_smile: yep, sometimes kinda ugly…


DenverD (Linux Counter 282315)
A Texan in Denmark