UTF-8

New posts are OK. But in the older ones, the special characters have now been replaced with question marks in the French forum (they were OK before). :frowning:

On Mon, 09 May 2011 22:36:01 +0000, please try again wrote:

> New posts are OK. But in the other ones, the special characters got
> replaced with question marks now in the French forum (were OK before).
> :frowning:

Yes, that is as expected. We didn’t convert the existing database to
UTF-8, just configured things so new posts would be UTF-8.

Converting the database would take many hours, and longer still to back it
up and restore it if something went wrong.
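
To illustrate why the older posts now show question marks, here is a minimal Python sketch. It assumes the old posts were stored as ISO-8859-1 (Latin-1) bytes, which this thread does not confirm, and the sample word is only an illustration.

```python
# Assumption: older posts were stored as ISO-8859-1 (Latin-1) bytes.
old_bytes = "problème".encode("latin-1")

# Read as UTF-8, the byte for the accented letter is invalid, so a lenient
# decoder substitutes U+FFFD, which many pages render as a question mark.
as_utf8 = old_bytes.decode("utf-8", errors="replace")

# Read with the encoding the bytes were written in, the text is intact.
as_latin1 = old_bytes.decode("latin-1")

print(as_utf8)
print(as_latin1)
```

The stored bytes are unchanged in both cases; only the encoding used to interpret them differs, which is why switching the display encoding brings the old posts back.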

Jim


Jim Henderson
openSUSE Forums Administrator
Forum Use Terms & Conditions at http://tinyurl.com/openSUSE-T-C

If you need to read old messages, you can temporarily change your encoding/language setting from the box in the lower right of any forum page.

Thank you. I got it.

Oh, thanks for the info. I didn’t notice that little list. :slight_smile:

This is weird. This post (problème drivers nvidia) was posted today, but I had to switch to the old encoding to be able to quote it in my answer (otherwise the page would not load). Now I have switched back to UTF-8 (see the thread’s name). In the old encoding, it would have printed only “probl”.

IMHO, you should convert the entire database to UTF-8. We keep switching between UTF-8 and the old encoding, often within the same thread, because the encoding used for display is the one originally used by the poster. There is no way to convince everyone to post in UTF-8, and in any case it is the database that should handle inputs in different encodings, not the poster. As an immediate (hopefully temporary) workaround, you could allow posters to change the encoding for reading only, so that they would not write new posts using the old encoding.
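
A per-post conversion of mixed content could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the legacy encoding is taken to be Latin-1, the function name `normalize_to_utf8` is made up, and nothing here reflects how the forum software actually stores or converts posts.

```python
def normalize_to_utf8(raw: bytes, legacy_encoding: str = "latin-1") -> bytes:
    """Re-encode one stored post as UTF-8 (hypothetical helper)."""
    try:
        # Bytes that already decode cleanly as UTF-8 are treated as UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Otherwise assume the post was written in the legacy encoding.
        text = raw.decode(legacy_encoding)
    return text.encode("utf-8")


posts = [
    "problème".encode("utf-8"),    # a new post, already UTF-8
    "problème".encode("latin-1"),  # an old post in the assumed legacy encoding
]
converted = [normalize_to_utf8(p) for p in posts]
```

The guess can go wrong for legacy text whose bytes happen to form valid UTF-8, so a real conversion would still need a backup and some spot checks.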

By the way, I don't see the option 'English (old encoding)'. So in order to read old encoded messages in French, I have to switch to 'French (old encoding)', which displays the posts correctly but also translates the user interface into French. However, being able to read and write posts in French doesn't mean that you want everything else to appear in French, nor that you can use it easily and efficiently.

That would be nice. But it would require a lot of effort.

After a while, you will mainly be looking at posts that are new enough to be in UTF-8. So it just requires a little patience.

It would not have required a ‘lot’ of effort if they had done that before switching the site over. Now that there are both types of characters in the database, I’m afraid it’s too late.

I’m not talking about old posts.

On Wed, 25 May 2011 20:06:02 +0000, please try again wrote:

> It would not have required a ‘lot’ of effort if they had done that before
> switching the site over. Now that there are both types of characters in
> the database, I’m afraid it’s too late.

It would have required the site be offline for longer, and the return was
determined not to be sufficient to warrant the downtime, given the
workaround that works for getting at the older posts and that things
eventually will get to a point where switching won’t be necessary.

Jim



On 2011-05-26 04:35, Jim Henderson wrote:
> On Wed, 25 May 2011 20:06:02 +0000, please try again wrote:
>
>> It would not have required a ‘lot’ of effort if they had done that before
>> switching the site over. Now that there are both types of characters in
>> the database, I’m afraid it’s too late.
>
> It would have required the site be offline for longer, and the return was
> determined not to be sufficient to warrant the downtime, given the
> workaround that works for getting at the older posts and that things
> eventually will get to a point where switching won’t be necessary.

Can it not be done in background?

However, he is talking of new posts not in UTF-8


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

On Thu, 26 May 2011 10:20:07 +0000, Carlos E. R. wrote:

> Can it not be done in background?

Apparently not. My understanding is that it requires the UI to be taken
offline; otherwise indexing can get messed up.

> However, he is talking of new posts not in UTF-8

That’s up to the individual user - I am pretty certain we reset the
default to English UTF-8, but if someone has changed it back from UTF-8
and then posted messages in the non-UTF-8 codepage, this would happen.

Jim



Some tips from my wife:

One thing that might help for new posts is setting the accept-charset attribute on the form element used to post messages. If you add something like:

<form action="stuff.php" class="foo" ... accept-charset="UTF-8"> ... </form>

then a well-behaved browser should post the form contents in UTF-8, regardless of the encoding that the user has chosen for displaying the page.
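
As a rough illustration of what that attribute changes, the Python sketch below uses `urllib.parse.urlencode` to mimic how the same form field is percent-encoded under each encoding. The field name `message` is made up, and real browsers may differ in the details.

```python
from urllib.parse import urlencode

# Hypothetical form field holding the text of a new post.
form = {"message": "problème"}

# What would be sent if the form contents are encoded as UTF-8.
body_utf8 = urlencode(form, encoding="utf-8")

# What would be sent if the form follows an old Latin-1 display encoding.
body_latin1 = urlencode(form, encoding="latin-1")

print(body_utf8)
print(body_latin1)
```

The accented letter travels as two bytes in the first case and one byte in the second, so the server has to know which encoding was used, or always receive UTF-8.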

Another trick on the end-user side is using the browser itself to display the page in an encoding other than the one specified by the page. In Firefox, in the View menu, under “Character Encoding”, you can switch between UTF-8, Western, etc. So if you arrive on a page with garbage characters in a message, you can quickly switch the encoding to read it, without changing your preferred display encoding in the forum, and without reloading the page.

On 2011-05-26 22:20, Jim Henderson wrote:
> On Thu, 26 May 2011 10:20:07 +0000, Carlos E. R. wrote:
>
>> Can it not be done in background?
>
> Apparently not, my understanding is it requires the UI be taken offline,
> otherwise indexing can get messed up.

That’s very unfortunate.

>> However, he is talking of new posts not in UTF-8
>
> That’s up to the individual user - I am pretty certain we reset the
> default to English UTF-8, but if someone has changed it back from UTF-8
> and then posted messages in the non-UTF-8 codepage, this would happen.

That’s the problem “please try again” was referring to, if I understood him
correctly.


Cheers / Saludos,

Carlos E. R.
(from 11.2 x86_64 “Emerald” at Telcontar)

On Thu, 26 May 2011 23:36:02 +0000, please try again wrote:

> Some tips from my wife:
>
> One thing that might help for new posts is setting the accept-charset
> attribute on the form element used to post messages. If you add
> something like:
>
>
> Code:
> --------------------
> <form action="stuff.php" class="foo" ... accept-charset="UTF-8"> ...
> </form>
> --------------------
>
>
> then a well-behaved browser should post the form contents in UTF-8,
> regardless of the encoding that the user has chosen for displaying the
> page.
>
> Another trick on the end-user side is using the browser itself to
> display the page in an encoding other than the one specified by the
> page. In Firefox, in the View menu, under “Character Encoding”, you can
> switch between UTF-8, Western, etc. So if you arrive on a page with
> garbage characters in a message, you can quickly switch the encoding to
> read it, without changing your preferred display encoding in the forum,
> and without reloading the page.

Thanks for the tip, I’ll pass that along - not sure if we’ll implement
it, because that becomes a customization to the code that would then need
to be maintained, which someone would have to decide to do. The benefit
might well be worth it, though, at least in the short term.

Jim

