Wrong charset in Apache

Hi there.

I’m having trouble with the charset on my OpenSuSE machine. For a long time I used OpenSuSE 9 to monitor my network with the Just For Fun Network Management System, or JFFNMS.

My hard disk got corrupted and I had to rebuild the system, only now I’m using OpenSuSE 11. It works very well and it’s even using less RAM than it used to.

But I’m getting these charset errors on the pages:

Below is the page as it normally loads, with the browser using the Western (ISO-8859-1) charset.

Now, the page using the UTF-8 charset:

Here’s the source code of the page:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<!--
/*
 * Copyright (C) <2002-2005> Javier Szyszlican <javier@szysz.com>
 * This file is part of JFFNMS.
 * JFFNMS is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 * JFFNMS is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 * You should have received a copy of the GNU General Public License
 * along with JFFNMS; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */
-->
<html>
<head>
<title>
JFFNMS
</title>
<meta HTTP-EQUIV='Pragma' CONTENT='no-cache'>
<meta HTTP-EQUIV='Content-Type' CONTENT='text/html; charset=iso-8859-1'>
<meta HTTP-EQUIV='Content-language' CONTENT='en'>
<meta HTTP-EQUIV='Content-Style-Type' CONTENT='text/css'>
<meta HTTP-EQUIV='Content-Script-Type' CONTENT='text/javascript'>

<link rel='shortcut icon' type='image/x-icon' href='/images/jffnms.ico'>
<link rel='icon' type='image/x-icon' href='/images/jffnms.ico'>
<link rel='stylesheet' href='/default.css'>
<link rel='alternate' type='application/rss+xml' title='JFFNMS - 0.8.4 Events Feed' href='/events.php?view_type=rss'>
</head>
<frameset rows='20,*' frameborder='no' framespacing='0'>
<frame id='controls' name='controls' noresize scrolling='no' src='controls.php?'>
<frameset id='menu_frame' cols='*,0' frameborder='no' framespacing='1'>
<frame name='work' noresize scrolling='yes' src='blank.php?'>
<frame name='menu' scrolling='no' src='admin/menu.php?'>

</frameset>
</frameset>
</html>

JFFNMS is configured as a VirtualHost, so I have added "AddDefaultCharset iso-8859-1" to /etc/apache2/vhosts.d/jffnms.conf.

Since it uses PHP, I also uncommented the line
default_charset = "iso-8859-1"

Any help would be appreciated.

I see this:

<meta HTTP-EQUIV='Content-Type' CONTENT='text/html; charset=iso-8859-1'>

The page declares that it sends all characters in latin1 encoding, but in reality it sends UTF-8. The browser obeys and faithfully tries to display latin1.
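To make that mismatch concrete, here is a minimal Python sketch. The sample word is only an illustration and is not taken from JFFNMS. It encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is roughly what a browser does when the declaration says latin1 but the bytes are UTF-8. It also shows the opposite mismatch.

```python
# Sample text with accented characters (hypothetical, not from JFFNMS).
text = "Configuração"

# What the server actually sends: UTF-8 bytes (two bytes per accented letter).
utf8_bytes = text.encode("utf-8")

# What is displayed if those bytes are interpreted as latin1.
seen_as_latin1 = utf8_bytes.decode("iso-8859-1")

# The opposite mismatch: latin1 bytes interpreted as UTF-8.
latin1_bytes = text.encode("iso-8859-1")
seen_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(seen_as_latin1)  # each accented letter turns into two characters
print(seen_as_utf8)    # each accented letter turns into a replacement mark
```

If the broken page shows pairs of odd characters where one accented letter should be, the bytes are most likely UTF-8 being read as latin1. If it shows replacement marks or question marks, it is most likely the other way around.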

Two solutions: either you change the HTTP-EQUIV line and declare that you will be sending utf-8, or you change the other end (PHP) and make it send iso-8859-1 (a.k.a. latin1).

I have not the slightest idea how the latter is done but I see a possible cause for your problem: SuSE 9 used the iso-8859-1 encoding by default but this has changed since then. Now the default is utf-8. You can try to switch the whole machine back to latin1 using yast2. The setting is found under ‘Languages’.

I can give you no guarantee that it will not break other things.

Hi again!

Voodoo, thx for your answer.

Well, I would like to know where I can change the global charset for OpenSuSE.

In Yast2 -> Languages, Primary Language is already Portuguese (Brazilian).
Under DETAILS:
Locale Settings for Root: CTYPE ONLY
Use UTF-8 Encoding: not checked
Detailed Locale Setting: pt_BR

I really don’t know where this UTF-8 encoding is coming from.

I think you should go with UTF-8 unless you have lots of existing text files in Latin-1 encoding. Even then it may be easier to convert those. UTF-8 is the way to go in the future.
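As a rough idea of what converting such a file involves, here is a small Python sketch. The function name is made up for illustration, and it assumes the file really is Latin-1, because this cannot be detected reliably from the bytes alone.

```python
from pathlib import Path


def convert_latin1_to_utf8(path: Path) -> None:
    """Rewrite one text file from ISO-8859-1 to UTF-8 in place (hypothetical helper)."""
    raw = path.read_bytes()
    # ISO-8859-1 maps every byte to a character, so this decode never fails.
    # That also means it will silently mangle a file that is already UTF-8,
    # so only run it on files known to be Latin-1.
    text = raw.decode("iso-8859-1")
    path.write_bytes(text.encode("utf-8"))
```

Keeping a backup copy before running something like this over a whole directory is a good idea, since running it twice on the same file would corrupt the accented characters.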

You say that your content is generated with PHP. This could help: the PHP charset/encoding FAQ by Kore Nordmann. If PHP is fetching content from a database, make sure that this DB is also delivering strings in the right encoding.

Hi!

JFFNMS is fetching the content from a MySQL DB. I’ve already looked into the database and it’s ISO-8859-1 encoded.

I’ve looked into the PHP code, found where it says which charset to use, and changed it to UTF-8, but it still loads the wrong charset, and the encoding the browser picks is still ISO-8859-1.

I’m almost giving up…

Changing the charset to UTF-8 (for the connection, database, data) in MySQL does nothing useful in your case. In fact I have the reverse situation for some web apps: the default charset is Latin-1 because it’s an older MySQL version, and yet I store UTF-8 data fine in it. What it does affect are things like collation functions in MySQL, which I don’t use.

If you wanted to go to UTF-8 throughout, you would have to convert all the text data in MySQL from Latin-1 to UTF-8. It can be done, and you can find tutorials on how to do it if you search.

However, you may prefer not to bite into that problem now and just tell the web browser that the data is in Latin-1. This can be done on a per-page basis using the Content-Type header in HTTP. Your page should output Content-Type: text/html; charset=iso-8859-1. This can be done using the header() function in PHP. Because headers have to be output before the body, you must invoke header() before any HTML is output by the PHP code, typically at the top of the script. For example, in my PHP code I have:

header('Content-Type: text/html; charset=utf-8');

as the very first line in the PHP file. In this example, I am using utf-8 in my web app. You would put iso-8859-1.

You can use the web browser to look at the page info to see that it has output the correct Content-Type declaration.
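When a page carries both an HTTP Content-Type header and a meta tag, the charset in the HTTP header takes precedence. The Python sketch below is a simplified illustration of that rule, not what a browser really runs; the function names are made up, and real browsers also look at other signals such as a byte order mark.

```python
import re
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html: str) -> Optional[str]:
    """Very rough search for charset=... inside a <meta> tag."""
    match = re.search(r"""<meta[^>]*charset=["']?([\w-]+)""", html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(content_type: str, html: str) -> Optional[str]:
    """Simplified rule: the HTTP header wins, the meta tag is the fallback."""
    return header_charset(content_type) or meta_charset(html)


page = "<meta HTTP-EQUIV='Content-Type' CONTENT='text/html; charset=iso-8859-1'>"
print(effective_charset("text/html; charset=utf-8", page))  # header present
print(effective_charset("text/html", page))                 # meta tag only
```

So if the page info in the browser shows a different charset from the one in the meta tag, it is worth checking which charset the HTTP header itself carries.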

This is independent of the OS’s or Apache’s default charset, since your data is stored in MySQL. However, if you are outputting any text from files or even from string literals, then those must also be in iso-8859-1. You can do this by changing the charset in vim when you edit the file. The OS will happily store it unchanged; files are just sequences of octets in Linux.
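To check what a given file or string actually contains, a rough byte-level test can help. The Python sketch below uses a made-up helper name. It can only say that bytes are pure ASCII, valid UTF-8, or not valid UTF-8; it cannot prove that something is Latin-1, since any byte sequence decodes as Latin-1.

```python
def classify_bytes(data: bytes) -> str:
    """Roughly classify text bytes as 'ascii', 'utf-8' or 'not utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"  # identical in ASCII, Latin-1 and UTF-8
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so probably a single-byte encoding such as Latin-1.
        return "not utf-8"
```

Files that come back as "ascii" are safe either way, because they contain no accented characters that could be displayed wrongly.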

Hi Ken

Well, it worked, but not the way you described. It worked when I put UTF-8, just like you did, forcing the browser to read the page in that encoding.

I went through all the PHP files with a script and changed UTF-8 to ISO-8859-1 in every one of them. That doesn’t work.

I really don’t know why, but it seems to me that the OS, or Apache, is forcing the encoding to UTF-8, and I just haven’t found where the setting is to change it.

I’ll keep searching, and if I find where, I’ll post it here.

Thank you very much, Ken and Voodoo, for your support. I really appreciate it.