Python, unicode, utf8

Hi all,

I have a big mess in my head right now, so I thought I’d ask and hope someone can help with this.

Here is the situation:

I receive a plain-text file in which every line has a layout similar to this (without the quotes):

“123456 article category description of article 789,01”

The file is encoded in IBM850 (code page 850, go figure) and includes accents and other special characters.
What I’ve done is convert it to UTF-8 and to CSV format, so I can import it into a Python/Qt/PyQt app.

After the conversion, the lines of the file look like this:

"123456","article category","description of article","789,01"
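
For reference, the encoding half of that conversion can be sketched roughly as below. This is only an illustration, assuming Python 3; the function name transcode and the file paths are made up for the example, and splitting each line into CSV fields is left out because the exact column layout isn't shown here.

```python
def transcode(src_path, dst_path, src_encoding="cp850", dst_encoding="utf-8"):
    """Copy a text file, re-encoding it from src_encoding to dst_encoding.

    "cp850" is the name of Python's codec for IBM code page 850.
    The function name and arguments are hypothetical, for illustration only.
    """
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            # Each line is already decoded text here; writing re-encodes it.
            dst.write(line)
```

Calling transcode("articles.txt", "articles_utf8.txt") (both file names are placeholders) would produce a UTF-8 copy of the original file.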

The problem appears when I import it into the Qt app and show the information in a Qt widget, a QListBox for example.
Whenever a line contains a special character that is not part of ASCII, an ASCII encoding error pops up.

Hence the question:
What would be the best, easiest, most standard and Pythonic way to handle these encoding issues?

Hi,

Take a look at this URL:

Python Unicode Tutorial

They seem to have dealt with a problem similar to yours.
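
As a rough sketch of the general idea (not taken from that tutorial, and assuming Python 3): decode the bytes into text once, when reading the file, and work only with text values after that. The function name read_rows is hypothetical.

```python
import csv


def read_rows(path, encoding="utf-8"):
    """Return the rows of a CSV file as lists of decoded text values.

    The function name is made up for this example; the encoding is
    assumed to be UTF-8, as in the converted file described above.
    """
    # Opening in text mode with an explicit encoding decodes the bytes
    # up front, so no implicit ASCII conversion happens later.
    # newline="" is what the csv module expects for file objects.
    with open(path, "r", encoding=encoding, newline="") as handle:
        return [row for row in csv.reader(handle)]
```

Every field returned this way is already a text string, so accented characters should survive when the values are handed on to a widget, as long as nothing in between converts them back to bytes with the ASCII codec.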

Regards.

Hey, thanks a lot, carboncore. I think I looked everywhere except reportlab, which I’m also using btw.

I’ll go through that and hope I find some new info there.