
Thread: what character set is being used here (pic)

  1. #1
    Join Date
    Jun 2008
    Location
    Brisbane, Australia
    Posts
    207


    My father has a tonne of Spanish music, and many of the file names contain non-ASCII characters. He uses Windows XP Pro, and I am not sure whether it uses UTF-8 by default.

    In any case, I am trying to better understand UTF-8 and character sets in general, because I am trying to back up all of his music to a FreeNAS box using rsync. The file names get mangled, and I have run out of ideas. I've tried a bunch of things that I won't attempt to list right now, but nothing has worked. Instead of coming back as "this is a song.mp3", a file comes back as "this is a s<#033>ng.mp3" if the letter o in 'song' were a special (non-ASCII) character.

    So, the other day I was working on a bunch of his files, and the filenames were mangled. I don't know what charset (character set) the file names are encoded in. One of the file names includes a character which I added deliberately in openSUSE to test whether it would render fine in Konsole.

    Ideas, comments? (PS: I was not sure where to post this.)
    NVIDIA! Listen to your customers! We want Free drivers.
    Petition #1. Petition #2. Use your VOICE! Sign the petitions!

  2. #2
    Join Date
    Jun 2008
    Location
    West Yorkshire, UK
    Posts
    3,448


    Firstly, XP does not use UTF-8 by default; if it is a Spanish version of XP, it probably uses a particular encoding which is not directly compatible with most English-language encodings or even Latin-1.

    That would explain the filename problem.

    If you can identify the encoding and extract the text into a text file, OpenOffice may be able to convert it to UTF-8.

    The only way to deal with the filenames is to change them all to ASCII, which would lose any Spanish characters in them. (You can re-encode them in UTF-8 once you have transferred the files.)
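
    As a rough illustration of the ASCII route described above, here is a minimal Python sketch. The helper name `to_ascii` is hypothetical, and the approach (decompose accented letters, then drop whatever is not ASCII) is only one possible choice, not necessarily what the poster had in mind.

```python
import unicodedata


def to_ascii(name: str) -> str:
    """Hypothetical helper: reduce a file name to plain ASCII."""
    # NFKD splits an accented letter into its base letter plus a
    # separate combining mark (for example, o followed by U+0301).
    decomposed = unicodedata.normalize("NFKD", name)
    # Encoding with errors="ignore" silently drops every non-ASCII
    # code point, which removes the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("canción.mp3"))  # prints cancion.mp3
```

    Note that this is lossy: characters with no ASCII base letter (such as "¿") disappear entirely, and the accents cannot be recovered afterwards. A pure-ASCII name is already valid UTF-8, so a later re-encode would not bring anything back.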

  3. #3
    Join Date
    Jun 2008
    Location
    Brisbane, Australia
    Posts
    207


    "if it is a Spanish version of XP"
    It is an ordinary en_US version of Windows XP.

    "If you can identify the encoding"
    This is what I am trying to get help with. How can I detect the encoding of the filenames?

    "The only way to deal with the filenames is to change them all to ASCII"
    How does one accomplish this? And what would each special letter become? Underscores, hashes with digits (e.g. #233), or question marks?

    "(You can re-encode them in UTF-8 once you have transferred the files.)"
    After converting all the filenames to ASCII, I think re-encoding them as UTF-8 would not benefit me, would it?

    thank you
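
    On the detection question: there is no fully reliable way to detect an encoding from the bytes alone, but one can decode the raw bytes of a name with a few candidate encodings and see which result reads correctly. The sketch below assumes Python; the function name `try_decodings` and the candidate list are illustrative choices, not something taken from this thread.

```python
# Candidate encodings to try; this list is an assumption for illustration.
# "latin-1" is Python's name for ISO-8859-1.
CANDIDATES = ("utf-8", "cp1252", "latin-1", "cp850")


def try_decodings(raw: bytes, candidates=CANDIDATES) -> dict:
    """Return {encoding: decoded text} for each candidate that decodes cleanly."""
    results = {}
    for enc in candidates:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding, so rule it out.
            pass
    return results


# A made-up example name containing the single byte 0xF3.
raw_name = b"canci\xf3n.mp3"
for enc, text in try_decodings(raw_name).items():
    print(f"{enc:10} {text}")
```

    If "utf-8" is missing from the result, the name is not valid UTF-8. Among the remaining candidates, a human still has to pick the one that looks right. On Linux, passing a bytes path such as `os.listdir(b".")` returns the names as raw bytes, which is what this function expects.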

  4. #4


    As far as I know, Windows uses UTF-16 by default.

  5. #5
    josemx NNTP User


    The charset is iso8859-1. Try mounting with the option iocharset=iso8859-1, and the filenames should be displayed correctly.
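
    If the names really are iso8859-1, an alternative to the mount option is to convert the names themselves to UTF-8. The following is a minimal sketch under that assumption; `plan_renames` is a hypothetical helper that only computes old/new name pairs and does not touch the disk.

```python
def plan_renames(names, source_encoding="latin-1"):
    """Hypothetical helper: pair each non-UTF-8 byte name with a UTF-8 version.

    `names` is an iterable of file names as raw bytes. The default
    source encoding ("latin-1", i.e. ISO-8859-1) is an assumption.
    """
    plan = []
    for raw in names:
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8 (or plain ASCII): leave it alone
        except UnicodeDecodeError:
            pass
        # Reinterpret the bytes in the assumed encoding, then re-encode.
        plan.append((raw, raw.decode(source_encoding).encode("utf-8")))
    return plan


# Made-up example names: one Latin-1, one ASCII, one already UTF-8.
sample = [b"canci\xf3n.mp3", b"plain.mp3", "niño.mp3".encode("utf-8")]
for old, new in plan_renames(sample):
    print(old, "->", new)
```

    Reviewing the plan before renaming anything is the point of keeping this as a dry run. If the source encoding guess is wrong, the converted names will be wrong too.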
