Thread: Please help on gibberish character problem

  1. #1
    Join Date
    Jan 2016
    Location
    Manchester, UK
    Posts
    286

    Hello everyone,

    I need some help determining and solving an encoding problem with filenames in Simplified Chinese characters on my system.

    I got a .zip file whose name is in Simplified Chinese (SC) characters. It contains several documents whose names are in SC, too.

    After unzipping the .zip file, I got a folder whose name is shown in unreadable characters when it should be in SC characters. All the files in it have unreadable names, too. However, when I unzip the same .zip file in Windows, both the folder name and the names of the files in it are shown in correct SC characters.

    I am using KDE Plasma 5 on openSUSE Tumbleweed. The system currently uses an English locale as the preferred language, but the result is the same even when I set SC as the preferred language in Regional Settings. I use Dolphin and Ark for browsing and unzipping the files.

    Dolphin and Ark handle SC characters well when I zip and unzip within Linux. Sometimes they also handle archives from Windows correctly, but I have no idea what the difference is, because the files come from different colleagues.

    I want Dolphin and Ark to handle zip files copied from Windows correctly, but I know little about encoding/decoding. Could anyone please help me understand the problem or provide me with a solution?

    In case it helps, I uploaded an example here: http://s000.tinyupload.com/index.php...08858236658015

    Regards,
    cnzhx
    openSUSE Tumbleweed (usually the latest snapshot) w/ KDE Plasma 5

  2. #2
    Join Date
    Feb 2016
    Location
    Berlin
    Posts
    357

    Your zip file is flagged as malicious by Chrome and cannot be downloaded. (*NOT GOOD*)

  3. #3
    Join Date
    Feb 2016
    Location
    Berlin
    Posts
    357

    At a guess, Linux expects UTF-8. Your colleague may be on an old Windows version (XP?) or similar, and the filenames are encoded as GB2312 or some other legacy encoding. UTF-8 is the saviour from the mess of **** encodings and was designed by Ken Thompson and Rob Pike of Unix/Plan 9/Go fame.
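
    To illustrate the guess above, here is a minimal Python sketch of how such gibberish can arise. It assumes the name was stored as GBK bytes and then read with the CP437 code page (a common fallback for zip entries that are not marked as UTF-8). Both encodings and the sample filename are assumptions for illustration only; the real archive may use something else.

```python
# Sketch only: "gbk" and "cp437" are assumed encodings, not facts about the archive.

def garble(name, stored_encoding="gbk", assumed_encoding="cp437"):
    """Encode a name the way the writer did, then decode it the way the reader does."""
    return name.encode(stored_encoding).decode(assumed_encoding)


def recover(garbled, stored_encoding="gbk", assumed_encoding="cp437"):
    """Undo the wrong decoding: get the raw bytes back, then decode them correctly."""
    return garbled.encode(assumed_encoding).decode(stored_encoding)


if __name__ == "__main__":
    original = "测试文档.txt"  # hypothetical filename
    broken = garble(original)
    print("what the reader shows:", broken)
    print("after re-decoding:    ", recover(broken))
```

    The round trip only works because CP437 maps every byte to some character, so no information is lost by the wrong decoding. If the assumed encoding is wrong, the recovered text will still be gibberish or raise an error.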

  4. #4
    Join Date
    Jan 2016
    Location
    Manchester, UK
    Posts
    286

    Quote: Originally posted by ndc33
    Your zip file is flagged as malicious by Chrome and cannot be downloaded. (*NOT GOOD*)
    Sorry for the inconvenience. I thought the first Google result for "file hosting" could be trusted. Here is the share link from my Google Drive: https://drive.google.com/file/d/0BzH...ew?usp=sharing

    Quote: Originally posted by ndc33
    At a guess, Linux expects UTF-8. Your colleague may be on an old Windows version (XP?) or similar, and the filenames are encoded as GB2312 or some other legacy encoding. UTF-8 is the saviour from the mess of **** encodings and was designed by Ken Thompson and Rob Pike of Unix/Plan 9/Go fame.
    Your guess sounds very reasonable. Do you happen to have a solution for this? If not, should it be fixed in Ark, in Dolphin, or somewhere else? I could not figure out the mechanism by which applications deal with character encoding.

  5. #5
    Join Date
    Feb 2016
    Location
    Berlin
    Posts
    357

    Encoding comes up all the time in e.g. Python programming, but I don't know it in an OS context. I don't think it is so much an openSUSE problem as a file/encoding, Linux/encoding, or KDE/encoding problem, so widen your search. E.g. a quick Google search for "open zip file linux encoding" gives: http://unix.stackexchange.com/questi...h-hebrew-names (which actually recommends some bash commands).

  6. #6
    Join Date
    Jan 2016
    Location
    Manchester, UK
    Posts
    286

    Quote: Originally posted by ndc33
    Encoding comes up all the time in e.g. Python programming, but I don't know it in an OS context. I don't think it is so much an openSUSE problem as a file/encoding, Linux/encoding, or KDE/encoding problem, so widen your search. E.g. a quick Google search for "open zip file linux encoding" gives: http://unix.stackexchange.com/questi...h-hebrew-names (which actually recommends some bash commands).
    Thanks a lot, ndc. It looks like I was digging in the wrong direction. Based on the explanation in the post you linked, it is not a problem within one system or application, but a problem between systems / applications. If I understand correctly, without knowing the original encoding, it is not possible to decode the filenames easily.

    I tried several known encodings, such as gb2312, gb18030, big5, and gbk, without success. I guess the most convenient way is to unzip them in Windows.
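
    For anyone who wants to repeat the trial-and-error, here is a rough Python sketch using the standard zipfile module. It relies on the fact that zipfile decodes entry names as CP437 when the UTF-8 flag (general-purpose bit 11, 0x800) is not set, so the raw bytes can be recovered and tried against candidate encodings. The helper names, the candidate list, and the archive path are hypothetical, and none of the candidates is guaranteed to be right.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: entry name is stored as UTF-8


def guess_name(info, encoding):
    """Return the entry name re-decoded with a candidate encoding."""
    if info.flag_bits & UTF8_FLAG:
        # The name was marked as UTF-8, so zipfile already decoded it properly.
        return info.filename
    # Otherwise zipfile decoded the raw bytes as CP437; reverse that first.
    raw = info.filename.encode("cp437")
    return raw.decode(encoding)


def list_candidates(path, encodings=("gbk", "gb18030", "big5", "utf-8")):
    """Print every entry name under each candidate encoding for visual inspection."""
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            for enc in encodings:
                try:
                    print(f"{enc:>8}: {guess_name(info, enc)}")
                except UnicodeError:
                    print(f"{enc:>8}: <not decodable>")


if __name__ == "__main__":
    list_candidates("example.zip")  # hypothetical path to the problem archive
```

    If one candidate prints readable names, that encoding is probably the one the archive was created with. If none do, the names may have been damaged before zipping, in which case this approach will not help.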
