Please help with a gibberish character problem

Hello everyone,

I need some help determining and solving an encoding problem with filenames in Simplified Chinese characters on my system.

I received a .zip file whose name is in Simplified Chinese (SC) characters. It contains several documents whose names are also in SC.

After unzipping the .zip file, I got a folder whose name is shown in unreadable characters instead of SC characters. All the files in it have unreadable names, too. However, when I unzip the same .zip file in Windows, both the folder name and the names of the files in it are shown in correct SC characters.

I am using KDE Plasma 5 on openSUSE Tumbleweed. Currently, the system uses an English locale as the preferred language, but the result is the same even when I set SC as the preferred language in Regional Settings. I use Dolphin and Ark for browsing and unzipping the files.

Dolphin and Ark handle SC characters well when I perform zip/unzip operations entirely in Linux. Sometimes they handle zip files from Windows correctly as well, but I have no idea what the difference is, because the files come from different colleagues.

I want Dolphin and Ark to handle zip files copied from Windows correctly, but I know little about encoding/decoding. Could anyone please help me understand the problem or provide me with a solution?

In case it helps, I uploaded an example here: http://s000.tinyupload.com/index.php?file_id=00535108858236658015

Regards,
cnzhx

Your zip file is marked as malicious in Chrome and cannot be downloaded. (NOT GOOD)

At a guess, Linux expects UTF-8. Your colleague maybe has an old Windows version (XP?) or something similar, and the filenames are encoded as GB2312 or some other legacy encoding. UTF-8 is the saviour from the mess of **** encodings and was designed by Ken Thompson and Rob Pike of Unix/Plan 9/Go fame.
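To illustrate the guess above, here is a minimal Python sketch. The sample filename and the choice of GBK and cp437 are assumptions for illustration only; nothing here is confirmed about the uploaded archive. It shows that GBK-encoded bytes are generally not valid UTF-8, that they turn into gibberish when read with a legacy code page such as cp437, and that the gibberish is reversible once the right encoding is known.

```python
# Hypothetical filename, used only to demonstrate the effect.
name = "中文文件.txt"

# Bytes as a tool using a legacy Chinese code page (assumed: GBK) might store them.
raw = name.encode("gbk")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A reader that falls back to cp437 never fails, it just shows the wrong characters.
garbled = raw.decode("cp437")

# cp437 maps every byte to a character, so the original bytes can be recovered.
restored = garbled.encode("cp437").decode("gbk")

print("valid UTF-8:", is_valid_utf8(raw))
print("garbled    :", garbled)
print("restored   :", restored)
```

The round trip only works because no bytes were lost along the way; if a tool replaced undecodable bytes with placeholder characters, the original name could not be recovered this way.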

Sorry for the inconvenience. I thought the first result from a Google search for “file hosting” could be trusted. Here is the share link from my Google Drive: https://drive.google.com/file/d/0BzHP4DvFrhxbY2xxNmlJNHl2ajg/view?usp=sharing

Your guess sounds very reasonable. Do you happen to have a solution for this? If not, should it be fixed in Ark or Dolphin, or somewhere else? I could not figure out the mechanism by which applications deal with character encoding.

Encoding comes up all the time in e.g. Python programming, but I don't know it in an OS context. I don't think it is so much an openSUSE problem as a file/encoding, Linux/encoding, KDE/encoding problem, so spread your search. E.g. a quick Google search for “open zip file linux encoding” gives: http://unix.stackexchange.com/questions/251969/how-can-i-correctly-decompress-a-zip-archive-of-files-with-hebrew-names (which actually recommends some bash commands).

Thanks a lot, ndc. It looks like I was digging in the wrong direction. Based on the explanation in the post you linked, it is not a problem within one system or application, but a problem between systems / applications. If I understand this correctly, without knowing the original encoding, it is not possible to decode the filenames easily.

I tried several known encodings, such as gb2312, gb18030, big5, and gbk, without success. I guess the most convenient way is to unzip them in Windows.
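For anyone who wants to repeat the trial-and-error more systematically, here is a rough Python sketch that previews each stored filename under several candidate encodings before extracting anything. It relies on the fact that Python's zipfile module decodes names as cp437 when the archive does not flag them as UTF-8, so the raw bytes can be recovered by encoding back to cp437. The helper names `redecode` and `preview_names`, the candidate list, and the path `example.zip` are all made up for illustration, and this is not guaranteed to find the right encoding for any particular archive.

```python
import zipfile

# Candidate encodings to try; this list is an assumption, extend it as needed.
CANDIDATES = ["gb18030", "gbk", "big5", "utf-8"]


def redecode(stored_name, encoding):
    """Undo the cp437 fallback and decode the raw bytes with `encoding`.

    Returns None if the name cannot be converted with that encoding.
    """
    try:
        return stored_name.encode("cp437").decode(encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


def preview_names(zip_path, encodings=CANDIDATES):
    """Print every entry name as it would look under each candidate encoding."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Bit 11 (0x800) of the flags marks a name already stored as UTF-8.
            if info.flag_bits & 0x800:
                print("flagged as UTF-8:", info.filename)
                continue
            print("stored as:", repr(info.filename))
            for enc in encodings:
                print(f"  {enc:>8}: {redecode(info.filename, enc)!r}")


if __name__ == "__main__":
    import os

    if os.path.exists("example.zip"):  # hypothetical path
        preview_names("example.zip")
```

If one of the candidates prints readable names for every entry, that is probably the encoding the archive was created with. If none does, the names may have been damaged before or during zipping, and unzipping in Windows remains the simplest route.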