UTF-8 not rendered by kate for some files

I have a script that creates several files, each beginning with a Zero Width No-Break Space (EF BB BF, the BOM) and containing Hebrew text in UTF-8. When I open them in kate, some render correctly and some don’t. However, LibreOffice correctly renders a file that kate garbled.

As an example, in a file called katz.out, kate renders “H: הרב שְׁמוּאֵל יְהוּדָה בַּר צְבִי יַעֲקֹב הַכֹּהֵן” as “H: ה׹ב שְׁמו֌אֵל יְהו֌ד֞ה ב֌ַך שְב֮י יַעֲקֹב הַכֹ֌הֵן”.

Any idea what’s causing this?

You might narrow the problem down further by creating a test file in Kate and another in a different application, saving each, and then opening it in the other application. That would show whether the problem lies in writing the text or only in reading it. Then submit a bug at https://bugzilla.opensuse.org


If I do save as, LibreOffice correctly renders the copy and kate does not. I’ve submitted https://bugzilla.opensuse.org/show_bug.cgi?id=1176694

Can you try to switch to UTF-8 output in Kate? I do not use Kate so I am not sure if it is possible.

It appears that if a file begins with EF BB BF and is not valid UTF-8, kate assumes Latin 9 but drops the first three octets instead of displaying them as [ï](https://en.wikipedia.org/wiki/Ï)[»](https://en.wikipedia.org/wiki/Guillemet)[¿](https://en.wikipedia.org/wiki/Inverted_question_mark).
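
For anyone who wants to check their own files, here is a rough Python sketch that tests both conditions: whether the file starts with EF BB BF, and whether the rest is valid UTF-8. The function name `diagnose` and the keys of the returned dictionary are made up for this post. It says nothing about what kate does internally; it only reports where a strict UTF-8 decoder would first give up.

```python
import codecs


def diagnose(data: bytes) -> dict:
    """Report whether data starts with the UTF-8 BOM and where strict UTF-8 decoding first fails."""
    bom = codecs.BOM_UTF8  # the three octets EF BB BF
    has_bom = data.startswith(bom)
    skip = len(bom) if has_bom else 0
    body = data[skip:]
    try:
        body.decode("utf-8")  # strict decoding: raises on the first invalid sequence
        first_bad_offset = None
    except UnicodeDecodeError as exc:
        # exc.start is relative to body, so add back the skipped BOM length
        first_bad_offset = exc.start + skip
    return {
        "has_bom": has_bom,
        "valid_utf8": first_bad_offset is None,
        "first_bad_offset": first_bad_offset,
    }
```

Calling it as `diagnose(open("katz.out", "rb").read())` on one of the garbled files should show whether it has the BOM but fails validation, and `first_bad_offset` gives the byte position to look at in a hex dump.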