Expanding the source for Help Centre


I have a large number of manuals and tutorials that have just piled up in a folder.
Now, I would really like to make these documents available in ‘KDE Help Centre’, or any similar application, so that I can search all of them at once.

Could anyone suggest which application to use, and how?


  1. Do you have a Linux-compatible flatbed scanner?
  2. Do you have the time to spend?
  3. Do you know how to use something like GIMP for images?
  4. Do you know how to use something like ocrad to convert scans to text (Optical Character Recognition)?
  5. Do you have some knowledge of making simple web pages?

If you answer yes to these, making a browsable document database is possible. You can, for example, use OpenOffice to read in the text and format it as HTML, then add the images you separated from the scans and links to the other pages as you go.

If you also know some scripting, you can automate some of the tasks for speed.

What format are the files in? PDFs can be opened with OpenOffice. I think the KDE Help Centre can import chm files. Can you check and identify what can be imported (I use Gnome)?

Cheers Malcolm °¿° (Linux Counter #276890)
SUSE Linux Enterprise Desktop 11 (x86_64) Kernel
up 15 days 14:52, 4 users, load average: 1.12, 1.04, 0.74
GPU GeForce 8600 GTS Silent - CUDA Driver Version: 195.36.15

I’m sorry, wrong frame of mind. I was thinking of printed text, not digital files. KDE Help Centre does not have an import function, it only searches, so adding documents to it will take some doing.

Thanks for your replies.

I will check what is available to import. However, if I could create a database from those files, build a search index and use a different kind of application, I might do that. Do you know of any alternatives?

The files are mostly PDF, and I have about 1GB of them, so converting them all would be tiresome work. Still, it is an option, as I am also motivated to learn how to do this in general. What about doxygen or docbook, what do those do?
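
For anyone wondering what "create a search index" could look like in practice, here is a minimal sketch in Python, not a finished tool. It assumes the PDFs have already been converted to plain .txt files (for example with pdftotext from poppler-utils); the function names build_index and search are made up for illustration. It builds a simple inverted index (word -> set of files) and returns the files that contain every word of a query.

```python
import re
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(text_dir):
    """Map each word to the set of .txt files (under text_dir) containing it.

    Assumption: the documents were converted to plain text beforehand.
    """
    index = defaultdict(set)
    for path in sorted(Path(text_dir).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for word in tokenize(text):
            index[word].add(str(path))
    return index


def search(index, query):
    """Return the sorted list of files that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

Something like `search(build_index("/path/to/text"), "kernel module")` would then list the matching files. With 1GB of documents the index lives entirely in memory and is rebuilt on every run, so a dedicated desktop search tool would likely scale better; this only shows the idea.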


Did some reading. KDE Help Centre uses a directory structure in /usr/share/doc/kde4/html, and basically you add directories there for new manuals. A file in the folder, <appname>.docbook, defines the loose structure of your new help, outlining the chapters, indexes, etc. I looked at one and it appears to be XML. Anyway, when you use build in KDE Help Centre (which takes a lot of time), the docbook file is read and the KDE-specific help is generated.

doxygen is a structured documentation layout system: it takes a series of input files (config, text, images, headers, footers) and outputs formats such as XML or LaTeX, which can then be turned into PDF or PostScript.

I am starting out of the gate the same as you: I have about 30GB of currently unsearchable documentation in hundreds of formats, paired with a wall of printed resource material. So it would be nice to finally have a good working solution. I think the fastest way might be to use something like OpenOffice to create a set of HTML category and subcategory indexes, with the entries being the names of your PDF documents. But this doesn’t satisfy the search requirement.
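
The category index idea above could also be scripted instead of built by hand. This is only a rough sketch in Python under a few assumptions: the PDFs sit in subfolders of one root folder, each subfolder counts as a category, and the function name build_html_index and the "Uncategorised" label are made up for illustration. It returns one HTML page with a heading per folder and a link per PDF.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_html_index(root):
    """Return an HTML page listing every .pdf under root, grouped by folder."""
    root = Path(root)
    by_category = {}
    for pdf in sorted(root.rglob("*.pdf")):
        rel = pdf.relative_to(root)
        category = rel.parent.as_posix()
        if category == ".":
            # PDFs directly in the root folder have no category folder.
            category = "Uncategorised"
        by_category.setdefault(category, []).append(rel)

    lines = ["<html><body>", "<h1>Documentation index</h1>"]
    for category in sorted(by_category):
        lines.append(f"<h2>{html.escape(category)}</h2>")
        lines.append("<ul>")
        for rel in by_category[category]:
            href = quote(rel.as_posix())  # escape spaces etc. in the link
            title = html.escape(rel.stem)  # file name without ".pdf"
            lines.append(f'<li><a href="{href}">{title}</a></li>')
        lines.append("</ul>")
    lines.append("</body></html>")
    return "\n".join(lines)
```

Writing the result to index.html inside the root folder keeps the relative links working in a browser. Note that the "*.pdf" pattern is case-sensitive on Linux, so files named .PDF would be skipped, and as said this gives browsing only, not searching.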

Thanks techwiz, that’s a starting point. And good luck with your task.