Can you recommend some Open Source software to archive 60 years of data? The data are surgical patients’ files, maybe 2-3 pages per patient. The brief is to turn them into paperless records and destroy the original paper.
In my simplistic way I was thinking of scanning each set of 2-3 pages into openoffice writer and exporting that document to a pdf file. That would be the record. Then I would make an openoffice database with
Name
Date of Birth
2-3 other details that doctors need
filename of the pdf file
That should do it.
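For what it's worth, the index described above is small enough to sketch in a few lines. This is only a rough illustration that swaps the OpenOffice database for SQLite (via Python's standard sqlite3 module); the table name, column names and helper functions are made up for the example, and the "2-3 other details" are collapsed into a single notes column.

```python
import sqlite3

def open_index(path=":memory:"):
    """Open (or create) the index database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS patient_record (
               id            INTEGER PRIMARY KEY,
               name          TEXT NOT NULL,
               date_of_birth TEXT NOT NULL,        -- ISO format, e.g. 1931-04-02
               notes         TEXT,                 -- stand-in for the other details
               pdf_filename  TEXT NOT NULL UNIQUE  -- one PDF per record
           )"""
    )
    return conn

def add_record(conn, name, date_of_birth, pdf_filename, notes=""):
    """Insert one index row; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO patient_record (name, date_of_birth, notes, pdf_filename) "
            "VALUES (?, ?, ?, ?)",
            (name, date_of_birth, notes, pdf_filename),
        )

def find_by_name(conn, fragment):
    """Return (name, date_of_birth, pdf_filename) rows whose name contains fragment."""
    cur = conn.execute(
        "SELECT name, date_of_birth, pdf_filename FROM patient_record "
        "WHERE name LIKE ? ORDER BY name, date_of_birth",
        (f"%{fragment}%",),
    )
    return cur.fetchall()
```

Pass a file path to open_index() to keep the index on disk; the default in-memory database is only there so the sketch can be tried without creating any files.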
Is there an existing paperless-office / archiving package in the Open Source world? I see no reason to invent something if a better one already exists.
GNUmed looks pretty good since it has a patient details database and an included document manager. So it likely can be used as an archive, which is what I want.
What I am doing has nothing to do with record keeping, but here is something I found useful.
At work I discovered a copier with a “scan and mail PDF to me” feature. I have been using it to scan old documents I have collected over the years. It’s great for getting rid of all that clutter.
Here’s what I would suggest: Get a desktop scanner supported by Linux. Write a little GUI or web app to interface to CLI scanning and PDF conversion programs. Or somehow script GIMP to do this. I imagine you would have SANE feeding its output into some image-to-PDF converter. The GUI would also prompt for the info you talked about. You could write it to a MySQL DB or just to plain text files as key-value pairs. You might also want to embed some or all of that info in the PDF document dictionary so that it stays with the document.
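To make the "plain text files as key-value pairs" option concrete, here is a minimal sketch in Python. It assumes one sidecar .txt file per PDF with one key=value pair per line; the file naming, the field names and both helper functions are invented for the example, and it does not touch the scanning or the PDF document dictionary at all.

```python
from pathlib import Path

def write_sidecar(pdf_path, metadata):
    """Write metadata next to the PDF as key=value lines and return the sidecar path."""
    sidecar = Path(pdf_path).with_suffix(".txt")
    lines = [f"{key}={value}" for key, value in metadata.items()]
    sidecar.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sidecar

def read_sidecar(sidecar_path):
    """Parse a sidecar file back into a dict, skipping blank lines."""
    result = {}
    for line in Path(sidecar_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result
```

The appeal of this layout is that the metadata stays readable with nothing more than a text editor, which matters if the database software is long gone by the time someone needs a record.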
Then let the staff loose on this setup. Don’t underestimate the amount of time required to do the manual page turning and positioning on the flatbed. One nice thing about the copier is that for long documents I can use the feeder.
Estimating the time is crucial. Just for example: suppose 20000 records @ two pages to scan per record @ 5 seconds head-travel per scan. That’s 200000 seconds with a fast scanner, or roughly 56 hours without allowing for drawing a breath, just for scan head movement time. I think a fast scanner with sheet feeder is essential.
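The estimate above is easy to redo with your own numbers. A tiny sketch, using the figures from this post as the example inputs (the function name is just for illustration):

```python
def scan_hours(records, pages_per_record, seconds_per_page):
    """Hours of pure scan-head time, ignoring page handling and data entry."""
    total_seconds = records * pages_per_record * seconds_per_page
    return total_seconds / 3600

hours = scan_hours(20_000, 2, 5)
print(f"{hours:.1f} hours")  # prints "55.6 hours"
```

Plugging in a realistic per-page handling time for the flatbed, rather than head travel alone, will push the total up considerably.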
You might also want to take into account the state of the paper. If some of the documents are 60 years old, they might not take kindly to being handled by the feeder and could cause misfeeds or jams, slowing down scanning. A couple of my documents wouldn’t feed completely, so I had to scan a couple of pages on the flatbed and join up the PDFs. So a quality feeder may make good sense. Experiments are needed.
I have used HP Director for Windows which allows you to save multiple page documents to PDF directly. I haven’t found a direct Linux alternative but I haven’t looked recently. However, HP Director was distributed with a lot of HP printer/scanners and should run under WINE.
It’s actually people born after 1920 who exist on paper. Databases were used for the last 10 years and a mixture before that, so it’s perhaps better expressed as 60-80 years of historical data.
It’s a good idea to look at the Windows software that comes with the scanner to see if it has HP Director-style software. Nearly all of them nowadays can scan directly to PDF, in Windows at least. Thanks for the thought. I’ll check it out.
And another thing I thought of: before Australia went metric, imperial paper sizes like foolscap were used. If you have documents on that size, make sure the scanner can handle it.
If the volume of documents is large enough, it may make sense to lease one of those new-fangled copier/scanners like the one in my office for the duration of the digitisation effort. Connect up the scanner to the LAN and have it mail the PDFs to a special account on the Linux server, then write some scripts to postprocess the mail and prompt data entry workers to enter the metadata needed.
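As a rough idea of what the mail postprocessing could look like, here is a sketch using only Python's standard email package. It assumes the copier sends each scan as an ordinary application/pdf attachment and that the raw message bytes have already been fetched from the special account; fetching the mail and prompting for metadata are left out. The function name and the fallback filename are invented for the example.

```python
from email import message_from_bytes, policy
from pathlib import Path

def save_pdf_attachments(raw_message, out_dir):
    """Write every PDF attachment of one raw mail message into out_dir.

    Returns the list of paths written. Non-PDF attachments are ignored.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    msg = message_from_bytes(raw_message, policy=policy.default)
    saved = []
    for n, part in enumerate(msg.iter_attachments(), start=1):
        if part.get_content_type() != "application/pdf":
            continue
        # Keep only the final path component so a crafted filename in the
        # mail cannot write outside out_dir.
        name = Path(part.get_filename() or f"scan_{n}.pdf").name
        target = out / name
        target.write_bytes(part.get_payload(decode=True))
        saved.append(target)
    return saved
```

A real script would also need to cope with two scans arriving under the same filename, which this sketch silently overwrites.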
>
> ken_yap;1971496 Wrote:
>> And another thing I thought of: before Australia went metric, imperial
>> paper sizes like foolscap were used. If you have documents on that size,
>> make sure the scanner can handle it.
>>
>> If the volume of documents is large enough, it may make sense to lease
>> one of those new-fangled copier/scanners like the one in my office for
>> the duration of the digitisation effort. Connect up the scanner to the
>> LAN and have it mail the PDFs to a special account on the Linux server,
>> then write some scripts to postprocess the mail and prompt data entry
>> workers to enter the metadata needed.
> It just gets easier and easier :eek:
If you can use it, some of the commercial outfits like Kinko’s can scan
documents onto cd/dvd. Some franchises have really nice equipment and the
charge is pretty reasonable - considering the alternatives. I don’t know
about OCR and such - we elected to have several years worth of sermons and
records scanned to DVD just for backup so images were adequate for our
purposes - but even images could be OCR’d and scripts would save a lot of
finger time.
Thanks Will.
I’ll be the subcontractor, so of course I won’t be outsourcing anything further that I don’t have to, and I would look first to in-house solutions. But your idea is good if we get overwhelmed or the finish horizon looks too far away.
Swerdna;
Before original copies are destroyed, your client needs to be aware of the
need to continually update the storage of these records as technology
advances. About 15 years ago there was an article in the “Communications of
the ACM” that claimed that the US census data for 1960 had been preserved on
magnetic tape (state of the art for the time) but of course by the 90’s there
was no longer any equipment that could read this data, albeit the tapes were
assumed to be well preserved. The cost of re-manufacturing the tape drives
needed would not be cost effective, thus the only access to this data was
from what had been abstracted and published on paper.
P. V.
“We’re all in this together, I’m pulling for you.” Red Green
> On Fri April 10 2009 04:56 am, swerdna wrote:
>
>>
>> Hi
>>
<snip>
>>
One additional thought, in the late 70’s early 80’s we were saving data on 8"
floppies and removable disk packs. I wonder if you can find the equipment to
read these outside a museum.
Fascinating – from paper to tape to paper is all that really survived.
Yes, I remember the big floppies and the even bigger earlier floppies.
In this case, interest in the PDF records will steadily decline as time goes by. The older folk will pass on first, and there will be no need for their records to exist beyond 7 years after their demise.
And the task is not infinite. These paper records are finite and are not being generated any more. Just the historical paper needs to be preserved for a while, digitally.
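If a date of death ever gets recorded alongside the index, the seven-year rule mentioned above could be checked mechanically. A small sketch, assuming the retention period runs from the date of death and that a record may go on the anniversary itself; both are my assumptions, not anything from the brief, and the function names are invented.

```python
from datetime import date

def purge_date(date_of_death, retention_years=7):
    """First date on which the record may be removed (assumed rule, see above)."""
    target_year = date_of_death.year + retention_years
    try:
        return date_of_death.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return date_of_death.replace(year=target_year, day=28)

def may_purge(date_of_death, today, retention_years=7):
    """True once the retention period has fully elapsed."""
    return today >= purge_date(date_of_death, retention_years)
```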
Just wanted to let you know that HP has a product called Digital Sender that can scan and email/ftp the multi-page PDF output to another machine/location. They talk about Windows only but I had used it in one of my projects. It sits on the network and we can easily configure it (it has a small keyboard and display). I never opened their CD containing Windows drivers and other software.