Thread: Assistive technology

  1. #1
    Join Date
    Aug 2009
    Location
    iowa usa
    Posts
    72

    Assistive technology

    Assistive technology help

    I need to convert a Word file (.doc) to a PDF file (.pdf),

    then put the PDF through optical character recognition (OCR) to get a text file (.txt), keeping the pages and paragraphs.

    Then I need to turn the text file (.txt) into digital audio files, split by chapter/page/paragraph.

    OS: SUSE 11.1
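
    For the last step, the text has to be cut into pages and paragraphs before each piece can be handed to a speech program. Below is a minimal Python sketch of that split. It assumes pages are separated by form-feed characters (which, as far as I know, pdftotext writes by default) and paragraphs by blank lines. The function names and the `book_p001_para01` naming scheme are made up for illustration.

```python
import re


def split_pages(text):
    """Split text into pages on form-feed characters (assumed page marker)."""
    pages = text.split("\f")
    # A form feed at the very end leaves an empty last page; drop it.
    if pages and not pages[-1].strip():
        pages.pop()
    return pages


def split_paragraphs(page):
    """Split one page into paragraphs on blank lines."""
    parts = re.split(r"\n\s*\n", page)
    return [p.strip() for p in parts if p.strip()]


def name_chunks(text, stem="book"):
    """Return (name, paragraph) pairs, one per paragraph, numbered by page."""
    chunks = []
    for page_no, page in enumerate(split_pages(text), start=1):
        for para_no, para in enumerate(split_paragraphs(page), start=1):
            name = f"{stem}_p{page_no:03d}_para{para_no:02d}"
            chunks.append((name, para))
    return chunks
```

    Each name could then be used for one audio file, so the recordings stay in page and paragraph order.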

  2. #2
    Join Date
    Aug 2009
    Location
    iowa usa
    Posts
    72

    Re: Assistive technology

    I got the .doc to convert to PDF.
    So how do you get an OCR program to turn the PDF into text?

  3. #3
    Join Date
    Jun 2008
    Location
    West Yorkshire, UK
    Posts
    3,415

    Re: Assistive technology

    OpenOffice > Export to PDF, then Kooka (or another scanner program using SANE), then GOCR, then OpenOffice to tidy up the text file. I'm not sure how you will do the last bit; it depends on what is required.

  4. #4
    Join Date
    Aug 2009
    Location
    iowa usa
    Posts
    72

    Re: Assistive technology

    Yes, the Word file is text!
    But some of the words in the files are in a picture box!
    That is where I am having trouble!

  5. #5
    Will Honea NNTP User

    Re: Assistive technology

    kc0hwa wrote:

    >
    > Yes, the Word file is text!
    > But some of the words in the files are in a picture box!
    > That is where I am having trouble!


    AH! I was wondering why you didn't just save the PDF as text directly, but
    the presence of text as part of an image makes for a whole new ball game if
    you want to extract that text. The only solutions I can think of all involve
    running OCR on the source as an image, not as a text doc, so this will be an
    interesting answer!

    --
    Will Honea

  6. #6
    Join Date
    Aug 2009
    Location
    iowa usa
    Posts
    72

    Re: Assistive technology

    1.) How do you grid like Kurzweil in Linux?

    2.) I cannot bring a PDF into the OCR program (Kooka)!

    3.) While looking this up, I saw that IRS is now used instead of OCR! Does anyone know anything about this?

  7. #7
    Join Date
    Jun 2008
    Location
    West Yorkshire, UK
    Posts
    3,415

    Re: Assistive technology

    re 2. AFAIK you can only scan single pages in Linux, not bulk. So you either need to print out the PDF and scan each page separately or convert the PDF to single images.

    In practice, to extract the text from a PDF I would never go this route; I would simply extract the text directly as a text file and any images as separate images and then reconstitute them.

    There is now the option in OpenOffice of adding the Sun PDF extension which allows you to load a PDF in Draw and create an ODT file directly from it.

    So one reason why you may be having difficulties is that there is no longer any reason for most people to take the route you are taking.
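
    Both routes mentioned here (extracting the text directly, or converting the PDF to single page images) can be sketched with the poppler command-line tools `pdftotext` and `pdftoppm`, assuming they are installed. The Python wrapper functions below and any file names are hypothetical; they only build the command lines, so check the options against the man pages for your version.

```python
import subprocess


def pdftotext_cmd(pdf_path, txt_path):
    """Command to extract the text layer of a PDF into a text file."""
    # -layout asks pdftotext to keep the physical layout of the page.
    return ["pdftotext", "-layout", pdf_path, txt_path]


def pdftoppm_cmd(pdf_path, prefix, dpi=300):
    """Command to render each PDF page as a PPM image named from prefix."""
    # -r sets the resolution in DPI; higher values usually help OCR.
    return ["pdftoppm", "-r", str(dpi), pdf_path, prefix]


def run(cmd):
    """Run a command and raise an error if the tool reports failure."""
    subprocess.run(cmd, check=True)
```

    For example, `run(pdftotext_cmd("book.pdf", "book.txt"))` would cover the direct route, and `run(pdftoppm_cmd("book.pdf", "page"))` would produce one image per page for the OCR route. Text that sits inside a picture will only come out through the second route.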

  8. #8
    Join Date
    Aug 2009
    Location
    iowa usa
    Posts
    72

    Re: Assistive technology

    How do I convert a PDF to a picture (e.g. JPG, PNG, etc.)?
    I'm copying whatever text is in the documents to a .txt file, and just taking the pictures in the file and moving them to a PDF.
    Now, PDF to text!
    So: PDF --> (picture) --> OCR or IRS --> txt

  9. #9
    Join Date
    Jun 2008
    Location
    West Yorkshire, UK
    Posts
    3,415

    Re: Assistive technology

    The simplest way of creating an image from a PDF is to open it in a viewer and take a screenshot. How sharp the image will be for OCR depends on your screen resolution. Alternatively, print it out and then scan it as an image and not as text.
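
    Once there is one image per page, the OCR step can be scripted. This is a rough Python sketch, assuming GOCR is installed, that `gocr -i FILE` prints the recognised text to standard output, and that the page images are PPM files named like `page-1.ppm`, `page-2.ppm`. The function names and the file naming are assumptions for illustration only.

```python
import subprocess
from pathlib import Path


def page_images(folder, prefix="page"):
    """Return page image paths sorted by the page number in the file name."""
    def page_no(path):
        # "page-10" -> 10, so page 10 sorts after page 2.
        return int(path.stem.rsplit("-", 1)[1])
    return sorted(Path(folder).glob(f"{prefix}-*.ppm"), key=page_no)


def gocr_cmd(image_path):
    """Command to OCR one image with GOCR, text going to standard output."""
    return ["gocr", "-i", str(image_path)]


def run_capture(cmd):
    """Run a command and return what it printed."""
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout


def ocr_pages(folder, prefix="page", runner=run_capture):
    """OCR every page image and join the pages with form feeds."""
    texts = [runner(gocr_cmd(img)) for img in page_images(folder, prefix)]
    return "\f".join(texts)
```

    Joining the pages with form feeds keeps the page boundaries in the text file, which matters if the text is later split by page.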
