Thread: How to show a preview of a PDF page in a HTML document?

  1. #1
    Join Date
    Jan 2009
    Location
    Switzerland
    Posts
    1,529

    How to show a preview of a PDF page in a HTML document?

    I need to create a web page showing a preview of a single PDF page. The PDF page is scanned from a document and contains image data.

    I know that I could use <iframe ...> and let the browser launch a PDF viewer, but I would prefer to convert the PDF to PNG (or JPG?) and let the browser display the preview.

    All this will be done with a cgi script written in bash on 11.3 (if that matters).

    Question: how would you do it? What are the pros and cons of the different approaches?
    Technology is 'stuff that doesn't work yet.' -- Bran Ferren

  2. #2
    Join Date
    Feb 2010
    Location
    Germany
    Posts
    4,654

    Re: How to show a preview of a PDF page in a HTML document?

    vodoo wrote:

    >
    > All this will be done with a cgi script written in bash on 11.3 (if
    > that matters).

    You can use convert (ImageMagick):
    Code:
    convert my.pdf my.png
    For a multi-page PDF it produces one PNG per page (my-0.png, my-1.png, ...).
    For more fine-grained control, see "man convert".
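
    Since only a single-page preview is wanted, the conversion can be limited to the first page. A minimal sketch, assuming ImageMagick's convert is installed; the function name make_preview, the 150 dpi density and the default width of 595 pixels are illustrative choices, not something taken from this thread:

```shell
# Render only the first page of a PDF as a PNG preview.
# Usage: make_preview INPUT.pdf OUTPUT.png [WIDTH_IN_PIXELS]
make_preview() {
    # "[0]" selects the first page (ImageMagick counts pages from 0).
    # -density sets the rasterisation resolution and must precede the input.
    # "-resize 595x" scales to that width and keeps the aspect ratio.
    convert -density 150 "${1}[0]" -resize "${3:-595}x" "$2"
}
```

    Called as "make_preview my.pdf my.png 300", this should write a single my.png scaled to a width of 300 pixels.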

    >
    > Question: how would you do it? What are the pros and cons of the
    > different approaches?
    >

    I think doing this with the cgi script is a good approach. It makes the
    preview independent of whether the user has a PDF viewer installed.

    --
    PC: oS 11.3 64 bit | Intel Core2 Quad Q8300@2.50GHz | KDE 4.6.3 | GeForce
    9600 GT | 4GB Ram
    Eee PC 1201n: oS 11.4 64 bit | Intel Atom 330@1.60GHz | KDE 4.6.0 | nVidia
    ION | 3GB Ram

  3. #3

    Re: How to show a preview of a PDF page in a HTML document?

    vodoo wrote:
    > I need to create a web page showing a preview of a single PDF page. The
    > PDF page is scanned from a document and contains image data.


    If what you're showing is basically a scanned image, why not show the
    scanned image? What advantage do you get from showing [a further
    conversion of] a PDF page?

  4. #4

    Re: How to show a preview of a PDF page in a HTML document?

    martin_helm wrote:
    > I think this is a good approach with the cgi script. It makes showing it
    > independent from the user having a pdf viewer installed.

    @martin_helm: Thank you for the feedback. A PNG gives me better control over how the preview is integrated into the web page. Your opinion was very valuable to me. My question was in fact about design, not about the conversion process. More on this later.

    @djh-novell: I am reluctant to show the scanned PDF as is for several reasons. I have no control over which PDF viewer is used on the client side. Since this is a preview, I also want to control the size of the image. Finally, the PNG is about half the file size of the equivalent PDF, which saves a lot of bandwidth.
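
    Controlling the display size from the CGI script could look roughly like this. A minimal sketch, assuming the PNG was rendered beforehand and is reachable under a URL; emit_preview_page and /previews/my.png are hypothetical names, and both arguments are assumed to be trusted values that need no HTML escaping:

```shell
# Print a minimal CGI response: header, blank line, then the HTML body.
# Usage: emit_preview_page IMAGE_URL WIDTH_IN_PIXELS
emit_preview_page() {
    # $1 = URL of the pre-rendered PNG (hypothetical), $2 = display width
    printf 'Content-Type: text/html\n\n'
    printf '<!DOCTYPE html>\n<html><body>\n'
    printf '<img src="%s" width="%s" alt="PDF page preview">\n' "$1" "$2"
    printf '</body></html>\n'
}

# In the CGI script itself the function would be called once, for example:
#   emit_preview_page /previews/my.png 300
```

    With a PDF in an <iframe>, the zoom level and layout would instead be up to whatever viewer the client happens to use.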

    As for the conversion process (this is more a report than a question, but feel free to comment):

    convert (which I have used before) uses gs (Ghostscript) to load the PDF. gs seems to have problems reading PDF scans from Canon copiers/scanners. The result is:

    Code:
       **** Warning:  Generation number out of 0..65535 range, assuming 0.
       **** Warning:  File has an invalid xref entry:  2.  Rebuilding xref table.
    Processing pages 1 through 1.
    Page 1
    
       **** This file had errors that were repaired or ignored.
       **** The file was produced by: 
       **** >>>> Canon iR3045                     <<<<
       **** Please notify the author of the software that produced this
       **** file that it does not conform to Adobe's published PDF
       **** specification.
    A web search suggests this may be a gs bug rather than a problem with the scan. It could also be a bug in pdftk, which I use to split multi-page scans into single pages; I have not investigated the problem. okular and evince read the scan without any problem. I can use pdftk to "repair" the scan, which makes the gs warning go away.
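
    The pdftk "repair" amounts to reading the file and writing it out again. A minimal sketch, assuming pdftk is installed; the function name repair_pdf and the file names are made up for illustration:

```shell
# Rewrite a PDF with pdftk so that its cross-reference table is rebuilt.
# Usage: repair_pdf BROKEN.pdf FIXED.pdf
repair_pdf() {
    # pdftk cannot write to the file it is reading, so refuse identical names.
    if [ "$1" = "$2" ]; then
        echo "repair_pdf: input and output must differ" >&2
        return 1
    fi
    pdftk "$1" output "$2"
}
```

    For example, "repair_pdf scan.pdf scan-fixed.pdf" leaves the original untouched and writes the rewritten copy next to it.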

    In any case, gs produces a black-and-white file with 1 bit per pixel, which is unusable for scanned images. Calling gs directly gives the same result:

    Code:
    gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pnggray -r300 -sOutputFile=my.png my.pdf
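
    The bit depth of a generated file can be checked without an image viewer, because PNG stores it at a fixed position in the IHDR header. A minimal sketch using only od and tr; the function name png_bit_depth is made up for illustration:

```shell
# Print the bit depth per sample of a PNG file.
# Layout: 8-byte signature, 4-byte chunk length, "IHDR", 4-byte width,
# 4-byte height, then the bit-depth byte at offset 24 (counting from 0).
# Usage: png_bit_depth FILE.png
png_bit_depth() {
    od -An -tu1 -j24 -N1 "$1" | tr -d '[:space:]'
}
```

    A result of 1 means one bit per sample; an 8-bit grayscale PNG reports 8.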
    A completely different result is achieved using xpdf:

    Code:
    pdftoppm -gray -png -scale-to-x 595 -scale-to-y 842 my.pdf my
    The conversion is slower, but there is no warning and the resulting image is of good quality. Bottom line: PDF-to-PNG conversion works and reduces the image file size compared to the original PDF scan.
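
    For use in a script, the pdftoppm call can be wrapped so that it always ends up with one predictably named file. A minimal sketch: -f and -l restrict rendering to page 1, and because the numeric suffix pdftoppm appends to the output prefix can differ between versions, the result is picked up with a glob. The function name render_first_page is made up, and the 595x842 size is simply copied from the command above:

```shell
# Render page 1 of a PDF to a grayscale PNG with a fixed output name.
# Usage: render_first_page INPUT.pdf OUTPUT.png
render_first_page() (
    # The ( ... ) body runs in a subshell, so tmp, f and status stay local.
    tmp=$(mktemp -d) || exit 1
    status=1
    if pdftoppm -gray -png -f 1 -l 1 -scale-to-x 595 -scale-to-y 842 \
            "$1" "$tmp/page"; then
        # Exactly one file is expected; its suffix (-1, -01, ...) may vary.
        for f in "$tmp"/page*.png; do
            if [ -e "$f" ]; then
                mv "$f" "$2" && status=0
                break
            fi
        done
    fi
    rm -rf "$tmp"
    exit "$status"
)
```

    The function returns non-zero if pdftoppm fails or produces no file, so a CGI script can fall back to an error page.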
