Hi, I am Rupesh from India. I have 2400 PDF files that I converted to DjVu files using the pdf2djvu utility. All the files were converted without any warnings, but I want to check whether each original PDF file and its converted DjVu file have the same number of pages. I only want to compare page counts and ignore everything else, such as quality.
I converted the PDF files to DjVu in order to save space. I am going to keep both sets of files and use the DjVu files, but if I find an error in one of them I will read the corresponding PDF file.
I have obtained the page count of all the PDF files using the pdfinfo utility and stored it in a text file. If I can obtain the page count of all the DjVu files and store that in a text file too, I can compare the two text files using the diff utility.
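The comparison step described above could be sketched roughly as follows. This is only an illustration under assumptions: both text files are assumed to hold one tab-separated "filename, page count" pair per line, and normalize_counts and compare_counts are hypothetical helper names, not part of any of the tools mentioned.

```shell
#!/bin/sh
# Hypothetical helper: read "name.ext<TAB>pages" lines from file $1 and
# print "name<TAB>pages", sorted, so the PDF and DjVu lists line up.
# Assumption: the lists are tab-separated, one file per line.
normalize_counts() {
    awk -F'\t' -v OFS='\t' '{ sub(/\.(pdf|djvu)$/, "", $1); print }' "$1" | sort
}

# Hypothetical helper: diff the two normalized lists.
# Returns 0 when every page count matches, 1 when diff reports differences.
compare_counts() {
    pdf_tmp=$(mktemp) && djvu_tmp=$(mktemp) || return 2
    normalize_counts "$1" > "$pdf_tmp"
    normalize_counts "$2" > "$djvu_tmp"
    diff "$pdf_tmp" "$djvu_tmp"
    status=$?
    rm -f "$pdf_tmp" "$djvu_tmp"
    return "$status"
}

# Example call (file names are placeholders):
#   compare_counts pdf_pages.txt djvu_pages.txt
```

Stripping the extension before sorting is what lets diff treat "book.pdf" and "book.djvu" as the same entry, so only files whose page counts differ show up in the output.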
The command for counting the pages of a DjVu file is djvused, and I am able to get the page count with it.
When I issue the command
djvused -e n filename.djvu
I get a single line of output containing the page count.
I tried saving the output of the above command to a text file through redirection, and it succeeded:
djvused -e n filename.djvu > output.txt
I want to obtain number of pages of every file and store the output in a text file in the following pattern
In order to achieve the above I have created a simple shell script which I am showing below
cd /run/media/root/Others/temp/converted\ djvu/ttd/Home/Download/
for f in *.djvu
do
    echo "file name: $f"
    djvused -e n $f > output.txt
done
When I run the above script I get the file names of all the files, but I get a page count for only a few of them, and even then with errors. Some of the output is shown below:
DJVUSED --- DjVuLibre-3.5.25
Simple DjVu file manipulation program

Usage: djvused [options] djvufile
Executes scripting commands on djvufile.
Script command come either from a script file (option -f),
from the command line (option -e), or from stdin (default).

Options are
  -v -- verbose
  -f <scriptfile> -- take commands from a file
  -e <script> -- take commands from the command line
  -s -- save after execution
  -u -- produces utf8 instead of escaping non ascii chars
  -n -- do not save anything

Commands
--------
The following commands can be separated by newlines or semicolons.
Comment lines start with '#'. Commands usually operate on pages and files
specified by the "select" command. All pages and files are initially selected.
A single page must be selected before executing commands marked with a period.
Commands marked with an underline do not use the selection

  ls -- list all pages/files
  n -- list pages count
  dump -- shows IFF structure
  size -- prints page width and height in html friendly way
  select -- selects the entire document
  select <id> -- selects a single page/file by name or page number
  select-shared-ant -- selects the shared annotations file
  create-shared-ant -- creates and select the shared annotations file
  showsel -- displays currently selected pages/files
. print-ant -- prints annotations
. print-merged-ant -- prints annotations including the shared annotations
. print-meta -- prints file metadatas (a subset of the annotations
  print-txt -- prints hidden text using a lisp syntax
  print-pure-txt -- print hidden text without coordinates
_ print-outline -- print outline (bookmarks)
. print-xmp -- print xmp annotations
  output-ant -- dumps ant as a valid cmdfile
  output-txt -- dumps text as a valid cmdfile
  output-all -- dumps ant and text as a valid cmdfile
. set-ant <antfile>] -- copies <antfile> into the annotation chunk
. set-meta <metafile>] -- copies <metafile> into the metadata annotation tag
. set-txt <txtfile>] -- copies <txtfile> into the hidden text chunk
. set-xmp <xmpfile>] -- copies <xmpfile> into the xmp metadata annotation tag
_ set-outline <bmfile>] -- sets outline (bootmarks)
_ set-thumbnails <sz>] -- generates all thumbnails with given size
  remove-ant -- removes annotations
  remove-meta -- removes metadatas without changing other annotations
  remove-txt -- removes hidden text
_ remove-outline -- removes outline (bookmarks)
. remove-xmp -- removes xmp metadata from annotation chunk
_ remove-thumbnails -- removes all thumbnails
. set-page-title <title> -- sets an alternate page title
. save-page <name> -- saves selected page/file as is
. save-page-with <name> -- saves selected page/file, inserting all included files
_ save-bundled <name> -- saves as bundled document under fname
_ save-indirect <name> -- saves as indirect document under fname
_ save -- saves in-place
_ help -- prints this message

Interactive example:
--------------------
Type
% djvused -v file.djvu
and play with the commands above

Command line example:
---------------------
Save all text and annotation chunks as a djvused script with
% djvused file.djvu -e output-all > file.dsed
Then edit the script with any text editor.
Finally restore the modified text and annotation chunks with
% djvused file.djvu -f file.dsed -s
You may use option -v to see more messages
I think it is printing the usage message because something went wrong. May I know what I have done wrong?
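One possible explanation, offered as an assumption rather than a confirmed diagnosis: if some file names contain spaces, the unquoted $f is split by the shell into several arguments, and djvused may print its usage text when it receives more arguments than it expects. Separately, a `>` redirection placed inside the loop truncates output.txt on every iteration, so only the last result would remain. A minimal sketch of a loop that avoids both is below. list_djvu_pages is a hypothetical helper name, the tab-separated output format is a guess at the desired pattern, and the "ERROR" marker assumes djvused exits with a non-zero status on failure.

```shell
#!/bin/sh
# Sketch: print "filename<TAB>pagecount" for every .djvu file in a directory.
# list_djvu_pages is a hypothetical name, not part of DjVuLibre.
list_djvu_pages() {
    dir=${1:-.}
    for f in "$dir"/*.djvu; do
        [ -e "$f" ] || continue    # skip when the glob matched nothing
        # Quoting "$f" keeps a name with spaces as a single argument.
        # Assumption: djvused returns non-zero when it cannot read the file.
        pages=$(djvused -e n "$f") || pages="ERROR"
        printf '%s\t%s\n' "${f##*/}" "$pages"
    done
}

# Example call, redirecting once for the whole loop rather than per file
# (djvu_pages.txt is a placeholder name):
#   list_djvu_pages "/run/media/root/Others/temp/converted djvu/ttd/Home/Download" > djvu_pages.txt
```

Redirecting the output of the whole function, instead of each djvused call, means the text file is opened once and every line is kept.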
I searched for a GUI for djvused, found djvusmooth and installed it, but I am unable to run it. The errors are below:
linux-tg2q:~ # djvusmooth
Traceback (most recent call last):
  File "/usr/bin/djvusmooth", line 20, in <module>
    from djvusmooth.gui.main import Application
  File "/usr/lib/python2.7/site-packages/djvusmooth/gui/main.py", line 28, in <module>
    import djvusmooth.dependencies as __dependencies
  File "/usr/lib/python2.7/site-packages/djvusmooth/dependencies.py", line 70, in <module>
    _check_djvu()
  File "/usr/lib/python2.7/site-packages/djvusmooth/dependencies.py", line 45, in _check_djvu
    python_djvu_decode_version, ddjvu_api_version = djvu_decode_version.split('/')
ValueError: need more than 1 value to unpack
linux-tg2q:~ #
If djvusmooth worked properly, would it be possible to create a text file containing each file name and its page count?
Please suggest how to save the page count, together with the file name, of all the DjVu files in a text file.