
#12 Could not determine number of pages

Milestone: v1.0 (example)
Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2018-08-01
Created: 2016-04-18
Private: No

Hi,

I've just updated pdfsandwich from 0.1.3 to 0.1.4 on Mac OS X 10.7 and Fedora 23. That is, I ran svn update, then make and make install.
Now I get this error on both machines, and pdfsandwich does nothing:
Fatal error: exception Failure("Error: Could not determine number of pages of file /var/folders/v3/gypkpph90tv1vr56dn_q_xc40000gr/T/pdfsandwich_inputfiled3c818.pdf")

The file named in the error message exists and appears to be a link to the PDF file that should be OCRed. However, I can't open it, and the link itself may be wrong: it points to the .pdf file as if it were in the tmp folder, which it isn't.
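
The symptom described above (a link in the temporary folder that cannot be opened) is what a symbolic link with a relative target looks like: a relative target is resolved against the directory that contains the link, not against the directory the program was started from. The following is a minimal Python sketch of that behaviour, not pdfsandwich's actual code; the helper name `inspect_link` and the file names are hypothetical.

```python
import os
import tempfile


def inspect_link(link_path):
    """Report where a symlink points and whether that target exists."""
    target = os.readlink(link_path)  # raw target string stored in the link
    # A relative target is resolved against the link's own directory.
    # os.path.join keeps an absolute target unchanged.
    resolved = os.path.normpath(
        os.path.join(os.path.dirname(link_path), target)
    )
    return {
        "target": target,
        "resolved": resolved,
        "dangling": not os.path.exists(resolved),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as tmp:
        pdf = os.path.join(work, "input.pdf")  # hypothetical input file
        with open(pdf, "wb") as fh:
            fh.write(b"%PDF-1.4\n")

        # Link created with a bare file name: resolves inside tmp, so it dangles.
        relative_link = os.path.join(tmp, "link_relative.pdf")
        os.symlink("input.pdf", relative_link)

        # Link created with an absolute path: resolves to the real file.
        absolute_link = os.path.join(tmp, "link_absolute.pdf")
        os.symlink(os.path.abspath(pdf), absolute_link)

        print(inspect_link(relative_link))
        print(inspect_link(absolute_link))
```

If the link in the error message reports a bare file name as its target, making the path absolute before creating the link would avoid the problem. Whether this is what happened in the svn revision discussed here is an assumption.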

Discussion

  • Tobias Elze

    Tobias Elze - 2016-04-18

    Oh, sorry to hear that. If you urgently need 0.1.4, you may want to download the sources from SourceForge instead of checking out via svn, as I'm currently fixing some other bugs, and this bug seems to be a side effect of one of those fixes. But I'm optimistic that we can fix this quickly.

    Could you please tell me the exact command line call which led to this error?

    Thanks,
    Tobias

     
  • silberzwiebel

    silberzwiebel - 2016-04-18

    Hi, thanks for the quick reply. I do not need version 0.1.4 urgently, so I just checked out revision r49 via svn and it works again. (I'm just too lazy to download things manually ;))

    > Could you please tell me the exact command line call which led to this error?

    I did not use any options to get the error, just this:
    pdfsandwich He\ \&\ Kowler\ 1991.pdf

    (I also tried a PDF file without spaces in its name, but that didn't work either.)

     
  • Tobias Elze

    Tobias Elze - 2016-04-18

    Okay, it should be fixed now. Could you try it out?

    Thanks,
    Tobias

     
  • silberzwiebel

    silberzwiebel - 2016-04-19

    I tried on a different PC (also Fedora 23), and the OCR seems to work: it takes some time, and the corresponding progress messages are written to the console.
    However, the last step fails with this error:

    OCR done. Writing "He & Kowler 1991_ocr.pdf"
    Fatal error: exception Unix.Unix_error(Unix.EXDEV, "rename", "/tmp/pdfsandwich_output11c47f.pdf")
    

    and I do not get the output file in the directory where I started pdfsandwich.
    The complete output file is in the /tmp folder, however.
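
For context, EXDEV is the error code that rename returns when source and destination are on different filesystems, which fits the report above: the finished file sits in /tmp while the destination is the working directory. A common remedy is to fall back to copying the file and deleting the original. The Python sketch below illustrates that pattern under this assumption; it is not the fix that was applied to pdfsandwich, and `move_file` is a hypothetical helper name.

```python
import errno
import os
import shutil


def move_file(src, dst):
    """Move src to dst, falling back to copy + delete across filesystems."""
    try:
        # Fast path: works when src and dst are on the same filesystem.
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # unrelated failure (permissions, missing file, ...)
        # Cross-device case: copy contents and metadata, then remove the source.
        shutil.copy2(src, dst)
        os.remove(src)
```

In Python, `shutil.move` already performs this fallback, so the explicit version is mainly useful for seeing what the error means and how it is usually handled.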

    (Off-topic: The main reason I updated pdfsandwich recently was that one specific file produced poorly readable text in the OCRed output (it looks like low resolution). The original file looks good, but has no OCR layer in it. Which options could I use to get good output? I already tried -resolution 500 and -noimage without success.)

     
    • Tobias Elze

      Tobias Elze - 2016-04-22

      > OCR done. Writing "He & Kowler 1991_ocr.pdf"
      > Fatal error: exception Unix.Unix_error(Unix.EXDEV, "rename", "/tmp/pdfsandwich_output11c47f.pdf")

      Thanks for reporting this; I have fixed it now. Feel free to try it out.

      > one specific file resulted in badly readable text in the OCRed file (looks like low resolution). The original file looks good, but has no OCR in it. What options could I use to get good output?

      -noimage works only together with hocr2pdf and will definitely not solve your problem. The first thing to try is skipping the pre-processing by unpaper (option: -nopreproc), because unpaper sometimes messes things up. Does that help at all? Feel free to send one of these pages directly to me so that I can have a look.

      Tobias

       
  • Tobias Elze

    Tobias Elze - 2016-08-05
    • status: open --> closed
     
  • Deepak Keswani

    Deepak Keswani - 2018-08-01

    I have installed pdfsandwich on Ubuntu and am trying to run the command below to convert a multi-page .tif file to a .pdf file, but it fails with the error message below.

    Can you please help me with this?

    $ /usr/bin/pdfsandwich -verbose -lang spa+eng+fra Sample_3_Multi_page.tif -o Sample_3_Multi_page.pdf
    pdfsandwich version 0.1.4
    Checking for convert:
    convert -version
    Version: ImageMagick 6.8.9-9 Q16 x86_64 2018-07-10 http://www.imagemagick.org
    Copyright: Copyright (C) 1999-2014 ImageMagick Studio LLC
    Features: DPC Modules OpenMP
    Delegates: bzlib cairo djvu fftw fontconfig freetype jbig jng jpeg lcms lqr ltdl lzma openexr pangocairo png rsvg tiff wmf x xml zlib

    Checking for unpaper:
    unpaper -version
    6.1
    Checking for tesseract:
    tesseract -v
    tesseract 3.04.01
    leptonica-1.73
    libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0

    Checking for gs:
    gs -v
    GPL Ghostscript 9.18 (2015-10-05)
    Copyright (C) 2015 Artifex Software, Inc. All rights reserved.
    Input file: "Sample_3_Multi_page.tif"
    Output file: "Sample_3_Multi_page.pdf"
    Fatal error: exception Failure("Error: Could not determine number of pages of file Sample_3_Multi_page.tif")

    Thanks.
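
The thread contains no reply to this report. One thing that could be checked is whether the input is actually a PDF, on the assumption (not confirmed anywhere in this thread) that the page-counting step expects PDF input and therefore fails on a TIFF file. The Python sketch below only inspects the leading bytes of a file; `sniff_format` is a hypothetical helper, and it recognises only PDF and classic TIFF signatures.

```python
def sniff_format(path):
    """Guess the file type from its leading magic bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(b"%PDF-"):
        return "pdf"
    # Classic TIFF: little-endian "II*\0" or big-endian "MM\0*".
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    return "unknown"
```

If the check returns "tiff", converting the file to PDF with a separate tool before running pdfsandwich would be one way to test the assumption.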

     
