After scanning about 300 pages, gscan2pdf crashes with "too many open files" errors
On several of my Linux Mint 21.1 systems, gscan2pdf crashes after scanning 250-300 pages with the message "too many open files".
I have already made sure that the limits are set as follows:
ulimit -Sn set to 20000
ulimit -Hn set to 1048576
I have no idea what else should be configured to avoid these crashes.
I am in the process of completely rewriting how the pages are stored. This should solve this problem. However, this is not a trivial change and will take some time.
Thanks for your swift answer. However, I need to scan dozens of data books for the Computer History Museum/bitsavers.
Is there any quick workaround you could recommend for the time being?
Let me know if I can help.
PS: I am using the latest version, gscan2pdf 2.13.2, on a modern Linux Mint system with 20 GB RAM.
I would scan in batches of perhaps 100 pages and use the "append to PDF" option.
Oh, unfortunately I had always overlooked these two save options! Thanks for pointing them out. (I had been using pdftk outside gscan2pdf, but the built-in function is much more convenient, of course.)
Good luck with refactoring the code to overcome the "too many open files" problem - I am really looking forward to it.
Last edit: Wikinaut 2023-01-22
Question: do you have any idea why my explicit ulimit setting (a value of 20000) did not fix the "too many open files" issue?
No clue why changing ulimit didn't work. Perhaps it isn't supported everywhere.
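One possible explanation, offered as an assumption and not confirmed anywhere in this thread: ulimit only changes the limit for the shell it is run in and for processes started from that shell, so a gscan2pdf started from the desktop menu may still be running with a lower limit. The following minimal Python sketch shows one way to check which limit a process really has and how many file descriptors it currently holds. It assumes Linux with /proc mounted; the function names are made up for illustration.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid):
    """Count the open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def max_open_files_line(pid):
    """Return the 'Max open files' line from /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                return line.strip()
    return None


if __name__ == "__main__":
    # Uses this script's own PID as a stand-in; replace it with the PID of
    # the running gscan2pdf process to inspect that process instead.
    pid = os.getpid()
    print("limits of this process:", nofile_limits())
    print("open descriptors:", open_fd_count(pid))
    print(max_open_files_line(pid))
```

If the soft limit reported for the gscan2pdf process is much lower than 20000, the ulimit setting did not reach it. If the descriptor count keeps growing with every scanned page, that would match the crash described above.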
Further remark:
After I had finished scanning about 140 pages, saved them as a PDF and closed gscan2pdf properly - without any crashes - I noticed many files in /tmp named
"/tmp/brscan_jpeg_PAGEnn_hash"
which apparently stem from the scan process via SANE and the Brother driver. I scanned at 600 dpi/True Gray; the files are about 8-10 MB each.
I am telling you this because perhaps gscan2pdf does not properly close or delete the scan jobs - just an idea of what might be wrong. In my case, there are 1.4 GB of these brscan files in /tmp from the last gscan2pdf session (which, as I said, ran without errors).
Is this useful information?
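To put numbers on such leftovers, a small Python sketch like the following could count the files and add up their sizes. The pattern brscan_jpeg_* is taken from the file names quoted above; the function name and its defaults are only illustrative.

```python
import glob
import os


def summarize_leftovers(directory="/tmp", pattern="brscan_jpeg_*"):
    """Return (file_count, total_bytes) for regular files matching pattern."""
    paths = glob.glob(os.path.join(directory, pattern))
    files = [p for p in paths if os.path.isfile(p)]
    total = sum(os.path.getsize(p) for p in files)
    return len(files), total


if __name__ == "__main__":
    count, total = summarize_leftovers()
    print(f"{count} files, {total / 1e9:.2f} GB")
```

Running it before and after a scan session shows how much the session left behind.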
As you point out yourself, those files are created by the Brother driver, not gscan2pdf, and not in the temporary directory that gscan2pdf uses. So I don't know how gscan2pdf could know that they can be deleted. Perhaps it is worth filing a bug with Brother.
Last edit: Jeffrey Ratcliffe 2023-01-26
Jeffrey, sorry, I meant that gscan2pdf is possibly not shutting down a scan job it started properly. Just an idea.
It would be interesting to see whether they are left after scanimage uses the scanner. Can you try that?
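One way to run that comparison is to list the directory before and after the command and report what is new. This is a rough Python sketch under a few assumptions: the files appear in /tmp, nothing else writes there at the same time, and scanimage is installed and finds the scanner without extra options. The helper name is hypothetical.

```python
import os
import subprocess


def files_left_by(command, directory="/tmp"):
    """Run command and return the names that are new in directory afterwards."""
    before = set(os.listdir(directory))
    # The command's standard output (for scanimage, the image data) is discarded.
    subprocess.run(command, check=False, stdout=subprocess.DEVNULL)
    after = set(os.listdir(directory))
    return sorted(after - before)


if __name__ == "__main__":
    # Assumes the default device works; add a device option if needed.
    print(files_left_by(["scanimage", "--format=pnm"]))
```

If this prints brscan_jpeg_... names after a plain scanimage run, the leftover files would not be specific to gscan2pdf.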