pdfgrep-users Mailing List for pdfgrep
Brought to you by:
rootzlevel
From: Hans-Peter D. <hp...@hp...> - 2015-03-25 19:58:56
|
Hi,

On 03/24/15 23:08, Reinhold Straub wrote:
> downstream here! There is a port of pdfgrep for OpenBSD now. Binary
> packages for OpenBSD-current will show up in the next couple of days.
> Unfortunately, pdfgrep won't make it into the upcoming release version,
> but OpenBSD 5.8, due November 1st this year, will be fine.

That's great news. Thanks for your work!

I haven't tested pdfgrep on OpenBSD yet. If it ever needs any portability fixes, I'd be happy to merge them upstream.

Cheers,
HP
From: Reinhold S. <dem...@we...> - 2015-03-24 22:08:54
|
Hi list,

downstream here! There is a port of pdfgrep for OpenBSD now. Binary packages for OpenBSD-current will show up in the next couple of days. Unfortunately, pdfgrep won't make it into the upcoming release version, but OpenBSD 5.8, due November 1st this year, will be fine.

Best regards and thanks a lot for your nice tool,
Reinhold Straub
From: Hans-Peter D. <hp...@hp...> - 2015-03-15 18:03:28
|
Hi everybody,

since Gitorious is shutting down at the end of May, I decided to move pdfgrep's git repository to GitLab. The new official clone URL is now:

https://gitlab.com/pdfgrep/pdfgrep.git

The project page[1] at GitLab features a code browser, a bug tracker and merge requests. Bugs and patches can still be sent to this list, but GitLab is an alternative for those uncomfortable with mailing lists.

I updated the README and website accordingly. Please note that for the time being, the official homepage of pdfgrep is still pdfgrep.sourceforge.net.

For those who don't like shiny javascript-ridden interfaces, the gitweb at [2] is a mirror of the official repository.

Cheers,
HP

[1] https://gitlab.com/pdfgrep/pdfgrep
[2] https://git.cs.fau.de/?p=lu03pevi/pdfgrep.git
From: Hans-Peter D. <hpd...@gm...> - 2015-02-20 19:12:20
|
Hi all,

I'm happy to announce the release of pdfgrep-1.3.2!

This is mostly a bug-fix release, but it also fixes a longstanding and often reported issue: garbled and excessive error output for broken PDFs is now gone if pdfgrep is compiled against a recent enough poppler version (0.30.0 or later).

Here is an overview of what's new:

- A bash completion module
- Don't limit output to 80 characters on non-terminals
- Print far fewer error messages by default (only with >= poppler-0.30.0)
- New option --debug to print verbose debug output
- Installation: New configure flag --with-zsh-completion

In other news, the website [1] got a face-lift!

Thanks to everyone who helped with this release! As usual, the release tarball is available at [2].

Cheers,
HP

[1] http://pdfgrep.sourceforge.net
[2] https://sourceforge.net/projects/pdfgrep/files/1.3.2/
From: Hans-Peter D. <hpd...@gm...> - 2015-01-09 12:26:16
|
merged, thanks!

Btw, are there any volunteers for bash completion? :)

Cheers,
HP
From: Florian S. <fl...@ge...> - 2015-01-09 11:49:58
|
defaults to enabled. The completion directory can be specified with
"--with-zsh-completion=<directory>". To not install zsh completion, use
"--without-zsh-completion".

Allows us to minimize the gentoo ebuild by removing the src_install
function and using the default instead.
---
 completion/Makefile.am |  9 +++++++--
 configure.ac           | 14 ++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/completion/Makefile.am b/completion/Makefile.am
index 7446f35..af56c09 100644
--- a/completion/Makefile.am
+++ b/completion/Makefile.am
@@ -1,2 +1,7 @@
-noinst_DATA = _pdfgrep
-EXTRA_DIST = $(noinst_DATA)
+zshcompldir = $(ZSH_COMPL_DIR)
+
+if INSTALL_ZSH_COMPLETION
+zshcompl_DATA = _pdfgrep
+else
+dist_zshcompl_DATA = _pdfgrep
+endif
diff --git a/configure.ac b/configure.ac
index bafcb61..25743c3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -46,6 +46,20 @@ AS_IF([test "x$with_unac" = "xyes"], [
   AC_DEFINE([HAVE_UNAC], [1], [Define to 1 if you have libunac _and_ want to use it])
 ])
 
+AC_MSG_CHECKING([zsh completion])
+AS_VAR_SET([ZSH_COMPL_DIR], ["${datadir}/zsh/site-functions"])
+AC_ARG_WITH([zsh-completion],
+  [AS_HELP_STRING([--with-zsh-completion=DIR],
+    [install zsh-completion file in directory DIR])],
+  [AS_CASE([${withval}],
+    ["/"*], [AS_VAR_COPY([ZSH_COMPL_DIR], [withval])],
+    [no], [AS_VAR_SET([ZSH_COMPL_DIR], [])])])
+AC_SUBST(ZSH_COMPL_DIR)
+AM_CONDITIONAL([INSTALL_ZSH_COMPLETION], [test "x$ZSH_COMPL_DIR" != "x"])
+AM_COND_IF([INSTALL_ZSH_COMPLETION],
+  [AC_MSG_RESULT($ZSH_COMPL_DIR)],
+  [AC_MSG_RESULT(no)])
+
 AC_CHECK_PROG(HAVE_A2X, [a2x], [yes], [no])
 
 AC_CONFIG_FILES([Makefile completion/Makefile doc/Makefile])
-- 
2.0.4
From: Hans-Peter D. <hpd...@gm...> - 2014-12-03 10:10:21
|
Hi Flo,

thanks for the patch. A few comments:

On Tue, Dec 02 2014, Florian Schmaus wrote:
> -EXTRA_DIST = $(noinst_DATA)

This removes _pdfgrep from the tarball created by 'make dist'.

> +
> +install-data-local:
> +	$(AM_V_at)test -d "$(DESTDIR)$(ZSH_COMPLETION)" || $(MKDIR_P) "$(DESTDIR)$(ZSH_COMPLETION)" && \
> +	$(INSTALL_DATA) "_pdfgrep" "$(DESTDIR)$(ZSH_COMPLETION)"
> +
> +uninstall-local:
> +	$(RM) "$(DESTDIR)$(ZSH_COMPLETION)"
> +

This motivated me to dig into autotools configuration again and I think I found a more autotooly way to do this:

    zshcompl_DATA = _pdfgrep

That would require configure.ac to set zshcompldir.

> +AC_MSG_CHECKING([zsh completion])
> +AS_VAR_SET([ZSH_COMPLETION], ["${datadir}/zsh/site-functions"])
> +AC_ARG_WITH([zsh-completion],
> +  [AS_HELP_STRING([--with-zsh-completion=DIR],
> +    [install zsh-completion file in directory DIR])],
> +  [AS_CASE([${withval}],
> +    ["/"*], [AS_VAR_COPY([ZSH_COMPLETION], [withval])],
> +    [no], [AS_VAR_SET([ZSH_COMPLETION], [])])])
> +AC_SUBST([ZSH_COMPLETION])
> +AM_CONDITIONAL([INSTALL_ZSH_COMPLETION], [test "x$ZSH_COMPLETION" != "x"])

This requires the install directory to be absolute. It also prints "checking zsh completion..." (without "for") but no "yes" or "no".

Additionally, I'm not particularly happy that a --with-something argument can be anything other than "yes" or "no". I don't know what's usually done in such a case, but I would (aesthetically) prefer --with-zsh-completion=[yes,no] and --zsh-completion-dir=DIR. But maybe that's just me.

Sorry for nitpicking on your patch :)

Cheers,
HP
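The remark about absolute directories follows from how the quoted AS_CASE branches are written. A minimal shell sketch of that branch logic, assuming it behaves like a plain `case` statement; `resolve_zsh_compl_dir` is a hypothetical helper for illustration, not part of pdfgrep or autoconf:

```shell
# Hypothetical helper mirroring the AS_CASE branches of the quoted patch.
# $1: value passed to --with-zsh-completion ("no" for --without-zsh-completion)
# $2: default directory (the patch uses ${datadir}/zsh/site-functions)
resolve_zsh_compl_dir() {
    withval=$1
    default_dir=$2
    case $withval in
        /*) printf '%s\n' "$withval" ;;      # absolute path: used as given
        no) printf '\n' ;;                   # empty value: nothing gets installed
        *)  printf '%s\n' "$default_dir" ;;  # "yes", relative paths, anything else: default is kept
    esac
}
```

Under these assumptions a relative directory is silently replaced by the default rather than rejected, which is the behaviour the review points at.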
From: Florian S. <fl...@ge...> - 2014-12-02 14:43:47
|
defaults to enabled. The completion directory can be specified with
"--with-zsh-completion=<directory>". To not install zsh completion, use
"--without-zsh-completion".

Allows us to minimize the gentoo ebuild by removing the src_install
function and using the default instead.
---
 completion/Makefile.am | 11 ++++++++++-
 configure.ac           | 12 ++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/completion/Makefile.am b/completion/Makefile.am
index 7446f35..cf6ad7d 100644
--- a/completion/Makefile.am
+++ b/completion/Makefile.am
@@ -1,2 +1,11 @@
+if INSTALL_ZSH_COMPLETION
 noinst_DATA = _pdfgrep
-EXTRA_DIST = $(noinst_DATA)
+
+install-data-local:
+	$(AM_V_at)test -d "$(DESTDIR)$(ZSH_COMPLETION)" || $(MKDIR_P) "$(DESTDIR)$(ZSH_COMPLETION)" && \
+	$(INSTALL_DATA) "_pdfgrep" "$(DESTDIR)$(ZSH_COMPLETION)"
+
+uninstall-local:
+	$(RM) "$(DESTDIR)$(ZSH_COMPLETION)"
+
+endif
diff --git a/configure.ac b/configure.ac
index bafcb61..062889a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -46,6 +46,18 @@ AS_IF([test "x$with_unac" = "xyes"], [
   AC_DEFINE([HAVE_UNAC], [1], [Define to 1 if you have libunac _and_ want to use it])
 ])
 
+AC_MSG_CHECKING([zsh completion])
+AS_VAR_SET([ZSH_COMPLETION], ["${datadir}/zsh/site-functions"])
+AC_ARG_WITH([zsh-completion],
+  [AS_HELP_STRING([--with-zsh-completion=DIR],
+    [install zsh-completion file in directory DIR])],
+  [AS_CASE([${withval}],
+    ["/"*], [AS_VAR_COPY([ZSH_COMPLETION], [withval])],
+    [no], [AS_VAR_SET([ZSH_COMPLETION], [])])])
+AC_SUBST([ZSH_COMPLETION])
+AM_CONDITIONAL([INSTALL_ZSH_COMPLETION], [test "x$ZSH_COMPLETION" != "x"])
+
+
 AC_CHECK_PROG(HAVE_A2X, [a2x], [yes], [no])
 
 AC_CONFIG_FILES([Makefile completion/Makefile doc/Makefile])
-- 
2.0.4
From: John-Eric <joh...@gm...> - 2014-11-25 16:35:47
|
Good news!

pdfgrep with the "--context=line" option fixes the truncation issue. The "--help" option shows the form "pdfgrep --context NUM" (no = sign). The stdout then works with tr, sed, and sort as expected.

Just two minor issues that I was able to work around with for-in-do and sed.

1. -R (--recursive) didn't work for me, but a work-around is for-in-do.

>>>>> for dir in */; do (pdfgrep -i --context=line AcmeWidget "$dir"*2014*.pdf 2>/dev/null); done | tr -s ' '

Documents/Capital One 2014-11.pdf: 11 02 NOV AcmeWidget SAN ANTONIOTX $25.50
Documents/Chase Amazon 2014-01.pdf: 01/19 AcmeWidget SAN ANTONIO TX 15.50
Documents/Chase Amazon 2014-03.pdf: 03/21 AcmeWidget SAN ANTONIO TX 21.25
Documents/Chase Amazon 2014-03.pdf: 03/21 AcmeWidget SAN ANTONIO TX 30.25
Documents/Chase Amazon 2014-06.pdf: 06/15 AcmeWidget SAN ANTONIO TX 14.50
Documents/Chase Amazon 2014-08.pdf:07/24 AcmeWidget SAN ANTONIO TX 22.00
Documents/Chase Amazon 2014-09.pdf:08/26 AcmeWidget SAN ANTONIO TX 35.00
Documents/Chase Amazon 2014-09.pdf:08/25 AcmeWidget SAN ANTONIO TX 16.00
Documents/Chase Amazon 2014-10.pdf:10/05 AcmeWidget SAN ANTONIO TX 26.50
Documents/Chase Amazon 2014-10.pdf:10/10 AcmeWidget SAN ANTONIO TX 51.50

2. I added a space after the file name using sed, for consistent fields and appearance.

>>>>> for dir in */; do (pdfgrep -i --context=line AcmeWidget "$dir"*2014*.pdf 2>/dev/null); done | sed 's/pdf:/pdf: /' | tr -s ' '

Documents/Capital One 2014-11.pdf: 11 02 NOV AcmeWidget SAN ANTONIOTX $25.50
Documents/Chase Amazon 2014-01.pdf: 01/19 AcmeWidget SAN ANTONIO TX 15.50
Documents/Chase Amazon 2014-03.pdf: 03/21 AcmeWidget SAN ANTONIO TX 21.25
Documents/Chase Amazon 2014-03.pdf: 03/21 AcmeWidget SAN ANTONIO TX 30.25
Documents/Chase Amazon 2014-06.pdf: 06/15 AcmeWidget SAN ANTONIO TX 14.50
Documents/Chase Amazon 2014-08.pdf: 07/24 AcmeWidget SAN ANTONIO TX 22.00
Documents/Chase Amazon 2014-09.pdf: 08/26 AcmeWidget SAN ANTONIO TX 35.00
Documents/Chase Amazon 2014-09.pdf: 08/25 AcmeWidget SAN ANTONIO TX 16.00
Documents/Chase Amazon 2014-10.pdf: 10/05 AcmeWidget SAN ANTONIO TX 26.50
Documents/Chase Amazon 2014-10.pdf: 10/10 AcmeWidget SAN ANTONIO TX 51.50

I also used 2>/dev/null to suppress the "file or directory not found" messages.

Thanks for your excellent work.

John

On 11/24/2014 10:50 AM, Hans-Peter Deifel wrote:
> Hi,
>
> ... have you tried playing with the --context option? A value of 'line'
> should print the whole line. If that's too much, you can also pass the
> exact number of characters to be printed around each match.
>
> The problem is that pdfgrep tries to be clever and truncates the output
> to fit the width of the terminal. If you pipe the output to a command or
> a file, it detects that there is no terminal and assumes a maximum width
> of 80 characters.
>
> I realize that this is unintuitive. It would probably be better to just
> print the whole line if we don't have a terminal. I'll fix it, but I need
> to reevaluate the whole concept of contexts first :)
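The workaround can be sketched as two small shell functions. This is only an illustration under assumptions: `normalize_matches` and `search_subdirs` are hypothetical names, the `-i` and `--context=line` options are the ones used in the message, and the loop variable is spelled `$dir` throughout because shell variable names are case-sensitive.

```shell
# Hypothetical helper: make sure exactly one space follows the "file.pdf:" prefix,
# then squeeze every run of spaces into a single space.
normalize_matches() {
    sed 's/\.pdf: */.pdf: /' | tr -s ' '
}

# Hypothetical wrapper around the for-in-do loop from the message.
# $1: search pattern, $2: fragment of the file name (for example 2014)
search_subdirs() {
    for dir in */; do
        # "$dir" already ends in a slash; the glob itself stays unquoted so it expands
        pdfgrep -i --context=line "$1" "$dir"*"$2"*.pdf 2>/dev/null
    done | normalize_matches
}
```

A call such as `search_subdirs AcmeWidget 2014` would then stand in for the one-liner, assuming pdfgrep is installed and the statements sit one directory level down.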
From: Hans-Peter D. <han...@fa...> - 2014-11-24 17:06:58
|
Hi,

On Mon, Nov 24 2014, John-Eric wrote:
> I'm having trouble with pdfgrep output too wide to see on screen.
> Any attempt to suppress extra spaces truncates the values at the end
> of each line.
> How can I squeeze the spaces and keep the values?

Have you tried playing with the --context option? A value of 'line' should print the whole line. If that's too much, you can also pass the exact number of characters to be printed around each match.

The problem is that pdfgrep tries to be clever and truncates the output to fit the width of the terminal. If you pipe the output to a command or a file, it detects that there is no terminal and assumes a maximum width of 80 characters.

I realize that this is unintuitive. It would probably be better to just print the whole line if we don't have a terminal. I'll fix it, but I need to reevaluate the whole concept of contexts first :)

> Could you add a squeeze space/tabs option?

That seems to be awfully specific. I'd rather make piping to tr work.

Cheers,
HP
From: John-Eric <joh...@gm...> - 2014-11-24 04:15:41
|
I'm having trouble with pdfgrep output too wide to see on screen.
Any attempt to suppress extra spaces truncates the values at the end of each line.
How can I squeeze the spaces and keep the values?

>>>>> pdfgrep AcmeWidgets *.pdf

CapitalOne 2014-11.pdf: 11 02 NOV AcmeWidgets SAN ANTONIOTX 24.50
Chase Amazon 2014-01.pdf: 01/19 AcmeWidgets SAN ANTONIO TX 21.12
Chase Amazon 2014-03.pdf: 03/21 AcmeWidgets SAN ANTONIO TX 18.45
Chase Amazon 2014-03.pdf: 03/21 AcmeWidgets SAN ANTONIO TX 20.66
Chase Amazon 2014-06.pdf: 06/15 AcmeWidgets SAN ANTONIO TX 26.50
Chase Amazon 2014-08.pdf: 07/24 AcmeWidgets SAN ANTONIO TX 22.48
Chase Amazon 2014-09.pdf: 08/26 AcmeWidgets SAN ANTONIO TX 17.56
Chase Amazon 2014-09.pdf: 08/25 AcmeWidgets SAN ANTONIO TX 21.52
Chase Amazon 2014-10.pdf: 10/05 AcmeWidgets SAN ANTONIO TX 28.45
Chase Amazon 2014-10.pdf: 10/10 AcmeWidgets SAN ANTONIO TX 18.45

=============================================================================
This is what I get when I pipe the output and squeeze multiple spaces to a single space.
Now I have the right width, but the values at the end of each line are missing.

>>>>> pdfgrep AcmeWidgets *.pdf | tr -s ' '

CapitalOne 2014-11.pdf: 11 02 NOV AcmeWidgets SAN ANTONIOTX
Chase Amazon 2014-01.pdf: 01/19 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-03.pdf: 03/21 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-03.pdf: 03/21 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-06.pdf: 06/15 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-08.pdf: 07/24 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-09.pdf: 08/26 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-09.pdf: 08/25 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-10.pdf: 10/05 AcmeWidgets SAN ANTONIO TX
Chase Amazon 2014-10.pdf: 10/10 AcmeWidgets SAN ANTONIO TX

=============================================================================
I tried using other commands with the same results:
sed ===> pdfgrep AcmeWidgets *.pdf | sed 's/ \+/ /g'
awk ===> pdfgrep AcmeWidgets *.pdf | awk '{print $1, $2, $3, $4, $5, $6, $7, $8, $9, $10;}'

The values also get truncated when I try to pipe the output to a file.
>>>>> pdfgrep AcmeWidgets *.pdf > output.txt

Could you add a squeeze space/tabs option? Any other suggestions?
From: Loïc C. <lc...@dc...> - 2014-10-03 17:57:55
|
Oi!

On Friday 03 October 2014 at 18:09 +0200, Hans-Peter Deifel wrote:
> Since you apparently know about pdfgrep -p, I'm curious: What was the
> reason for calling pdftotext repeatedly instead of using the output of
> pdfgrep directly to find matching pages?

Well, I could have (and now that the Full Circle Magazine has described it, I will leave it as it is). That would have been one more dependency though. But you are right: that would probably be advantageous in terms of performance (no repeated calls to 'pdftotext') and the script would probably be shorter (with quite a long command piping 'pdfgrep' into 'cut -d: -f1', then 'sort -un' and finally into "tr '\n' ','"). I would do that to construct the page selection 'pdfjam' takes as an argument.

I chose 'pdfjam' because, as far as I know, nothing in "poppler-utils" allows selecting a list of non-contiguous pages at once. I was initially using 'pdfseparate' to extract single pages, but I could then exceed the maximum number of allowed arguments (when calling 'pdfunite') and the script was much slower than the current solution based on 'pdfjam'.

Again: thank you for 'pdfgrep'!

-- 
Magic Banana (GPG keyid: 5CF04396)
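The pipeline described in this message can be sketched as follows. It assumes that pdfgrep -n prefixes each match from a single file with "page:", and it swaps the final tr step for paste so that no trailing comma is produced; `matching_pages` is a hypothetical name and the file names are placeholders.

```shell
# Hypothetical helper: turn "page:text" lines read from stdin into a
# comma-separated list of unique page numbers in ascending order.
matching_pages() {
    cut -d: -f1 | sort -un | paste -sd, -
}

# Possible usage (assumes pdfgrep and pdfjam are installed; not run here):
#   pages=$(pdfgrep -n 'pattern' input.pdf | matching_pages)
#   pdfjam input.pdf "$pages" --outfile matches.pdf
```

With multiple input files the prefix also carries the file name, so the cut field would have to change; the sketch covers the single-file case only.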
From: Hans-Peter D. <hpd...@gm...> - 2014-10-03 16:09:41
|
Hi,

Thanks for the email, it was very much appreciated. pdf-page-grep looks like a useful little tool.

On Tue, Sep 30 2014, Loïc Cerf wrote:
> Contrary to 'pdfgrep', it is a very simple Shell script that lacks
> useful options (such as -r or -p). I wrote it in a few hours for fun
> and to help a Trisquel user

In fact, pdfgrep also started out as a bash script (kudos to Thorsten), and my original reason for rewriting it in C was that repeated calls to pdftotext for every single page got slow with big PDFs. Using the poppler API directly avoids reparsing the whole document for each page.

Since you apparently know about pdfgrep -p, I'm curious: What was the reason for calling pdftotext repeatedly instead of using the output of pdfgrep directly to find matching pages? I realize that pdfgrep doesn't have a -f switch (yet), but searching for multiple patterns is obviously possible with "|".

Cheers,
HP
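Until a -f switch exists, a pattern file can be folded into one alternation by hand. A minimal sketch, assuming one pattern per line and that pdfgrep treats "|" as alternation, as the message says; `join_patterns` and `patterns.txt` are hypothetical names.

```shell
# Hypothetical helper: join non-empty lines from stdin into a single
# alternation such as foo|bar|baz. Empty lines are dropped because an
# empty alternative would match every line.
join_patterns() {
    grep -v '^$' | paste -sd'|' -
}

# Possible usage (assumes pdfgrep is installed; not run here):
#   pdfgrep -n "$(join_patterns < patterns.txt)" input.pdf
```

Patterns that contain a literal "|" would need escaping first, which this sketch does not attempt.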
From: Loïc C. <lc...@dc...> - 2014-09-30 03:20:04
|
Oi! This email just to mention the existence of 'pdf-page-grep', which solves a variation of the problem 'pdfgrep' tackles. Instead of outputting the matching lines, it creates a PDF file where the matching pages are concatenated. Contrary to 'pdfgrep', it is a very simple Shell script that lacks useful options (such as -r or -p). I wrote it in a few hours for fun and to help a Trisquel user: https://trisquel.info/fr/forum/finding-particular-pages-within-pdfs Here is the description of 'pdf-page-grep': http://dcc.ufmg.br/~lcerf/en/utilities.html#pdf-page-grep It mentions 'pdfgrep'. ;-) -- Magic Banana (GPG keyid: 5CF04396) |
From: Hans-Peter D. <hpd...@gm...> - 2014-08-14 20:31:03
|
Hi Joachim,

nice to hear that you want to package pdfgrep for nixos.

On Wed, Aug 13 2014, Joachim Schiele wrote:
> i just wanted to package pdfgrep for nixos but it does not find the
> poppler data directory:
>
> https://gist.github.com/qknight/bc77d9d984053c8db753
>
> one major difference, between nixos and all other distributions, is
> that we don't have global directories like on ubuntu, where poppler
> could expect poppler-data to be in
> /var/lib/ghostscript/CMap/Adobe-Korea1-1 or similar.
>
> i got it running using this runtime hack:
>
> export
> POPPLER_DATADIR=/nix/store/pqbgh8k6674fvf844z3040d3zhgk29sc-poppler-data-0.4.6/share/poppler/
>
> but can that path be hardcoded into pdfgrep somehow, so that i don't
> have to export this on runtime?

I don't know anything about nixos, sorry. But hardcoding such paths into all the binaries using poppler seems to be the wrong (as in 'very ugly') approach. A better place for this could be the poppler library itself, or some global configuration variable/file.

In fact, after a little googling I found [1], which looks like it's the nixos package for poppler-data. As you can see, it exports POPPLER_DATADIR in a shell script under /etc/profile.d/. The files there should be sourced by your global shellrc. If they aren't, that could be a configuration error on your side.

Hope that was helpful to you.

Cheers,
HP

[1] https://raw.githubusercontent.com/NixOS/nixpkgs/master/pkgs/data/misc/poppler-data/default.nix
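Short of a global profile script, the variable can also be set for a single invocation. A minimal sketch, assuming poppler on the system in question honours POPPLER_DATADIR as described in this thread; `with_poppler_datadir` is a hypothetical helper and the directory is a placeholder.

```shell
# Hypothetical helper: run one command with POPPLER_DATADIR set for that
# command only, leaving the calling shell's environment untouched.
# $1: poppler-data directory; remaining arguments: the command to run
with_poppler_datadir() {
    datadir=$1
    shift
    env POPPLER_DATADIR="$datadir" "$@"
}

# Possible usage (not run here):
#   with_poppler_datadir /path/to/poppler-data/share/poppler pdfgrep pattern file.pdf
```

This keeps the path out of the pdfgrep binary, in line with the objection to hardcoding it.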
From: Joachim S. <js...@la...> - 2014-08-13 18:14:31
|
hi,

i just wanted to package pdfgrep for nixos but it does not find the poppler data directory:

https://gist.github.com/qknight/bc77d9d984053c8db753

one major difference, between nixos and all other distributions, is that we don't have global directories like on ubuntu, where poppler could expect poppler-data to be in /var/lib/ghostscript/CMap/Adobe-Korea1-1 or similar.

i got it running using this runtime hack:

export POPPLER_DATADIR=/nix/store/pqbgh8k6674fvf844z3040d3zhgk29sc-poppler-data-0.4.6/share/poppler/

but can that path be hardcoded into pdfgrep somehow, so that i don't have to export this on runtime?

best wishes,
joachim schiele

-- 
Joachim Schiele
0176 3090 333 7
blog: http://blog.lastlog.de
wiki: http://lastlog.de
jabber: jo...@ja...
GPG: C6AC8770
From: Hans-Peter D. <hpd...@gm...> - 2014-08-10 22:29:32
|
Hi list,

After more than two years without a release, I'm pleased to announce pdfgrep-1.3.1. This is a minor release, but a few features have accumulated over time. Here are the most interesting changes:

- INCOMPATIBLE CHANGE: -r doesn't follow symlinks (compatibility with new versions of GNU grep)
- A zsh completion module in completion/
- Support for password-protected PDFs with --password
- Allow omitting '.' with -r or -R to search the current directory
- Add -p or --page-count to count matches per page (by Jascha Knack)
- Add -m or --max-count to limit matches per file (by Thibault Marin)

Also, asciidoc is now required as a build dependency for the manpage.

Thanks to all the contributors.

Pdfgrep is available through git at http://gitorious.org/pdfgrep or as a tarball at https://sourceforge.net/projects/pdfgrep/files/.

Regards,
HP

-- 
Hans-Peter Deifel
pdfgrep maintainer
From: Hans-Peter D. <hpd...@gm...> - 2014-07-24 20:31:16
|
Hi,

sorry for the late answer, I had problems with my mail setup.

On Wed, Jul 16 2014, Prem Ananth wrote:
> I am trying to work with pdfgrep. I downloaded the zip file from
> http://sourceforge.net/projects/pdfgrep/files/
> After that I am unable to go forward with installing pdfgrep.
> Also, can you give the syntax for executing pdfgrep?
> Can you please help with this?

If you don't have experience in building software from source, I suggest you install pdfgrep with the package manager of your Unix distribution. If you absolutely have to build it from source, the process is described on this page:

http://askubuntu.com/questions/123077/installing-applications-from-source

If you use a non-Unix operating system, I'm sorry that I can't help you, as I don't own such a system.

Kind regards,
HP
From: Prem A. <Pre...@tr...> - 2014-07-16 13:57:11
|
Hi,

I am trying to work with pdfgrep. I downloaded the zip file from http://sourceforge.net/projects/pdfgrep/files/

After that I am unable to go forward with installing pdfgrep. Also, can you give the syntax for executing pdfgrep?

Can you please help with this?

Regards,
Prem Ananth S
From: Hans-Peter D. <hpd...@gm...> - 2013-11-07 20:08:39
|
Hey, thanks for providing more information. The error messages that you see aren't real errors, but merely warnings from the poppler library. Poppler unconditionally prints these to the terminal without giving pdfgrep a chance to get its hands on it. You can get rid of them if you pipe the error output to /dev/null (at least in normal unix, no idea about cygwin) like this: pdfgrep foo bar.pdf 2> /dev/null Unfortunately you won't see legitimate errors if you do this. Problems like yours came up a few times already and I'm trying to work around the limitations in poppler. The proper solution would be a fix of poppler-bug 70374 [1], but that can take a while. So, sorry that you have to put up with this annoyance for now, I'll try to find a solution. Regards, HP [1]: https://bugs.freedesktop.org/show_bug.cgi?id=70374 |
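Because redirecting to /dev/null also hides legitimate errors, one alternative is to send stderr to a file that can be inspected afterwards. A minimal sketch using plain shell redirection; `run_logging_stderr` and the log file name are hypothetical.

```shell
# Hypothetical helper: run a command, leave its stdout untouched, and append
# everything it writes to stderr to a log file instead of discarding it.
# $1: log file; remaining arguments: the command to run
run_logging_stderr() {
    logfile=$1
    shift
    "$@" 2>>"$logfile"
}

# Possible usage (assumes pdfgrep is installed; not run here):
#   run_logging_stderr pdfgrep-stderr.log pdfgrep foo bar.pdf
```

The matches still appear on the terminal, while the poppler warnings and any real errors end up together in the log.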
From: John F. <for...@gm...> - 2013-11-05 19:01:01
|
pdfgrep.exe -in '\bcable' *.pdf ----- poppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation 
destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destinationpoppler/error: Bad annotation destination |
From: Hans-Peter D. <hpd...@gm...> - 2013-11-05 00:01:41
|
Hello John, On Mo, Nov 04 2013, John Franklin wrote: > Hello, where would I need to send a bug report? This list is exactly the right place. As the README says: >> To send patches, ask questions, report bugs or anything >> else there is now a new mailing-list: >> >> pdf...@li... > I continue to get poppler/error: Bad annotation destination and the > poppler channel mentioned I should send pdfgrep a bug report. Thanks! Could you provide more details please? How do you get this error? On what PDFs? Cheers, HP |
From: John F. <for...@gm...> - 2013-11-04 19:31:40
|
Hello, where would I need to send a bug report? I continue to get poppler/error: Bad annotation destination and the poppler channel mentioned I should send pdfgrep a bug report. Thanks! John |
From: Hans-Peter D. <hpd...@gm...> - 2013-10-05 11:16:40
|
Hello,

thanks for flying pdfgrep :)

On Fri, Oct 04 2013, Christof Schöch wrote:
> Dear pdfgrep users and developers,
>
> When searching in some files, I get the following type of result:
>
> xxxxxxx@Acer:~/Dropbox/Library/xxxxxx/xxxxxxxx/ABCDE$ pdfgrep -i -n -R -C line author Allison_2011-QuantitativeFormalism-LitLabPamphlet1.pdf
> poppler/error: Invalid Font Weightpoppler/error: Invalid Font
<snip>
> This happens with some pdf files but not with others. Not sure what is
> special about those PDF files. One file it happens with is attached here.
> The poppler developers refuse all responsibility so I'm contacting you.

The poppler library itself prints those messages, and it's kinda awkward to get hold of them as an application developer. It seems that this particular PDF is slightly malformed; poppler complains about an invalid font weight on every page of it.

To get rid of the messages on your terminal, simply redirect the standard error stream to /dev/null, like so:

$ pdfgrep pattern file.pdf 2> /dev/null

I'll see what I can do to make the output less garbled in the case of error messages.

> Since I'm writing anyway, please help me out with another question. What is
> the syntax for multiple files to be searched in a folder (and its
> subfolders)? using "*.pdf" does not work for me, I get a "file not found"
> error message.

*.pdf (without quotes) should search through all pdf files in the current directory. If you want this to recurse into subdirectories, you have to use the -R switch:

$ pdfgrep -R pattern .

Note the dot at the end to tell pdfgrep to start in the current working directory. Grep recently allowed omitting this dot, so pdfgrep will probably do that in the near future, too.

> Thanks for any hints,
> Christof
>
> --
> http://www.christof-schoech.de
> http://dragonfly.hypotheses.org
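The difference between *.pdf and "*.pdf" comes from the shell, not from pdfgrep: an unquoted glob is expanded into file names before the program starts, while a quoted one is passed through literally. A small sketch that shows this with placeholder files in a scratch directory; `count_args` is a hypothetical helper.

```shell
# Hypothetical helper: report how many arguments arrived after shell expansion.
count_args() {
    echo "$#"
}

demo_dir=$(mktemp -d)                      # scratch directory with two placeholder files
touch "$demo_dir/a.pdf" "$demo_dir/b.pdf"
cd "$demo_dir" || exit 1

unquoted=$(count_args *.pdf)               # glob expanded by the shell: two file names
quoted=$(count_args "*.pdf")               # quotes suppress expansion: one literal argument

echo "unquoted: $unquoted, quoted: $quoted"

cd - >/dev/null || exit 1
rm -rf "$demo_dir"
```

In the quoted case pdfgrep would be asked to open a file literally named *.pdf, which matches the "file not found" message reported above.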
From: Hans-Peter D. <hpd...@gm...> - 2013-07-19 08:59:16
|
Hi,

Thank you for your feedback, it's greatly appreciated.

On Sun, Jul 14 2013, Pierdamiano Venti wrote:
> This tool is fantastic and would love to know if could be available also in
> .apk for Android Terminal users?

Unfortunately I have absolutely zero experience with developing for Android. From what I gather, such a package would have to bundle the poppler library and I don't know how easy that is. So I'd encourage everyone who wants to make an Android package, but I probably won't do it myself anytime soon, lacking the knowledge and also the time.

> Even having a GUI version would be fun to use and could gain a broader
> attention.

Do you mean a GUI for Android or a GUI in general? In either case, I think a GUI version of pdfgrep would best be a separate tool, because it wouldn't need to transform the PDF file to plain text, which is what pdfgrep is all about.

Also, since every sane PDF viewer can search through a single PDF, the only use case of a GUI pdfgrep would be searching multiple files. Common file indexing tools like KDE's strigi or GNOME's tracker can already do that, but require a database and an indexing service. Something like KFind for PDFs could be made rather easily, but I'm not certain how many people would use it, since it wouldn't offer that much convenience over the command line version.

A very simple Tcl/Tk wrapper around pdfgrep should also be doable in half an hour, but wouldn't offer anything over the CLI tool apart from clickable PDF URLs and not having to remember the names of options. Let me know if you have need for that. As I said, it's really easy.

And on a funny note: Emacs' "M-x grep" and "M-x find-grep" commands can be used as a simple pdfgrep frontend ;)

> Thanks a lot for the great tool I'm using in Linux.

I'm glad you find it useful.

Regards,
HP