| 20160130 |
JackDD | Hello. A simple question: is it possible with Ghostscript to search for a word in a PDF? | 09:26.03 |
kens | Ghostscript has no interactive features, so no. However you can extract the text | 09:26.23 |
JackDD | My goal is to extract pages if a word matches. So it isn't possible? Do you have any alternative? Thanks | 09:27.01 |
kens | Did you ask this same question on Stack Overflow? | 09:27.17 |
JackDD | No | 09:27.24 |
kens | Well I answered the question there :) | 09:27.42 |
JackDD | Oh, you have a link? | 09:28.00 |
kens | You cannot 'extract' pages using Ghostscript anyway. You can make a new PDF file which 'looks the same' as the original, but it's not created by extracting the original page | 09:28.24 |
| Creating the new PDF file is complex. The easy way to do this would be to use text extraction to get all the text out, sorted into pages, then grep for the text you want in those files. Finally, take the page | 09:29.08 |
| number and use pdfwrite to make a new PDF of that page. | 09:29.25 |
JackDD | But I can delete pages, right? | 09:29.29 |
kens | Also no. | 09:29.35 |
| Ghostscript *never* works by 'manipulating' the input | 09:29.54 |
| If you use the pdfwrite device, the output is totally new and bears no relation to the input other than the visual result when rendered | 09:30.21 |
JackDD | I can use -dFirstPage=m -dLastPage=n | 09:30.47 |
kens | Yes, and? The output of that is a new PDF file where the pages *look* the same as the original. | 09:31.14 |
| The actual PDF content streams will not be the same | 09:31.24 |
JackDD | So I think I need to use something different than Ghostscript, but what? | 09:31.50 |
kens | I have no idea, fortunately that's not *my* problem :-D | 09:32.13 |
| Stack Overflow link: | 09:35.07 |
| http://stackoverflow.com/questions/35009388/ghostscript-extract-pages-containing-a-text-string | 09:35.07 |
JackDD | I'm searching for a library that can do that. | 09:39.59 |
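
The workflow kens describes (extract the text per page, search it, then run pdfwrite on the matching page numbers) could be sketched roughly as below. This is only an illustration under several assumptions, not something taken from the log or from the Stack Overflow answer: the helper names (`txtwrite_cmd`, `pdfwrite_cmd`, `pages_matching`, `extract_matching_pages`) are hypothetical, the Ghostscript executable is assumed to be called `gs`, and the txtwrite device is assumed to write one text file per page, numbered from 1, when the output name contains `%d`. As kens points out, each resulting PDF is a newly generated file that only looks like the original page.

```python
import re
import subprocess
import tempfile
from pathlib import Path


def txtwrite_cmd(pdf, out_pattern, gs="gs"):
    """Build a Ghostscript command that extracts text, one file per page."""
    # -o sets the output file (and implies -dBATCH -dNOPAUSE);
    # a %d in the pattern is replaced by the page number.
    return [gs, "-q", "-sDEVICE=txtwrite", "-o", out_pattern, pdf]


def pdfwrite_cmd(pdf, page, out_pdf, gs="gs"):
    """Build a Ghostscript command that re-creates a single page as a new PDF."""
    return [gs, "-q", "-sDEVICE=pdfwrite",
            f"-dFirstPage={page}", f"-dLastPage={page}",
            "-o", out_pdf, pdf]


def pages_matching(page_texts, word):
    """Return the 1-based numbers of pages whose text contains `word`."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    return [n for n, text in enumerate(page_texts, start=1)
            if pattern.search(text)]


def extract_matching_pages(pdf, word, out_dir, gs="gs"):
    """Write one new PDF per matching page into out_dir; return the page numbers."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        # Zero-padded names keep the files in page order when sorted.
        pattern = str(Path(tmp) / "page-%05d.txt")
        subprocess.run(txtwrite_cmd(pdf, pattern, gs), check=True)
        files = sorted(Path(tmp).glob("page-*.txt"))
        texts = [f.read_text(errors="replace") for f in files]
    hits = pages_matching(texts, word)
    for page in hits:
        target = str(out_dir / f"page-{page}.pdf")
        subprocess.run(pdfwrite_cmd(pdf, page, target, gs), check=True)
    return hits
```

Calling `extract_matching_pages("input.pdf", "invoice", "out")` would then leave files such as `out/page-3.pdf` for each page that matched. The match is a plain case-insensitive substring search, so a word that is hyphenated across lines or extracted with unusual spacing may be missed.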