IRC Logs

Log of #ghostscript at irc.freenode.net.

2016/06/30
tor8 Robin_Watts: I'm looking at the code in pdf-clean-file.c that creates subsets of pages12:52.41 
  it looks like it's going to fail whenever any page items are inherited from the page tree (such as MediaBox and Resources)12:53.21 
inkbottle How can I check for "existing embedded annotations" in pdf file (looking for a workaround to https://bugs.kde.org/show_bug.cgi?id=357526)14:47.16 
kens Well you could grep for Annotation14:47.45 
Robin_Watts That's an okular bug, right?14:48.51 
kens Make that 'Annots'14:48.59 
inkbottle Robin_Watts: kens: yes... because of it I cannot annotate the document in okular14:50.21 
kens Not *our* problem though14:50.34 
inkbottle there are 2 examples of files that trigger the bug in the bug page14:50.43 
kens Why not ask the Okular people ?14:50.45 
  Not being funny, but why should we spend any time looking at this ?14:51.06 
inkbottle no, my question is: I would like to try to see if there are such annotations14:51.22 
  and 2, remove them14:51.33 
kens I gave you one answer, you didn't say anything about removing them14:51.47 
  It would be possible to do that with Ghostscript, but you'd have to hack the PDF interpreter so that it didn't pass on annotations. Of course, you'd be getting a different PDF file, but then, that's what you want14:52.39 
inkbottle kens: OK, yes you did, I'll do the grep then and be back14:52.44 
Robin_Watts Neater solution would be to do it by manipulating the PDF file using "mutool run".14:53.33 
  tor8: ^14:54.04 
inkbottle kens: I did the search and it yielded a positive match; and a negative one with a modified version obtained through "pdfjam" (the modified version lost TOC)14:57.44 
  Robin_Watts: ok for mutool run14:58.00 
Robin_Watts inkbottle: A simple grep finding "Annots" is a pretty good indication that a file has annotations. There may be false positives.14:58.29 
  Certainly there will be false negatives however.14:58.38 
kens Hmm which annotations can you have that are not represented by an Annots in the page dictionary ?14:59.11 
Robin_Watts kens: If object compression is on, a grep will fail.14:59.43 
kens Are page objects compressed ? I can never remember15:00.03 
Robin_Watts All objects can be compressed, except for the trailer stuff, AIUI.15:00.23 
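The grep approach discussed above, together with Robin_Watts's caveat about object compression, can be sketched roughly as follows. This is only an illustration, not something posted in the channel: the function name scan_pdf_bytes is hypothetical, and treating the presence of an /ObjStm token as a sign that a raw search may miss compressed page dictionaries is an assumption of this sketch.

```python
import re


def scan_pdf_bytes(data: bytes) -> dict:
    """Grep-style scan of raw PDF bytes (hypothetical helper).

    A hit on /Annots suggests annotations are present, but may be a
    false positive. If the file uses object streams (/ObjStm), page
    dictionaries may be compressed and a raw search can miss them, so a
    miss on /Annots is then not conclusive.
    """
    return {
        "annots_token": re.search(rb"/Annots\b", data) is not None,
        "object_streams": re.search(rb"/ObjStm\b", data) is not None,
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(scan_pdf_bytes(f.read()))
```

If object_streams is true and annots_token is false, decompressing first (for example with "mutool clean -d", as mentioned later in the log) and scanning the result would be the safer check.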
kens If so then you could do it by having Ghostscript do the job for you, in the manner of pdf_info.ps15:00.24 
inkbottle Robin_Watts: yes to "false positive", but the second clue obtained with the same file treated with "pdfjam" increases the "likelihood"15:00.25 
  Reading mutool man page15:01.13 
kens wonders why I can never find pdf_info.ps is it in toolbin ?15:01.29 
  Oh, yes it is15:01.40 
  Well a relatively simple modification to pdf_info.ps would allow it to determine if Annotations are present15:02.18 
  The code already checks for Annots in order to find out which fonts are used, so it would be trivial to add a detection for it15:03.02 
Robin_Watts A mutool run based solution seems better to me. 1) It would preserve more of the original file structure, 2) it requires coding in javascript, not Postscript - I know which one most people will find preferable.15:04.05 
kens I'm simply offering alternatives. And it would be very easy to add the detection to the existing info in pdf_info.ps15:04.42 
inkbottle Robin_Watts: I did "mutool clean -d hello.pdf hello_clean.pdf" (http://ghostscript.com/pipermail/gs-cvs/2016-April/019748.html about adding mutool run documentation)15:06.56 
  Robin_Watts: As usual you won't advise me to remove the Annots by hand...15:07.34 
  ;-)15:07.46 
kens Apparently in that file only page 1 uses Annotations, and it is also the only page which uses transparency15:07.57 
Robin_Watts inkbottle: mutool clean -d will produce a decompressed version. You don't need to do that if you use mutool run.15:08.13 
  You do need to write a javascript script to manipulate the PDF file.15:08.36 
kens will commit the modified pdf_info.ps, may as well have more information available.15:08.38 
Robin_Watts Various examples of such scripts are given in the documentation added in that commit.15:08.55 
inkbottle Robin_Watts: OK...15:09.27 
Robin_Watts The expert here is tor8.15:09.44 
inkbottle ...15:10.36 
kens Ah Ghostscript already has a feature to do this. If you set ShowAnnots to false I believe pdfwrite will produce a PDF with the Annotations stripped out15:10.54 
  So gs -sDEVICE=pdfwrite -dShowAnnots=false -sOutputFile=stripped.pdf in.pdf should do it15:11.17 
  Ghostscript isn't very happy about page 1 of that file anyway, looks like it's been modified and broken in the process15:13.33 
  Of course, a smaller file would be better15:13.49 
  Seems GS doesn't like a lot of pages in that file, there's a bad Form in there. It doesn't affect the output as far as I can tell, but I'm not going to carefully check 1442 pages15:15.09 
  The resulting file has no annotations. So there is a complete answer.15:16.05 
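The Ghostscript command kens gives above could be wrapped in a small script along these lines. This is a minimal sketch rather than anything from the channel: the helper names are hypothetical, and -dBATCH and -dNOPAUSE are additions not present in kens's command, included on the assumption that the script should run without stopping at the interactive prompt.

```python
import subprocess


def strip_annots_command(src: str, dst: str, gs: str = "gs") -> list:
    """Build the argument list for the pdfwrite conversion (hypothetical helper)."""
    return [
        gs,
        "-dBATCH",               # assumption: exit when done (not in the original command)
        "-dNOPAUSE",             # assumption: no prompt between pages (not in the original command)
        "-sDEVICE=pdfwrite",     # as given in the log
        "-dShowAnnots=false",    # as given in the log: do not pass annotations on
        f"-sOutputFile={dst}",
        src,
    ]


def strip_annots(src: str, dst: str) -> None:
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(strip_annots_command(src, dst), check=True)


if __name__ == "__main__":
    import sys

    strip_annots(sys.argv[1], sys.argv[2])
```

As kens notes elsewhere in the log, the output is a newly written PDF rather than an edited copy of the original, so it is worth checking the result before relying on it.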
inkbottle kens: simply astonishing15:16.05 
kens So I had to modify ghostpdl/toolbin/pdf_info.ps, I added:15:17.52 
  dup /Annots pget {15:17.53 
  pop15:17.53 
  ( Page contains Annotations) print15:17.53 
  } if15:17.53 
  At line 115 of that file15:17.53 
inkbottle Robin_Watts: Ken: I will have to dig into mutool sometimes ;-) (Also I'm happy I didn't today because javascript is not my forte)15:18.15 
kens The caveats that Robin mentioned apply, MuPDF would do less interpretation of the PDF file, and the resulting output would be 'more like' the original than the Ghostscript output. As I said, there are some problems with that file (there's a form with unbalanced q and Q operators) and it's always possible that the result will not be acceptable. MuPDF is generally better for 'manipulating' PDF files, but it happens that we already have a canned so15:20.26 
HenryStiles I created an email list for mupdf mobile stuff, if you didn't just get an email then you are not on it, if you want to be on it, let me know15:23.04 
inkbottle kens: Also I'm in the process of learning ps and pdf (the pdf doc was PLRM.pdf), so I'm happy there was an easy solution15:24.46 
  kens: Robin_Watts: thanks a lot15:25.01 
kens You're welcome15:25.06 
inkbottle kens: I'm putting the gs command as a workaround in the aforementioned kde bug, giving source #ghostscript irc channel: I believe it's alright to you15:45.01 
kens Certainly, please do include a warning about the conversion, you can point people to the Ghostscript documentation, particularly the overview in VectorDevices.htm15:45.39 
tor8 inkbottle: if you check out mupdf, the mutool run docs are in docs/mutool/run.html with several examples in docs/mutool/examples/21:39.18 
inkbottle tor8: got it, thanks22:03.04 