Log of #ghostscript at irc.freenode.net.

20170206
tor8 sebras: for the logs. in PDF the 0 byte is whitespace. see table 3.1 white-space characters.11:17.39 
  I suspect Robin_Watts just picked the first best iswhite() function for use in mudraw :)11:18.13 
  the use of octal characters bothers me though... I'd just use decimal if I were to have written it today.11:19.26 
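A minimal sketch of the white-space test tor8 describes, written with decimal byte values as he suggests. The set follows Table 3.1 of the PDF reference (NUL, HT, LF, FF, CR, SP); the name `is_pdf_whitespace` is hypothetical and is not the function used in mudraw.

```python
# White-space characters from Table 3.1 of the PDF reference,
# given as decimal byte values: NUL, HT, LF, FF, CR, SP.
PDF_WHITESPACE = frozenset({0, 9, 10, 12, 13, 32})


def is_pdf_whitespace(byte: int) -> bool:
    """Return True if the byte value counts as white space in PDF syntax."""
    return byte in PDF_WHITESPACE
```

Note that the 0 byte is in the set, while vertical tab (11) is not.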
Robin_Watts tor8: i think I have a couple of commits pending.11:30.03 
tor8 Robin_Watts: fix win32 and -i to invert LGTM, though maybe use capital -I for invert?11:31.09 
  to match the -I invert flag of mudraw11:31.39 
Robin_Watts Good point. Will fix.11:31.47 
  fixed commit, plus 1 more tiny one.11:36.04 
tor8 Robin_Watts: might want to fix '-i' in the commit message too?11:37.31 
Robin_Watts D'Oh.11:37.41 
  Done.11:38.25 
tor8 Robin_Watts: those 3 commits LGTM.11:38.57 
Robin_Watts Ta11:39.12 
tor8 I tried an approach to improve staying on the same "page" when doing text re-layout11:40.01 
  I'm not entirely happy with it though11:40.07 
  I take a temporary 'bookmark' to a location in the text (the first bit of text on a page), layout the document with the new font size, and find the page that contains the same marked location again11:41.06 
  but it's obviously not reversible. changing font size up and then back down to the same, you end up at a different page11:41.32 
Robin_Watts How about... take a temporary bookmark to a location in the text.11:42.05 
tor8 maybe I'm just trying too hard, and there's a simpler approach by just using current_page / page_count?11:42.07 
Robin_Watts Keep that temporary bookmark until you change page.11:42.17 
  That way if you zoom up and down, the bookmark is still in the same place.11:42.39 
tor8 taking the bookmarks is expensive, but keeping it around unless a page change occurs might help that use case yeah11:43.11 
  my other gripe is how they are temporary and that's going to lead users into confusion trying to save the bookmarks in a preference file or something11:43.38 
  at the moment I just use a raw pointer to the fz_html_flow node11:44.09 
Robin_Watts I'm not sure what else you could use, unless you start using counts of html flow entries or something - and that'll be screwed when we change the structure at all.11:45.05 
tor8 yeah. still not thrilled about the API implications though.11:45.35 
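The bookmark idea discussed above can be illustrated with a toy model. This is only a sketch under loud assumptions: the document is a run of numbered flow nodes, a "font size" change is modelled as a change in how many nodes fit on a page, and `layout`, `page_of`, and `Viewer` are hypothetical names unrelated to the real fz_html_flow code.

```python
def layout(node_count, nodes_per_page):
    """Toy layout: split node indices 0..node_count-1 into fixed-size pages."""
    return [list(range(start, min(start + nodes_per_page, node_count)))
            for start in range(0, node_count, nodes_per_page)]


def page_of(pages, node):
    """Return the index of the page that contains the given node."""
    for number, page in enumerate(pages):
        if node in page:
            return number
    raise ValueError("node not found in any page")


class Viewer:
    def __init__(self, node_count, nodes_per_page):
        self.node_count = node_count
        self.pages = layout(node_count, nodes_per_page)
        self.page = 0
        self.bookmark = None  # taken lazily, dropped on page change

    def goto(self, page):
        """An explicit page change invalidates the temporary bookmark."""
        self.page = page
        self.bookmark = None

    def relayout(self, nodes_per_page):
        """Re-layout and return to the page holding the bookmarked node."""
        if self.bookmark is None:
            # First re-layout since the last page change: mark the first
            # node on the current page.
            self.bookmark = self.pages[self.page][0]
        self.pages = layout(self.node_count, nodes_per_page)
        self.page = page_of(self.pages, self.bookmark)
```

In this model, going from 10 to 7 nodes per page and back lands on the original page because the bookmark survives both re-layouts. If the bookmark were retaken after the first re-layout, the round trip would end on a different page, which is the irreversibility tor8 describes.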
acharles Hi.19:57.02 
ray_laptop acharles: hi back21:07.06 
  (1 hr later)21:07.20 
acharles haha21:07.36 
  I had a few questions about ghostscript and the way it handles pdf files.21:07.57 
ray_laptop acharles: go ahead21:08.10 
  I can most likely answer those21:09.02 
  the general answer is "very well" :-)21:09.38 
acharles Is the PostScript interpreter used to process pdf files as well or does pdf have a different code path?21:10.09 
ray_laptop the PS interpreter processes the PDF input, invoking PS operators to actually do stuff (images, text, other graphics)21:13.24 
  the PostScript that does that is in Resource/Init/pdf*.ps21:13.47 
  Note that the "scanner" that processes PDF input is in C -- it's not like PostScript is trying to read the PDF directly (except at the very start to find out if the input is PDF or not)21:15.30 
acharles Ah, so you take advantage of the fact that pdf is a subset of postscript to use postscript functions to process the pdf.21:15.51 
ray_laptop acharles: well, PDF is actually a disjoint set (not a subset), but yes, the syntax is similar enough that the scanner has only a few special "tweaks" for PDF 21:17.23 
  but things like << ... ... >> defining a dictionary and strings being enclosed in (...)21:18.17 
acharles Ah, I didn’t know it was disjoint.21:18.41 
ray_laptop well, at the operator level, PDF has transparency and the concept of "streams" but doesn't have some of the noisome PS operators like file manipulation21:20.01 
acharles Ah, I guess that makes sense.21:21.02 
ray_laptop but our scanner has "PDF_SCAN_RULES" for a couple of exceptions21:23.47 
kens lurking21:23.49 
ray_laptop hi kens. ISTR there is something about names in PDF as well, right ?21:24.24 
kens spaces in names21:24.37 
  and other non-printable characters21:24.44 
  or even printable21:24.52 
  The original point was that the graphics model of PDF matched that of PostScript, so a one-to-one mapping was trivial; it therefore made sense to write a PDF processor in PostScript.21:25.36 
  Since then, well things have changed....21:25.45 
  And many PDF files break the specification, but Acrobat opens them so we have to too. Which makes our handling much, much more complicated than it should be.21:26.42 
ray_laptop ah, it is that ANY character except NUL can be in a name using hex21:26.43 
kens I think a NULL can be in a name too, you just escape it with #21:27.03 
ray_laptop kens: PDF 1.7 spec section 3.2 (p 57) excludes NUL21:27.52 
kens acharles is there a reason for wanting to know all this ? It's probably not useful....21:27.59 
  ray_laptop well, this is all from memory for me, I don't have the spec open in front of me21:28.12 
acharles Yes, there is. :)21:28.16 
ray_laptop kens: I cheated and opened the spec :-)21:28.30 
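A small sketch of the name rule discussed above: in PDF, a character in a name can be written as `#` followed by two hex digits, and, following ray_laptop's reading of the PDF 1.7 spec, NUL is excluded. The function name `decode_pdf_name` is hypothetical, and this is not how the Ghostscript scanner is implemented.

```python
import re

# A '#' followed by exactly two hex digits is an escaped byte.
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(token: bytes) -> bytes:
    """Decode a PDF name token such as b"/A#20B" into its raw bytes."""
    if not token.startswith(b"/"):
        raise ValueError("a PDF name token starts with '/'")

    def unescape(match):
        value = int(match.group(1), 16)
        if value == 0:
            # Assumption taken from the discussion: NUL may not appear
            # in a name, even in escaped form.
            raise ValueError("NUL is not allowed in a PDF name")
        return bytes([value])

    return _HEX_ESCAPE.sub(unescape, token[1:])
```

For example, `decode_pdf_name(b"/A#20B")` yields the name bytes `A B`, with the space that could not be written literally.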
kens It might be easier to explain what your goal is21:28.35 
acharles My goal is ‘secure pdf processing’, but that’s vague. I’m just doing some investigative work and I figured asking here made more sense than reading the code for days on end. :)21:30.20 
kens Well, PDF is pretty secure if you avoid JavaScript21:30.39 
acharles Yes, but postscript is not21:30.57 
kens Though it's also a good idea to prevent PostScript XObjects21:31.01 
ray_laptop kens: I think GS disables PS XObjects by default21:31.38 
kens acharles, but you cannot execute random PostScript in a PDF file using Ghostscript21:31.45 
  Obviously if you send PostScript that's a different matter21:32.03 
ray_laptop acharles: and GS has "SAFER" mode that is supposed to make it more secure21:32.04 
  (for PS or PDF input)21:32.30 
acharles Yeah, I’m assuming -dSAFER is enabled.21:32.44 
ray_laptop and since GS doesn't use JS, there isn't a problem there21:32.57 
acharles How does GS detect pdf vs ps input?21:33.08 
kens Though (as the recent news showed) if you are running a job server it's a good idea to set the job server password to something other than 0 :-)21:33.26 
ray_laptop acharles: using PS code in Resource/Init/pdf_main.ps21:33.30 
kens acharles depends how you invoke it21:33.38 
  ray_laptop you can use pdfrun directly21:33.46 
ray_laptop kens: true, then we don't even try to "detect"21:34.04 
acharles how does pdfrun work?21:34.21 
ray_laptop acharles: basically if you don't use pdfrun and just "run" an input file, it looks at the first 1024 bytes for the PDF header21:34.53 
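As a rough illustration of the detection ray_laptop describes, the sketch below looks for the `%PDF-` header within the first 1024 bytes. The real check lives in PostScript in Resource/Init/pdf_main.ps and may differ in detail; `looks_like_pdf` is a hypothetical helper.

```python
def looks_like_pdf(data: bytes, window: int = 1024) -> bool:
    """Return True if the PDF header marker appears within the first
    `window` bytes of the input."""
    return b"%PDF-" in data[:window]
```

Searching a window instead of testing only the first bytes tolerates leading junk before the header, as long as the header still starts within the window.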
kens It's an internal Ghostscript thing, you give it a filename and it runs it as a PDF file21:34.54 
  Hmm, actually that may not be entirely correct.21:35.32 
  Probably best not to rely on memory at this time of night21:35.43 
ray_laptop kens: actually, I think you have to send it a PS file21:35.50 
  filetype, not a string that contains the filename21:36.07 
kens ah runpdfbegin maybe21:36.31 
  Oops no there it is, runpdf, which calls runpdfbegin :-)21:36.57 
ray_laptop acharles: so you need to make a PDF file type, which can be done with: (filename.pdf) (r) file runpdf21:37.14 
  kens: right -- they both expect a filetype21:37.37 
kens Indeed21:37.42 
  It would be trivial to define a function to take a filename, but why bother....21:38.07 
ray_laptop kens: agreed21:38.19 
acharles Ah, that’s not exposed as a command line option?21:38.23 
kens You can use -c and -f21:38.33 
  to send PostScript directly21:38.40 
  so -c "(filename.pdf) (r) runpdf" -f21:39.01 
ray_laptop kens: the -f doesn't really do anything other than get out of -c mode, so is rather useless if -dBATCH is given21:40.15 
kens Note that the pdf*.ps files constitute a rather large PostScript program, one of the things it will do is attempt to validate the PDF file. So if you send it a PostScript file it **won't** run it, it will just complain it's not a valid PDF file21:40.25 
acharles What does the (r) parameter mean? I mean, it pushes r on the stack.21:40.35 
kens opens it for reading, like the "r" mode in C21:40.45 
ray_laptop acharles: you can also do: echo (filename.pdf (r) file runpdf | gs ... -21:40.47 
kens If you wanted a writable file you would use (w)21:41.11 
acharles Ah21:41.36 
ray_laptop acharles: for that refer to the PLRM21:41.38 
acharles Ah, file is the PS operator for opening a file21:42.16 
  that makes sense.21:42.26 
kens yes exactly. It will leave a file object on the stack, which is then consumed by the runpdf executable function21:42.41 
ray_laptop darn, I forgot the ) after the filename.pdf and kens forgot the "file" operator, but acharles, I assume you get the idea21:42.49 
kens Hey, it's late here :)21:43.01 
acharles I do21:43.45 
kens thinks I'm doing well to be making any sense at all.... 21:44.16 
acharles I only first read the PLRM on Friday and I’m not used to stack based languages. But I think I’m learning fast. :P21:44.53 
kens If you only want to process PDF files, why not use MuPDF ?21:45.29 
ray_laptop acharles: so if you want to use "runpdf", if the input file is NOT PDF, it will confuse the pdf_main.ps code that is trying to open it as a PDF and won't expose you to accidentally executing PS 21:45.30 
kens I wouldn't say it will confuse it exactly, it will reject it as an invalid and unfixable PDF file21:46.16 
ray_laptop e.g., gs -c "(examples/colorcir.ps) (r) file runpdf quit"21:48.27 
  gives: Error: /syntaxerror in pdfopen21:48.43 
  acharles: well, you don't have to read ALL 912 pages -- just the first 700 or so ;-)21:49.43 
acharles Does MuPDF offer pdf compression?21:49.44 
kens I still think that if you don't need PostScript (or PCL) input, MuPDF is probably more appropriate.21:49.45 
ray_laptop acharles: yes21:49.50 
kens acharles ah, you want to *modify* the PDF files ?21:49.59 
ray_laptop acharles: but pdf output from mupdf is rather limited21:50.39 
kens was assuming rendering the PDF files was the goal 21:50.39 
acharles read an input file and create an output file that contains the same pdf, but linearized and compressed (perhaps with lower quality)21:51.03 
kens Currently Ghostscript has more options for doing that.21:51.21 
ray_laptop acharles: I don't know which (if any) mupdf can do,21:51.41 
acharles And the runpdf command gives me an error about invalid file access, which I assume is due to using -dSAFER21:51.54 
kens MuPDF can compress and linearize the file (though linearization is pointless) but I don't think it can currently do things like downsample images or subset fonts21:52.02 
  acharles yes, it will be.21:52.15 
  You only really need to worry about -dSAFER if you are using PostScript, PDF has no file operators21:52.56 
  Umm actually that's not totally true.21:53.09 
  It can link to other files.21:53.17 
  other PDF files I should say21:53.24 
  Anyway, I have to be off. Got to go and feed the cat21:54.16 
  Goodnight all21:54.21 
acharles Night21:54.30 
  Thanks21:54.33 
ray_laptop acharles: I am in PST, so I'll be around for a while yet21:55.08 
acharles I’m also PST21:55.59 
ray_laptop acharles: -dSAFER will limit the files you can read and write to.21:56.01 
acharles Can I use -dSAFER and read from the pdf input file?21:56.48 
ray_laptop if you use -dDELAYSAFER and open the input file, such as with (filename.pdf) (r) file then you can use .setsafe to go into SAFER mode before running the file (with "run" or "runpdf")21:57.31 
  the filenames named on the command line as arguments are automatically allowed in SAFER mode21:58.14 
acharles And we use -dColorImageResolution and -dSubsetFonts.21:58.14 
  So, I guess MuPDF isn’t an option21:58.25 
ray_laptop acharles: yes, those are on GS options (pdfwrite options)21:58.35 
  acharles: the use of .setsafe is in doc/Language.htm that also discusses PermitFileReading PermitFileWriting, etc.21:59.38 
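For the pdfwrite options acharles mentions, a command line could be assembled as in the sketch below. The helper name `pdfwrite_args` and the default of 150 dpi are assumptions for illustration. `-dDownsampleColorImages=true` is added on the assumption that `-dColorImageResolution` has no effect unless downsampling is enabled. Input and output files named on the command line are permitted in SAFER mode, as noted above.

```python
def pdfwrite_args(src, dst, color_dpi=150, subset_fonts=True):
    """Build an argument list that rewrites src to dst through pdfwrite."""
    return [
        "gs", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        # Assumption: resolution only matters when downsampling is on.
        "-dDownsampleColorImages=true",
        f"-dColorImageResolution={color_dpi}",
        "-dSubsetFonts=" + ("true" if subset_fonts else "false"),
        "-sOutputFile=" + dst,
        src,
    ]
```

The list can be passed to `subprocess.run(args, check=True)`; using a list avoids any shell quoting of the file names.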
acharles Should SAFER prevent using the status command on files when processing PostScript files? (unrelated to pdf processing)22:00.40 
  so, I’m running `gs -dSAFER -dDELAYSAFER -c "(file.pdf) (r) file .setsafe runpdf" -f`22:04.28 
ray_laptop acharles: that is a question22:04.29 
acharles It seems to work.22:04.32 
  And it gives me an error if I give it a PS file.22:04.52 
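The invocation acharles arrives at can be built programmatically. The sketch below only assembles the argument list from the command quoted above; `ps_string` and `runpdf_args` are hypothetical helpers. Backslashes and parentheses in the file name are escaped so that it remains a single PostScript string.

```python
def ps_string(text: str) -> str:
    """Quote text as a PostScript string literal, escaping backslashes
    and parentheses."""
    escaped = (text.replace("\\", "\\\\")
                   .replace("(", "\\(")
                   .replace(")", "\\)"))
    return "(" + escaped + ")"


def runpdf_args(pdf_path: str) -> list:
    """Open the file before entering SAFER mode, then run it as PDF only."""
    program = ps_string(pdf_path) + " (r) file .setsafe runpdf"
    return ["gs", "-dSAFER", "-dDELAYSAFER", "-c", program, "-f"]
```

Because the file object is created before `.setsafe`, the input is readable even though SAFER restrictions apply while it is processed, and `runpdf` rejects input that is not PDF instead of executing it as PostScript.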
ray_laptop hmm... I need to look into SAFER mode. It isn't doing what I expect (at least on Windows). I wonder if it is bitrotted22:07.18 
  this is *NOT* good22:10.39 
can-of-bees having a hard time googling this -- is there a way for ghostscript to return the version of pdf? e.g. can i feed gs a pdf and have it tell me if the pdf is pdf/a?22:12.30 
  thanks in advance22:12.40 
ray_laptop can-of-bees: not currently. It is contained in the XML Metadata, but our toolbin/pdf_info.ps doesn't currently dump any of the Metadata23:08.07 
  it is possible to write PS (or extend pdf_info.ps) to allow you to dump all or part of the Metadata23:08.44 
  The Metadata object is in the Catalog object (the document Root object from the trailer)23:10.43 
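Outside Ghostscript, a crude way to see whether a file claims PDF/A is to scan its XMP metadata for the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below is a heuristic under stated assumptions: it only works when the XMP packet is stored uncompressed and unencrypted, it reports what the file claims and does not validate conformance, and `pdfa_claim` is a hypothetical name.

```python
import re

# The properties may appear as XML attributes (pdfaid:part="1") or as
# elements (<pdfaid:part>1</pdfaid:part>); \x22 and \x27 are the quotes.
_PART = re.compile(rb"pdfaid:part(?:\s*=\s*[\x22\x27]|\s*>)\s*(\d+)")
_CONF = re.compile(rb"pdfaid:conformance(?:\s*=\s*[\x22\x27]|\s*>)\s*([A-Za-z])")


def pdfa_claim(data: bytes):
    """Return a string such as 'PDF/A-1b' if the raw bytes contain a
    PDF/A identification, otherwise None."""
    part = _PART.search(data)
    if part is None:
        return None
    conf = _CONF.search(data)
    level = conf.group(1).decode("ascii").lower() if conf else ""
    return "PDF/A-" + part.group(1).decode("ascii") + level
```

A proper solution would locate the Metadata stream through the Catalog as described above, for example by extending toolbin/pdf_info.ps.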
acharles ray_laptop: Did you determine if SAFER is working as intended?23:48.01 
ray_laptop acharles: haven't had a chance to look into it yet23:49.30 
  sorry23:49.34 