IRC Logs

Log of #ghostscript at irc.freenode.net.

2014/04/17
ray_laptop I don't know about the cust 532 code, but comparing 8.71 to 9.14, the "Tf" PDF operator uses 1,150,052 interpreter steps with 9.14 compared to 60,297 with 8.71 !!!01:11.22 
  thing is, that's only once, but it IS once per page01:12.32 
mvrhel_laptop ray_laptop: wow02:22.06 
aditsu hi, I tried to convert an xps document to pdf using gxps, and some words are overlapping; mupdf renders the xps correctly though; then I tried to convert the document to pdf using mudraw, but it failed04:33.40 
  how can I convert it correctly?04:35.10 
  huh, it seems that I can convert to svg.. then I could convert from svg to pdf page by page and join them... but there must be a better way04:36.46 
kens aditsu without seeing your XPS file there's no way we can comment on your problem. If you think you've found a bug please report it, but be aware that XPS -> PDF doesn't get a huge priority.07:08.15 
aditsu kens: the xps file is somewhat confidential, I'll check if I can reproduce the problem another way; also, mudraw's error was "pdf device supports only base 14 fonts currently", does that help?07:12.02 
  the file seems to use some Chinese fonts07:12.48 
  warning: not building glyph bbox table for font 'PMingLiU' with 34046 glyphs07:13.00 
kens Not really, it just means that the XPS file contains a font outside of the base 14 set, the current PDF output in MuPDF can't deal with embedded fonts I think07:13.07 
aditsu ok07:13.23 
kens A Chinese font would definitely not work with that code07:13.44 
  chrisl got a weird one overnight, not sure what to do about it, got a moment to chat ?07:25.19 
chrisl kens: sure, give me a minute to finish an email07:26.23 
kens No rush, I'm still poking it gingerly with a pointy stick07:26.45 
chrisl kens: Okay, mail sent......07:29.55 
kens OK bug is 695167 if you grab the simplified file from there07:30.13 
  And I'll babble while you do.07:30.24 
  The file has a type 11 (TT outlines) CIDFont TKJEFD+ArialMT which has a single descendant also called TKJEFD+ArialMT (now there's a bad plan to start with)07:31.36 
  The descendant has a CIDToGIDMap (object 14) which has a length of 0.....07:32.02 
  buildfont11 throws a fit over that07:32.25 
  Or at least, I *think* it's that07:32.35 
  The actual code line is:07:32.53 
  code = font_string_array_param(imemory, op, "CIDMap", &rcidmap);07:32.53 
  line 394 in zfcid.c07:33.04 
  I'm assuming that CIDToGIDMap gets munged into CIDMap somewhere along the way ?07:33.30 
  My first thought was to ignore the error, but the font is actually used in the course of the job (goodness knows what actually gets printed by Acrobat, I suspect nothing, or a notdef)07:34.29 
chrisl Erm, I can't remember,,,,07:34.42 
  BTW, zfcid.c only seems to have about 90 lines in it07:35.14 
kens The CIDToGIDMap is probably buried somewhere in the PDF interpreter07:35.16 
  Sorry, zfcid1.c07:35.30 
  Hmm, OK if I ignore the 'error' then the job renders07:36.21 
  And I do indeed get a TrueType /.notdef (hollow square)07:36.38 
chrisl That's not surprising07:37.00 
kens Well, I thought we might get an error further on when it tried to access the CIDMap07:37.19 
  What's the rune for not rendering TT notdef ?07:37.44 
  ah got it07:38.11 
chrisl I can never remember the capitalisation..... -dRenderTTNotdef, or -dRENDERTTNOTDEF07:38.44 
kens all caps07:38.48 
  Hmm, still rendering a notdef07:39.28 
  Oh, so does Acrobat :-)07:39.41 
chrisl So that goes some way to confirming my heuristic for that :-)07:40.02 
kens :-)07:40.07 
chrisl So, question is, are we rendering the "same" notdef....?07:41.00 
kens Well I can cut the file down further to see.07:41.20 
  I'll get started on that. Thanks chrisl, I'll have a patch for you to comment on later.07:42.00 
chrisl I might be more informative to run the entire customer file07:42.01 
  s/I/It07:42.11 
kens I'll do that too, but let me cut it down to just the broken font.07:42.23 
  BTW the entire customer file is *huge*07:42.40 
  34 inches by 2207:42.49 
chrisl They always are from that customer07:43.13 
kens Yes, and the file *claims* to be produced by Acrobat 10.1.907:43.33 
chrisl Hmm, and probably hacked around by something else, like the last one they sent...07:44.03 
kens I'd have to guess so, yes, I don't believe Acrobat would create a file with a font with an empty CIDToGIDMap07:44.29 
  and rendering a notdef07:44.35 
chrisl I just wonder if an empty CIDToGIDMap should be replaced with an explicit identity map, or just ignored07:45.11 
kens I don't think we can tell from this file07:45.32 
  The only glyph used in the font is FFFF which looks quite likely to be a notdef anyway07:45.50 
chrisl Well, in that case, ignoring the error seems a sane way to go07:46.28 
kens I think so, I'm going to test the returned array and see if its length is 0, if it is I'll ignore the error code and continue07:46.54 
  If we ever get a file like this which renders a real glyph we may need to revisit it07:47.11 
  Surprisingly I can't see any actual sign that this file has been meddled with after production07:47.53 
chrisl There wasn't much evidence of it in the last one either, except for *very* un-Acrobat like constructs in the contents07:53.33 
kens Yes, but in this case there's no bonkers stuff in there, it's very clean07:53.52 
chrisl An invalid CID font is pretty bonkers......07:54.14 
kens I guess it depends if Acrobat thinks a 0 length map is 'legal'07:54.34 
  It definitely seems to be the one glyph in this font that's causing the notdef07:56.00 
  I've removed all the other content and it's still there07:56.10 
  And the original file now runs as well07:56.44 
chrisl Oh, it can be a name - I would *bet* Acrobat is replacing the zero length stream with /Identity07:57.35 
kens That's plausible07:57.53 
chrisl Actually, the entry is optional, so what does the PDF interpreter do when it's missing? That would seem a reasonable first pass07:59.17 
kens It's optional ? Hmm07:59.38 
chrisl According to table 5.14 in the 1.7 PDFRM08:00.14 
kens I'm just removing it now08:00.26 
  OK it biold down to doing *exactly* the same as my pach :-)08:01.21 
chrisl "Default value: Identity."08:01.34 
kens if you can understand that mangled sentence08:01.36 
  I guess I could try and fix it in the PDF interpreter instead....08:02.26 
  I could drop zero length CIDToGIDMap08:02.57 
chrisl My worry is ignoring it in the C code might cause problems with Postscript Type 11 fonts08:03.17 
kens I think I'd be happier with that actually, as it means that we will throw an error yeah what you said08:03.27 
chrisl It should be easy - /processCIDToGIDMap08:03.45 
kens I'm looking at it08:03.53 
chrisl Although, once again, why the hell would you implement that in Postscript.....??08:04.37 
kens "Because we can" :-(08:04.50 
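The outcome of the exchange above (a zero-length CIDToGIDMap handled like a missing one, which table 5.14 of the PDF 1.7 reference defaults to Identity) can be sketched roughly as follows. This is a small Python model for illustration only, not the PostScript in /processCIDToGIDMap. The function name `resolve_cid_to_gid_map`, the plain-dict font and the byte string standing in for the stream are all assumptions, as is returning GID 0 for a CID past the end of the map.

```python
from typing import Callable, Mapping, Union


def resolve_cid_to_gid_map(
    font: Mapping[str, Union[str, bytes]]
) -> Callable[[int], int]:
    """Return a CID -> GID function for a (hypothetical) font dictionary.

    `font["CIDToGIDMap"]` is either the name "Identity" or the raw bytes
    of the map stream (two bytes per CID, high-order byte first).
    """
    entry = font.get("CIDToGIDMap", "Identity")  # missing entry: Identity

    # Treat an empty stream the same way as a missing entry.
    if isinstance(entry, bytes) and len(entry) == 0:
        entry = "Identity"

    if isinstance(entry, str):
        if entry != "Identity":
            raise ValueError(f"unsupported CIDToGIDMap name: {entry!r}")
        return lambda cid: cid

    def lookup(cid: int) -> int:
        offset = 2 * cid
        if offset + 2 > len(entry):
            return 0  # assumption: out-of-range CIDs map to GID 0 (.notdef)
        return int.from_bytes(entry[offset:offset + 2], "big")

    return lookup
```

With this model the single glyph FFFF in the problem font would map to GID 0xFFFF under Identity, which the font almost certainly lacks, so a notdef is the expected result either way.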
chrisl kens: How would you feel about having /resolveR store the object number in any dictionaries it retrieves?08:15.04 
kens Sounds reasonable, do you need it for something ?08:15.19 
  Have to give the key a reasonably unique name of course08:15.31 
chrisl This stuff for customer 532 - if we want to cache (even non-embedded) fonts between pages, I need a way to validate what we retrieve from the cache08:16.15 
kens OK might be useful in future for identifying different fonts with the same name also, I'm not sure I remember why we want that08:16.57 
  OK I think I have a fix in the PDF interpreter, time for a cluster run08:18.25 
chrisl We have at least one bug where the file uses, IIRC, two different font subsets with the same name on the same page, and we get that wrong08:18.41 
kens That sounds familiar yes08:18.53 
chrisl Also, if we can have something other than the font name to check, we could cache even embedded fonts between pages08:19.17 
kens Which would be nice I guess08:19.31 
  Coffee, back in a minute08:19.42 
  Hmm, the cluster 'termination', ie the time taken to exit after a test, seems faster since marcos reworked it.08:35.54 
chrisl As in the job timeout?08:37.59 
kens Yes, it used to 'stick' for ages after the jobs were 100% completed and all nodes idle08:38.20 
chrisl It's probably not waiting for the one minute prompt now08:39.20 
kens Well, not sure what the difference is, but I'm happy to have it :-)08:39.38 
Robin_Watts spanners: You got a mo to answer some silly picsel font questions? (please feel free to say no if you're busy/disinclined etc)09:52.34 
henrys anybody know about needing an agreement with "EMC" to use RSA? Sounds crazy.12:21.15 
kens We used RSA before at 5D I think12:21.34 
  don't remember an agreement with anyone else. Unless RSA is owned by EMC now ?12:21.50 
Robin_Watts RSA was purchased by EMC in 2006.12:24.33 
  The RSA encryption method itself was patented, but that expired in 2000.12:26.19 
henrys but openssl uses RSA and surely they haven't needed to sign an agreement with emc12:26.28 
Robin_Watts "RSA Security", the company, have produced other products.12:26.58 
  To use those you presumably need an agreement with EMC.12:27.15 
kens eg BSafe12:27.24 
Robin_Watts but to use the original RSA algorithm, why would you need an agreement?12:27.34 
  henrys: What is leading you to think that use of the RSA algorithm requires a license?12:28.02 
henrys yeah that sounds right, we have a customer that needed to sign with EMC for their Adobe RSA stuff and now they want to know if they need to do the same for MuPDF's usage of RSA12:28.44 
kens I would say no then12:28.59 
  I'd bet Adobe uses BSafe or something similar12:30.39 
tor8 henrys: hate to nag, but it's been 3 weeks since we got the domain... when are we going to get access to manage it?12:42.19 
henrys tor8: I'll poke him again12:42.55 
tor8 henrys: whois lists miles as the owner, and the registrar as enom.com12:43.22 
  so I'm hoping the login details to enom.com are sitting around in someone's inbox.12:44.11 
henrys tor8: last time I pinged him - April 11 he said he "missed the auto-confirm" and it timed out - so he wrote back and hadn't heard anything.12:46.06 
  tor8: I sent a reminder again.12:53.41 
paulgardiner_lap tor8: does MuPDF use rsa other than for signature support?12:57.44 
tor8 paulgardiner_lap: no, all we use openssl for is what you did.13:10.13 
  we still have our own md5 and rc4 and aes functions for encrypted pdf documents.13:10.31 
  Robin_Watts: 692708 is a bummer... any suggestions for how to fix it?13:11.10 
Robin_Watts looking13:11.16 
  Presumably we are hitting errors in the parsing and ignoring them?13:12.24 
paulgardiner_lap henrys: did the customer just notice openSSL amongst our thirdparty libraries or do they need signature support? If the former, another option may be to build without it (although we have yet to make that trivial in the make files).13:12.55 
Robin_Watts IIRC the cookie allows us to set whether we ignore errors or not.13:12.56 
  and there is a count of the errors we've hit in the cookie.13:13.12 
  Possibly there is a max_errors as well where we can bail out if we hit that number?13:13.30 
tor8 Robin_Watts: looks like we're spending all our time in lex_white13:14.12 
Robin_Watts OK, so the cookie has "errors", but not "errors_max"13:14.49 
tor8 it might be necessary to add the compression bomb detection to the base stream layer13:14.49 
  (gdb) p *csi->cookie13:15.14 
  $2 = {abort = 0, progress = 1, progress_max = -1, errors = 0, incomplete_ok = 1, incomplete = 0}13:15.14 
  counting errors won't help with this file13:15.33 
Robin_Watts So, this is a file with a deliberately crafted compression bomb in it?13:16.15 
tor8 yup13:16.55 
Robin_Watts Can we improve the speed of lex_white so it doesn't matter?13:17.20 
tor8 I'm not super concerned, but it would be nice if we could fail nicely13:17.22 
  I doubt it, it's the lzw decompression underlying it that's the culprit13:17.36 
Robin_Watts So when you say the time is in lex_white, you mean it's in the lzw decompression called from lex_white ?13:18.30 
tor8 yeah. lex_white just indicates that it's reading a lot of whitespace13:18.46 
Robin_Watts Can we improve the speed of the lzw decompression? :)13:19.06 
tor8 considering that it at current takes a minute or two to decode, I doubt we can make it that much faster :)13:19.33 
  but at least it completes now13:19.42 
  we could close as WONTFIX, compression bombs just make us hang for a long time13:20.21 
Robin_Watts That would be my temptation.13:21.02 
  cos someone could construct a real file with lots of whitespace and then complain when we don't read it.13:21.23 
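The "compression bomb detection in the base stream layer" idea from the exchange above amounts to capping how much output a single compressed stream may produce. A minimal sketch, assuming Python's `zlib` (deflate) as a stand-in for the LZW filter under discussion; `bounded_inflate`, `DecompressionLimitExceeded` and the limit values are hypothetical names and numbers, not MuPDF API.

```python
import zlib


class DecompressionLimitExceeded(Exception):
    """Raised when a stream expands beyond the configured output cap."""


def bounded_inflate(data: bytes, max_output: int, chunk: int = 64 * 1024) -> bytes:
    """Inflate `data`, refusing to produce more than `max_output` bytes."""
    inflater = zlib.decompressobj()
    out = bytearray()
    pending = data
    while pending and not inflater.eof:
        # Ask for at most `chunk` bytes so a bomb cannot allocate everything
        # in one call; input not yet consumed stays in unconsumed_tail.
        out += inflater.decompress(pending, chunk)
        if len(out) > max_output:
            raise DecompressionLimitExceeded(
                f"stream expanded past {max_output} bytes")
        pending = inflater.unconsumed_tail
    out += inflater.flush()
    if len(out) > max_output:
        raise DecompressionLimitExceeded(
            f"stream expanded past {max_output} bytes")
    return bytes(out)
```

The drawback Robin_Watts raises still applies: any fixed cap will also reject a legitimate file that really does contain that much whitespace, so the limit would have to be generous or configurable.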
tor8 Robin_Watts: done.14:23.14 
ray_laptop tor8: henrys: That's the same thing Miles told me, but it didn't make sense -- the thing that was no longer valid was the verification of email address. Miles did it (once) so it went away. The name is now registered to Miles14:41.00 
  tor8: henrys: I specifically told Miles that what we needed was the login for enom.com so we could modify the stuff there (or transfer to somewhere else)14:41.54 
  tor8: henrys: He said he was going to call them.14:42.27 
henrys ray_laptop: well I sent him another message. Maybe he can hand it off to one of us, I know he is busy with other stuff.14:43.30 
ray_laptop henrys: yes, once we get the login info. But in order to give that to Miles, they probably want some personal ID (such as the credit card info he used) or something. Otherwise, I could just call and say I'm Miles.14:45.47 
ray_laptop didn't try that yet14:45.58 
henrys ray_laptop: well he should come in soon and see my mail and we'll go from there.14:47.18 
  paulgardiner_lap: was that mupdf bug from raed ever assigned I can't find the damn thing now.14:58.04 
  paulgardiner_lap: nvm I found it.15:02.48 
ray_laptop lots of folks showing up here recently. I wonder what's so interesting to most of them ? Not that I mind working with an audience :-)15:10.41 
chrisl ray_laptop: the idea of making fonts persist between PDF pages is looking like a *major* project - everything in the font handling assumes local VM :-(15:15.00 
ray_laptop chrisl: Thanks for looking. Are there any hacks that we can use for built-in fonts that we can use to avoid purging the cache for those (and find them on subsequent pages) ?15:21.17 
chrisl ray_laptop: yes, but that won't help with this job, as it doesn't use a "built-in" font, as such....15:22.07 
ray_laptop chrisl: we don't care about some of the font that may have been modified (such as Widths), just the bitmap as it has been rendered at a particular size15:22.28 
chrisl ray_laptop: the problem is, this isn't using a base-14 font15:23.18 
ray_laptop chrisl: yes, this _does_ use built-in font (or at least the font lookup machinery has selected a built-in substitute for TimesNewRomanPSMT)15:23.22 
chrisl TimesNewRomanPSMT is *not* a base 14 font15:23.47 
ray_laptop chrisl: i.e., it is NOT a font embedded in the PDF. And we map it to a base-14 equivalent15:24.18 
chrisl It's still not using a base-14 font directly, we are substituting a base-14 font for the requested font15:25.08 
ray_laptop chrisl: NimbusRomNo9L-Regu with regular gs, TimesNewRoman in the UFST case15:25.36 
chrisl ray_laptop: the job is using TimesNewRomanPSMT15:25.55 
ray_laptop chrisl: it is _using_ a built-in15:25.57 
chrisl TimesNewRomanPSMT is **NOT** a built-in font!!15:26.14 
ray_laptop chrisl: the job requested TimesNewRomanPSMT15:26.16 
  chrisl: but after substitution, as far as the font machinery knows, we are using the UFST TimesNewRoman (or NimbusRomNo9L-Regu) -- I'm not sure it even knows what the requested font was15:27.17 
Robin_Watts ray_laptop: That may be true for the graphics engine. I don't know that it's true for the PDF interpreter ?15:27.57 
chrisl So we get a substituted font. When we substitute a font, we blow away the UID for the "base" font we started with - without the UID we have no way to know that the next font with that name is the *same* font with that name, so we purge its entries from the cache15:28.11 
ray_laptop At the PS FontDir level, yes, it's been put there as TimesNewRomanPSMT, but not down in the graphics lib15:28.12 
  chrisl: so the font cache is based on the UID ?15:28.57 
  so all we have to do is keep that somehow ?15:29.15 
chrisl ray_laptop: persistence in the font cache is partially based on the UID, yes15:29.16 
  We can't keep it once we've manipulated the font15:29.34 
  We can't synthesize a small caps font, and keep the UID of the base font we started with15:30.13 
  For example15:30.23 
ray_laptop chrisl: we could keep the original UID under a different key / struct element15:30.30 
chrisl ray_laptop: that doesn't help us, we'd still risk false positives15:31.09 
ray_laptop chrisl: we don't ever synthesize an all caps or small caps font that I know of. 15:31.16 
chrisl ray_laptop: oh yes we do......15:31.28 
ray_laptop chrisl: really -- when/where does that happen ?15:31.59 
chrisl ray_laptop: pdf_font.ps line ~90015:32.34 
ray_laptop I vaguely recall logic to create fake italic using skee15:32.38 
  s/skee/skew/15:32.44 
kens It's a common enough trick, the font is always ugly because it's not italic15:33.51 
chrisl ray_laptop: this is part of the problem - these bits of font synthesis code happen in several places, and each place makes one or more copies of the font dictionary - working out when and where the UID became invalid could be a nightmare.....15:34.40 
ray_laptop chrisl: what's a key I can search for -- I don't see it near line 90015:34.51 
kens Not to mention potentially breaking pdfwrite which itself makes multiple copies of fonts and merges them (sometimes)15:35.11 
ray_laptop (that's readtype1dict in my code)15:35.13 
chrisl /Flags oget 16#2000015:35.13 
ray_laptop kens: I am looking for a hack for cust 532 -- pdfwrite is NOT an issue15:35.38 
kens ray_laptop : then it becomes something special we have to maintain :-(15:35.55 
chrisl I suppose if we just change the font name, we don't need to zap the UID.... I wonder if we copy the dictionary anywhere else relevant15:40.59 
  ray_laptop: one thing did occur to me when you pointed out how many more iterations through the interpreter loop Tf makes now......15:41.33 
ray_laptop waits anxiously ...15:41.53 
chrisl in pdf_font.ps there is code which compares the average width of glyphs in the font with the average width in the Widths array - that is probably not relevant for cust 53215:42.38 
kens What does it do that for ? Seems like an odd thing to do15:43.12 
kens expects chrisl to tell me I added it....15:43.34 
chrisl If the width of the glyphs is greater than the widths in the array, we scale the font down so the glyphs don't collide15:43.58 
ray_laptop chrisl: and we probably didn't do that in 8.7115:44.21 
kens Hmm, presumably only with substituted fonts ?15:44.22 
chrisl It was me that added it, actually, <hangs head in shame> - yes, only substituted fonts.15:44.51 
ray_laptop kens: well, we did substitute TimesNewRoman for TimesNewRomanPSMT :-)15:44.52 
kens ray_laptop : yes I know, I was thinking more generally about that code15:45.07 
chrisl ray_laptop: we did not do the width matching in 8.7115:45.07 
  It's a horrid hack to get around not having multiple-master fonts for substitution...15:46.02 
kens Yes, it seems like a reasonable solution for font substitution15:46.19 
chrisl It's just rather clunky doing it in Postscript..... but neither my testing, not the cluster showed a measurable performance deficit, so it seemed reasonable.15:47.23 
  s/not/nor15:47.31 
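The width-matching hack described above (compare the average glyph width of the substitute font with the average of the PDF Widths array, and scale down only if the substitute is wider) can be sketched as below. This is an illustrative Python model, not the PostScript in pdf_font.ps; `substitute_scale` and the decision to skip zero-width entries are assumptions.

```python
from typing import Sequence


def substitute_scale(glyph_widths: Sequence[float],
                     pdf_widths: Sequence[float]) -> float:
    """Horizontal scale for a substituted font.

    glyph_widths: advance widths from the substitute font, one per code
                  in FirstChar..LastChar.
    pdf_widths:   the matching entries of the PDF /Widths array.
    Returns 1.0 when the substitute is no wider on average, otherwise a
    factor below 1.0 that shrinks it to the average PDF width.
    """
    # Assumption: only codes with a positive width on both sides are compared.
    pairs = [(g, w) for g, w in zip(glyph_widths, pdf_widths) if g > 0 and w > 0]
    if not pairs:
        return 1.0
    avg_glyph = sum(g for g, _ in pairs) / len(pairs)
    avg_pdf = sum(w for _, w in pairs) / len(pairs)
    return min(1.0, avg_pdf / avg_glyph)
```

Because the comparison walks every code in the font each time the font is set up, it is a plausible contributor to the extra interpreter steps reported for Tf, which is why removing it was suggested for a build that already has multiple-master substitution.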
kens Yeah but as I understand things, our vanilla code doesn't show a performance hit between the two versions (8.x and 9.x) either.15:48.07 
ray_laptop chrisl: I also didn't see an overall performance difference between 8.71 and 9.14 on this file, but that was on my laptop.15:48.25 
  kens: right15:48.32 
chrisl Which is why I thought it might be something worth trying for Len15:48.45 
ray_laptop but they have a painfully slow CPU15:48.48 
  chrisl: do you have their 906 code base ?15:49.29 
chrisl ray_laptop: no, I don't15:49.39 
ray_laptop chrisl: or if I send you their pdf_font.ps, can you tell me where to change it ?15:49.58 
  or just do it in the HEAD 9.14 code and I can back port it15:50.16 
chrisl ray_laptop: can you look in the current master pdf_font.ps?15:50.23 
ray_laptop I already have it open (looking at the small-caps hack)15:50.43 
chrisl Line 777 should be a comment "% Some non-compliant files are missing FirstChar/LastChar,"15:51.01 
ray_laptop chrisl: yes15:51.14 
chrisl Good, then you can delete, comment out, whatever, everything down to line 809: "} ifelse"15:52.01 
henrys chrisl: is it possible they have a smaller amount of memory dedicated for caching - are we getting the same cache hit rate?15:52.04 
ray_laptop henrys: the cache hit rate is fine15:52.47 
kens Ray's email seemed to indicate a pretty good hit rate to me15:52.49 
chrisl henrys: it's possible, I can't remember if they change the cache size. I've yet to see evidence the glyph rendering is actually slower15:53.06 
  ray_laptop: so that code in pdf_font.ps should not have changed (probably just the exact line numbers) since the 9.06 code15:54.06 
ray_laptop chrisl: in the HEAD pdf_font.ps line 809 is a "{" following a line that has: dup length dup 0 gt15:55.14 
chrisl Sorry, misread it: line 84515:55.48 
aditsu kens: hi again, I have a test file I can provide (xps that doesn't get converted to pdf correctly), should I put it online somewhere or email it or file a bug report and attach it?15:56.01 
kens aditsu ideally open a bug report please15:56.24 
ray_laptop the ifelse corresponding to the 3 index /FirstChar knownoget test _IS_ line 845, so that makes more sense15:56.36 
aditsu alright15:56.50 
kens If the file is private let me know and I'll mark it so only Artifex folk can download it15:56.51 
chrisl ray_laptop: sorry, I'm suffering from a bit of hayfever, and it's making my eyes itchy and watery15:57.24 
ray_laptop chrisl: they checked and the glyph rendering speed is the same as on the 871 code15:57.27 
aditsu this one is not really private15:57.32 
kens OK no problem then15:57.39 
ray_laptop my left eye has been that way for months :-/15:57.49 
Robin_Watts ray_laptop: Has that improved at all?15:58.13 
ray_laptop chrisl: thanks. I'll test it and then send it off to Len to try -- and I WILL credit (blame) you :-)15:58.35 
  Robin_Watts: just recently (in the last week or so) I've seen some improvement in the eyelid responsiveness15:59.22 
Robin_Watts excellent!15:59.30 
ray_laptop agrees wholeheartedly15:59.48 
chrisl ray_laptop: hah! Thanks (I think). As I said, given that they have multiple-master substitution, that kind of width matching shouldn't be needed in their code15:59.57 
henrys congrats ray_laptop 15:59.58 
ray_laptop henrys: congrats for what -- just waiting somewhat impatiently ? ;-)16:00.40 
  henrys: but I know what you mean16:00.56 
henrys ray_laptop: has anybody generated call counts - profile for the gs 906 with ufst? I don't see that in the emails.16:01.26 
chrisl ray_laptop, kens: How about doing something with the PDF object number to "create" a UID for a substitute font?16:03.06 
ray_laptop henrys: the AQtime (from the simulator) has call counts that are (AFAICT) trustworthy. email 4/1516:03.40 
kens chrisl I was thinking of something like that when you mentioned the resolve_object stuff this morning16:03.40 
  I'd need to think about the impact of that with pdfwrite though. It's 'probably' safe16:04.18 
  Back in a minute16:04.25 
chrisl kens: Yes, that wasn't how I was planning to use it, but it might be a decent alternative16:04.25 
henrys ray_laptop: right but where is the profile for gs running on a host with the same job?16:04.25 
aditsu kens: bug 69516816:06.35 
ray_laptop I am looking at one large mismatch in call counts -- looks like there are 216,792 calls to zgetdeviceparams vs. 684 on 8.7116:06.40 
  henrys: I just spotted that this AM16:07.18 
chrisl Crumbs, that's probably a chunk of the time right there!16:07.59 
henrys ray_laptop: I just thought profile 8.64 and 9.06 customer next to 8.64 and 9.06 host based is going to tell us what happened.16:08.31 
ray_laptop chrisl: on the simulator (which is DEBUG build, so it also includes validating refs) it accounts for 11 seconds out of 413.16:09.39 
  henrys: it's 8.71, but maybe. I have to do that on linux since profiling won't work on my laptop (neither VerySleepy or VS Performance tools)16:10.51 
  henrys: and profiling on an x86 is a LOT different to their puny CPU. I did consider profiling on the Raspberry to get closer to their performance16:11.52 
chrisl And that probably won't help much if the extra time is coming from the PDF interpreter16:12.13 
kens aditsu, OK I've reassigned it to me16:12.22 
henrys well we probably just want call counts 16:12.29 
  more reclaim?16:12.38 
ray_laptop chrisl: right, it doesn't help identifying where the difference is coming from16:12.40 
kens ray_laptop : because we (I) changed the way that device detection works we call getdeviceparms a *lot* in the PDF interpreter16:13.05 
ray_laptop henrys: I didn't understand that 16:13.07 
aditsu thanks for checking :)16:13.22 
kens Thanks for the report, I can't promise I'll get to it soon though :-(16:13.37 
chrisl kens: does pdfwrite do anything special with private XUIDs?16:13.38 
ray_laptop kens: OK. I thought we only did that once and set a flag16:14.00 
kens chrisl off the top of my head, I don't *think* so, but I can't be 100% certain, I haven't meddled with fonts recently16:14.05 
  ray_laptop : no, we do it every time we need to know whether a given device has a particular capability16:14.22 
henrys ray_laptop: well if the slowdown is from GC, say I would expect more calls to the vmreclaim - that should be in the call count for the profile.16:14.32 
ray_laptop kens: that seems ... non optimal16:14.43 
kens ray_laptop : possibly16:14.53 
  But it means that we can change devices in the job and not get caught out16:15.06 
ray_laptop kens: seems like we could collect the things we need to know from a single getparams and set flags for the pertinent characteristics16:15.23 
  kens: how can we change devices while parsing a PDF ???16:15.48 
kens We could, yes, or we could do it the way I wanted originally, and use the device special ops16:15.55 
  ray_laptop : while we use it extensively in the PDF interpreter, we could also do it in PostScript16:16.29 
ray_laptop kens: well, I didn't think you'd need to call getparams more than once16:16.32 
  kens: for PS, I don't care as much (right now)16:16.48 
kens Well, obviously, but the code was written a *long* time ago now16:17.03 
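The two positions above (query the device once and keep flags, versus re-query so that a device change mid-job is never missed) can be reconciled by caching the flags per device and re-querying only when the device differs. A minimal sketch in Python; `DeviceCapabilityCache`, the `query` callback and the capability names are hypothetical and stand in for whatever the PDF interpreter actually asks getdeviceparams for.

```python
from typing import Callable, Dict, Hashable


class DeviceCapabilityCache:
    """Remember capability flags for the current device.

    `query` models one expensive getdeviceparams-style call that returns
    every flag of interest for a device in a single dictionary.
    """

    def __init__(self, query: Callable[[Hashable], Dict[str, bool]]) -> None:
        self._query = query
        self._valid = False
        self._device: Hashable = None
        self._flags: Dict[str, bool] = {}
        self.query_count = 0  # exposed so the saving can be observed

    def has(self, device: Hashable, capability: str) -> bool:
        # Re-query only on first use or when the current device changed.
        if not self._valid or device != self._device:
            self._flags = dict(self._query(device))
            self._device = device
            self._valid = True
            self.query_count += 1
        return self._flags.get(capability, False)
```

The cost moves from one parameter query per capability test to one cheap comparison per test, at the price of needing a reliable way to notice that the device has changed.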
ray_laptop kens: it's nearing the end of your day, so I'll take a look at revamping the getparams in the PDF interpreter. I'll let you know when I quit for the day if I still need help on it (so you can work while I sleep)16:18.34 
kens All the same, it seems like a lot of calls to be caused by that, but then 684 seems like a lot anyway16:18.52 
  ray it's pretty easy to find all the places it gets used16:19.27 
ray_laptop kens: yeah, but on the simulator that amounts to only 0.27 sec / 489 (instead of 11 sec / 413)16:20.09 
kens Sorry don't follow the figures there16:20.34 
chrisl I wonder why we feel compelled to blow away the UniqueID when we change just the FontName......16:20.42 
ray_laptop kens: those are from their AQtime profiler results16:20.59 
kens Then they still aren't clear to me, sorry16:21.14 
chrisl I thought 9.06 was slower?16:21.44 
ray_laptop the 684 calls to getdeviceparams on 871 code base is only 0.027/489 of the time, vs. 11/413 in 906 code16:22.07 
  that's why the AQtime is only relative16:22.23 
kens So fewer calls but it takes more time ?16:22.26 
chrisl But 489 is higher than 413......16:22.41 
kens truly doesn't understand these numbers.....16:23.05 
ray_laptop chrisl: ignore that -- it includes LOTS of simulator specific code and DEBUG code and may even be run on different hosts16:23.35 
chrisl Oh, useful as ever :-(16:23.53 
ray_laptop only call counts and relative timings are useful (AFAICT)16:23.59 
  and since it's on x86 something, it doesn't exactly correspond to the target anyway16:24.35 
  which is why I didn't dive into it until looking at other sources of info to try and find out what's going on16:25.28 
  kens and chrisl: thanks for the ideas. I'm going to work on them. If you come up with anything to improve the cache usage, let me know. If you want immediate response SMS since I may not be paying attention to IRC or email16:27.28 
kens I'm off out shortly, won't be back till tomorrow morning. I'll look at the IRC log then16:28.01 
chrisl ray_laptop: I'm going to see if there's a sane way we can preserve the UID when we don't actually change the glyphs in the font....16:28.25 
ray_laptop chrisl: I just looked. Their 906 code doesn't have that small-caps stuff in it. It must have gone in later16:30.30 
chrisl ray_laptop: you mean the width matching?16:30.59 
  ray_laptop: both the width matching and the small-caps code were in the 9.06 release - I checked before mentioning it to you16:32.19 
ray_laptop chrisl: their code doesn't have the /Flags oget 16#20000 code in it16:33.18 
chrisl ray_laptop: It's definitely in the 9.06 release - I'm looking at it right now!16:33.53 
ray_laptop so either they don't have 906 or they ripped it out already16:33.54 
kens OK I have to go, night all16:34.12 
chrisl ray_laptop: what about the block above that for the Widths?16:34.22 
ray_laptop kens: g'nite16:34.25 
chrisl kens: 'nite16:34.27 
ray_laptop chrisl: checking...16:34.35 
chrisl ray_laptop: in the release code, it starts at line 726 with "3 index /FirstChar oget"16:35.39 
ray_laptop chrisl: nope. It looks like that's where they have all of their MMFont stuff16:37.23 
chrisl ray_laptop: oh well, sorry :-(16:37.41 
ray_laptop chrisl: that's OK. Thanks anyway.16:38.10 
chrisl So, I've hacked it so the UID remains, despite the metrics being changed, and I'm still getting the same number of calls to FAPI_do_char() :-(16:39.08 
ray_laptop They have captured their findfont times and those haven't changed, but I may still have to look into their Tf differences16:39.12 
  chrisl: the purge was being invoked from 'restore' (font_finalize) so unless that is fixed, the cache will disappear16:40.12 
chrisl ray_laptop: no, there is a specific check to see if the font has a valid UID - if the UID is valid we shouldn't purge the cache, even though the font is disappearing16:41.01 
ray_laptop and we have to make sure that the cache itself is in stable memory (or non_gc memory)16:41.22 
chrisl It should at least be in global memory16:41.54 
ray_laptop because each page has a save/restore bracketing it16:41.59 
  now, we might be able to just get rid of that save/restore. I'll try that16:42.44 
chrisl No, we really can't do that!16:43.35 
  Hmm, something is still zapping the UID before we actually use the font......16:44.04 
ray_laptop chrisl: why can't we get rid of the save/restore ?16:45.26 
chrisl ray_laptop: you don't just want to keep accumulating objects in VM for the entire file16:45.58 
  ray_laptop: and I rather assume the save/restore is there for a reason, and not just for decoration!!!16:46.26 
ray_laptop chrisl: OK, right. Objects that are "resolved" and actually stored in a dict somewhere won't go away. But Fonts are about the only large thing that does that, and those ARE kept across pages16:51.25 
chrisl ray_laptop: but you'll run into the same problems as I described about putting fonts in global VM - we can't rely on the font name to distinguish between different fonts16:52.21 
ray_laptop We don't ever save images or contents , which are the other large things, so anything unused will be picked up by the GC16:52.26 
chrisl ray_laptop: with the current setup, you cannot reliably retain fonts between pages, whatever the mechanism for achieving that might be16:54.22 
ray_laptop chrisl: we don't care about the font -- just the font cache, which is always using the same base font 16:55.15 
chrisl ray_laptop: look, take my word for it, you can't do it safely.16:55.59 
  ray_laptop: we have jobs in our test suite that use a built-in font on one page, and an embedded font on the next page - both with the same FontName. If you preserve the font across the page, we'll use the font loaded on the first page, and get it wrong16:58.42 
henrys ray_laptop: on linux 8.71 is much slower than 9.06 - maybe they have it backwards ;-)17:03.26 
ray_laptop chrisl: The embedded font on the next page won't have the same UID, right ?17:24.45 
  chrisl: or are you concerned with not doing the save/restore17:25.22 
chrisl ray_laptop: You don't select fonts with the UID17:25.29 
  ray_laptop: the problem is, because we're bound to Postscript we only identify fonts (in the interpreter) by font name.17:27.37 
ray_laptop chrisl: the Tf complexity probably is what you identified -- the interpreter debug I had was not for their simulator, but was for the standalone code, so it would have your Widths adjustment stuff17:28.59 
  I'm re-running the capture of the 'I' output using the 906 simulator17:29.36 
chrisl ray_laptop: the PDF font substitution isn't terribly well thought out - we never remove the original font after creating the substitute, so that precludes us from benefitting from the UID cache optimisations for substituted fonts.....17:31.51 
  ray_laptop: how general does the performance increase have to be?17:35.44 
Mikkadu Hello. I'm working on PCL driver. It seems to work on old printers, but it does not on the new one. Seems like I've to use ESC*g#W command, but I cannot find its description. Anybody knows where I can find it?17:41.32 
Robin_Watts Mikkadu: Sorry, can you be more explicit?17:58.57 
  You're working on a PCL device for ghostscript?17:59.08 
  so that ghostscript can feed PCL devices?17:59.25 
  Or are you using one of the existing PCL output devices in ghostscript and finding that there are some printers it does not drive?18:00.03 
  Or is this a question utterly unrelated to ghostscript?18:00.27 
Mikkadu Robin_Watts Ghostscript works great in my case. My boss wants me to reproduce it =( 18:03.26 
  I found a bug report in ghostscript18:03.55 
  related to the same issue18:04.02 
  http://bugs.ghostscript.com/show_bug.cgi?id=694082#c218:04.40 
  This one18:04.44 
  So Hin-Tak Leung 2013-06-05 14:17:13 PDT 18:05.15 
Robin_Watts So you are writing your own PCL generator, nothing to do with ghostscript.18:05.27 
Mikkadu Yees18:05.38 
  It has to work on Android 18:05.46 
Robin_Watts henrys is the PCL expert. He may be able to point you at some documentation. You'll have to wait for him to see the question.18:06.02 
Mikkadu ok thx you a lot18:06.29 
ray_laptop chrisl: sorry -- I was away. We don't need a totally general improvement. Cust 532 only runs PDF's and only to clist with a custom CMYtag device. As far as other files, we don't have any performance needs identified (yet), but we don't want to do something that might slow down other files18:11.18 
chrisl ray_laptop: I just thought that adding an explicit mapping for TimesNewRomanPSMT and TimesNewRomanPS-ItalicMT might be an option?18:12.29 
  ray_laptop: adding an explicit map in the FCOFontmap for the two fonts saves between 2.5 and 4 seconds on the two problem PDFs on my machine18:14.09 
ray_laptop chrisl: WOW. Is that with their simulator, or with regular gs UFST build ?18:15.56 
chrisl ray_laptop: that's with a regular build on Linux18:16.25 
ray_laptop chrisl: what was the entire 50 page time ? (i.e. what percentage did it save)18:17.34 
  chrisl: and are you running -sDEVICE=ppmraw -dUseFastColor -dMaxBitmap=0 -dBandHeight=128 -o /dev/null 18:18.21 
chrisl ray_laptop: no, let me rerun the tests with those options18:18.42 
  What resolution?18:19.06 
ray_laptop chrisl: and with -Z: the rendering time can be ignored (between Outputpage start and end)18:19.13 
  chrisl: sorry: -r600 -Z: -sDEVICE=ppmraw -dUseFastColor -dMaxBitmap=0 -dBandHeight=128 -o /dev/null 18:19.42 
chrisl Okay, running now18:20.01 
  Is FinalTime good enough?18:22.11 
henrys Mikkadu: which printers are you generating output for?18:23.35 
Mikkadu HP Deskjet Ink Advantage 352518:23.58 
chrisl ray_laptop: so FinalTime "normal" is 18.9493, FinalTime with the explicit mapping is 15.4757 for WWTTN1CT_PDF_1_7.pdf18:24.42 
ray_laptop chrisl: Final time is fine (it'll include rendering time, but so what)18:24.43 
  chrisl: WOW!!! That's >20% and more than I expected. Are you sure that the output is still correct ???18:25.54 
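For reference, the saving implied by the two FinalTime figures chrisl quoted can be worked out directly. This is only arithmetic on the numbers above; the variable names are illustrative.

```python
# FinalTime values quoted by chrisl for WWTTN1CT_PDF_1_7.pdf (seconds).
normal = 18.9493
mapped = 15.4757  # run with the explicit FCOFontmap entries

saved = normal - mapped
fraction_saved = saved / normal   # share of the original run time removed
speedup = normal / mapped         # how many times faster the mapped run is

print(f"saved {saved:.4f} s")
print(f"time reduction {fraction_saved:.1%}")
print(f"speedup {speedup:.2f}x")
```

Measured as a reduction in run time this is a little over 18%; the figure only exceeds 20% when expressed as a speedup ratio (about 1.22x).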
henrys Mikkadu: HP PCL 3 GUI not up on that as much as Hin-Tak is. but what is the problem?18:26.35 
Mikkadu So it uses an unknown format of ESC*g#W, 0x06. I've found a description only for 0x02 =(18:26.46 
  HP's driver generates same format as Hin-Tak does.18:27.17 
chrisl ray_laptop: the output seems okay, it's using the same fonts, so it should be okay18:27.37 
  ray_laptop: the difference here is almost certainly that, because we're not using the PDF interpreter's "general purpose" font substitution, we actually get cached glyphs persisting between pages18:29.16 
marcosw_ Mikkadu: The ESC*g#W sequence is an RTL command. The only documentation I've been able to find is on this page: <http://www.undocprint.org/formats/page_description_languages/pcl>18:30.06 
henrys Mikkadu: you're ahead of me I don't have 0x02 or 0x06 or know what that means. I see Hin-Tak's patch which suggests there is 20 bytes of data for the command.18:30.41 
Mikkadu Yes and me too, but it does not match Hin-Tak's commit http://bugs.ghostscript.com/show_bug.cgi?id=694082#c218:30.48 
chrisl ray_laptop: if you can put the 9.06 sim somewhere I can download it, I'll try it properly tomorrow - I'm struggling with the hayfever now18:31.25 
Mikkadu the first byte of the command is the Format. I've found descriptions for format 0x02.18:33.04 
  When I print with HP's driver it generates18:33.23 
  0x06 0x1f 0x00 0x02 0x02 0x58 0x02 0x58 0x09 0x00 0x01 0x01 0x02 0x58 0x02 0x58 0x20 0x0A 0x01 0x20 0x0118:33.28 
  So, format is 0x618:33.40 
  and length is 2018:33.47 
  According to the documentation marcosw_ and I found on <http://www.undocprint.org/formats/page_description_languages/pcl>18:34.20 
  length can be only 8 or 24 18:34.30 
  and format can be 0x2 18:34.51 
  In CUPS it is hard coded as 0x0218:35.05 
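A minimal sketch for inspecting the byte dump pasted above. It only tokenises the values exactly as they appear in the log; the reading of adjacent bytes as big-endian 16-bit values is an assumption, not something the discussion or any documentation cited here confirms.

```python
# Payload bytes as pasted by Mikkadu (copied verbatim from the log).
dump = ("0x06 0x1f 0x00 0x02 0x02 0x58 0x02 0x58 0x09 0x00 0x01 0x01 "
        "0x02 0x58 0x02 0x58 0x20 0x0A 0x01 0x20 0x01")

payload = bytes(int(tok, 16) for tok in dump.split())

format_byte = payload[0]
print(f"format byte: {format_byte:#04x}")
print(f"bytes in dump: {len(payload)}")

# ASSUMPTION: adjacent byte pairs are big-endian 16-bit values.
# Under that reading, 0x02 0x58 decodes to 600.
pair_value = int.from_bytes(b"\x02\x58", "big")
pair_count = sum(1 for i in range(len(payload) - 1)
                 if payload[i:i + 2] == b"\x02\x58")
print(f"0x0258 = {pair_value}, seen {pair_count} times")
```

The dump as pasted tokenises to 21 values, one more than the length of 20 mentioned in the discussion; the log does not say whether one of them belongs to something other than the payload. The pair 0x02 0x58 (600 under the big-endian assumption) appears four times, which might be a resolution, though that is a guess.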
  So, I don't know what to do and where to go please save my soul =(((((((18:36.05 
marcosw_ I can't find this command in the official HP RTL documentation at all, otoh I'm not at home so don't have access to any of the paper copies.18:37.07 
henrys Mikkadu: who do you work for?18:37.36 
marcosw_ I did find a reference saying this command "is not supported by the HP DesignJet 4xx, 7xx, 1xxx, 2xxx, 3xxx or ColorPro printers".18:37.47 
Mikkadu hm18:38.05 
  CDC =)18:38.09 
ray_laptop chrisl: BTW, the customer simulator Tf interpreter cycles are the same between 871 and 906 (both 143980)18:38.32 
Robin_Watts Centre for Disease Control ?18:38.38 
Mikkadu =)18:38.50 
henrys and you're producing PCL ?18:38.55 
  ;-)18:38.56 
Mikkadu hm...18:39.02 
  yes why not18:39.06 
  =)18:39.07 
chrisl ray_laptop: that's odd.....18:39.17 
henrys Mikkadu: industry joke sorry18:40.20 
henrys searches pcl docs18:41.34 
ray_laptop chrisl: I'll double check, but they've hacked the font lookup in both18:41.37 
Mikkadu it's 2340 and I want to go home =( Please guys if you will find something please send me email misha.zador@gmail.com =(18:42.30 
  henrys Thank you a lot 18:42.43 
ray_laptop chrisl: the count isn't EXACTLY the same. 906: 143377, 871: 14398018:43.10 
chrisl ray_laptop: as I said before, we're doing an awful lot in stopped contexts now, which aren't exactly cheap....18:43.26 
henrys Mikkadu: CDC in asia?18:44.43 
Mikkadu no18:44.53 
  in Russia )18:44.56 
ray_laptop chrisl: I'm on the hunt for the getdeviceparams now. the counts for page 1 for those are 871:17, 906:425918:45.56 
Robin_Watts chrisl: Random thought... is it possible that we could reduce some of the time taken by the stopped contexts by using some custom postscript operators?18:46.27 
ray_laptop as I was discussing earlier with kens18:46.27 
chrisl Robin_Watts: not really, they are mostly to catch errors during parsing the streams18:47.21 
ray_laptop clearly this is a customer that would REALLY benefit from a PDF parser in C !!!18:47.27 
Robin_Watts I don't know postscript well enough to have a clear view of this, but how many PS instructions does it take to setup/recover from a stopped context?18:48.07 
chrisl Robin_Watts: I'm not sure about Ghostscript's implementation - the issue is more the recovery: on an error, PS just stops, so cleaning up after that can be complex18:49.22 
henrys Mikkadu: I'm not seeing anything drop a note to Hin-Tak18:50.10 
ray_laptop Robin_Watts: in terms of execution it is not much overhead (unless there was an error, which is NOT the usual case)18:50.10 
Robin_Watts chrisl: Maybe I'm talking utter rubbish here, so before I say anything else, can you point me at an example please?18:50.16 
  ray_laptop: I was thinking purely in terms of reducing the number of times we pass through the interpreter loop.18:50.35 
Mikkadu henrys how can I do it?18:52.25 
  email?18:52.41 
henrys become a member of bugzilla and add a question to the bug18:53.16 
chrisl Robin_Watts: actually, looking through the pdf interpreter, it's not as bad as I thought....18:53.19 
Mikkadu Ok thx18:53.30 
chrisl Robin_Watts: however, I think anything like that would be effort better put to mooscript18:54.01 
Robin_Watts essentially: X becomes { X } stopped { stuff to clean up X } if ?18:54.43 
ray_laptop chrisl: both 871 and 906 execute ".internalstopped" 792 times on page 118:55.23 
  Robin_Watts: correct, except the clean up is mostly done by the interpreter18:55.47 
chrisl Yes, exactly. Generally, you have to be fairly careful cleaning up the stack(s) afterwards, but it seems less necessary in the PDF interpreter18:55.51 
ray_laptop the interpreter takes care of restoring the stacks18:56.11 
chrisl No, it doesn't18:56.21 
  At least, not always18:56.40 
Robin_Watts { X } { stuff to clean up X } .stoppedif18:56.42 
ray_laptop chrisl: right, sorry. Not always18:56.55 
  Robin_Watts: and what's the difference ?18:57.14 
Robin_Watts That would save us 1 time around the interpreter loop.18:57.32 
  but for 800 executions it's not worth it.18:57.58 
ray_laptop Robin_Watts: agreed18:58.05 
Robin_Watts I thought we were talking much higher numbers.18:58.17 
chrisl Yeh, I am surprised there's not a more significant difference between the two code bases, but there you go.....18:58.51 
ray_laptop as chrisl said, it's mostly when we run streams to handle unexpected EOF18:58.57 
Robin_Watts Also, we could maybe do some smart stuff by binding the cleanup code and then do: {X} cleanup .stoppedif18:59.11 
Latarian has anybody seen an issue printing fillable form PDFs with mac18:59.12 
  Unable to convert to postscript file18:59.19 
ray_laptop this file only has the main "Content" stream18:59.24 
Robin_Watts Less stack manipulation going on there.18:59.28 
  But again for 800 executions we're never going to gain much.19:00.21 
ray_laptop Robin_Watts: procedures get 'packed' and bound when loading the PS file.19:00.40 
Robin_Watts oh, right. didn't know that. So we'd gain nothing with the latter idea.19:01.06 
ray_laptop Robin_Watts: so the interpreter steps are "push packed array" stopped "push packed array" if19:01.31 
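A toy model of the step counts under discussion, assuming ray_laptop's description that each `{ X } stopped { cleanup } if` costs four interpreter steps. The `.stoppedif` operator is Robin_Watts's hypothetical proposal, not an existing Ghostscript operator, and treating each `.internalstopped` call as one use of the pattern is also an assumption.

```python
# Token sequences for the two forms; in this simplified model each token
# is one pass through the interpreter loop.
current = ["push packed array", "stopped", "push packed array", "if"]
proposed = ["push packed array", "push packed array", ".stoppedif"]  # hypothetical operator

executions_page_1 = 792  # .internalstopped count quoted for page 1

steps_now = len(current) * executions_page_1
steps_proposed = len(proposed) * executions_page_1
steps_saved = steps_now - steps_proposed

print(steps_now, steps_proposed, steps_saved)
```

Under these assumptions the saving is one loop pass per execution, 792 per page, which matches the conclusion above that it is not worth pursuing at this volume.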
chrisl Okay, I'm going to have to finish - I'm struggling to focus. ray_laptop as I said, if you want me to try out the font substitution thing tomorrow, I'll need the new simulator19:02.12 
Robin_Watts chrisl: Are you working tomorrow?19:02.38 
chrisl I was planning to, yes19:02.46 
  I don't want to spend hours sat in Good Friday traffic.....19:03.17 
ray_laptop chrisl: right. I'll clean my build and send it up for you. I'll email tech with the location (in case anyone else wants it)19:03.21 
  there is traffic on Good Friday (more than usual Friday)? Here it is less19:03.52 
chrisl ray_laptop: okay, thanks.19:04.00 
  ray_laptop: yeh, not so much local, but out of town: day trippers, long weekenders etc - and I'm on one of the main routes down to the west country, where a lot of people go for those kinds of jaunts19:05.05 
mvrhel_laptop bbiab19:41.36 
ray_laptop Patch to collect device characteristics at the start of the page GREATLY reduced the number of getdeviceparams calls (factor of 200). I'll apply it to HEAD and regression test and let kens review it tomorrow21:39.50 
  I need an automated fork to dig through all the spaghetti code so I can find the meatballs :-)21:40.55 
  as I said before, this customer (532) is a perfect example of where a PDF parser in C would help a LOT !21:41.52 
  maybe tor is ready to work with ghostscript again ;-)21:42.24 
henrys ray_laptop: I must have done something wrong my 9.06 code was 2x faster than 8.7121:42.52 
ray_laptop henrys: yeah, I was thinking that. Once I get the patches applied to HEAD and a regression running, I may give it a whirl on some linux box21:44.02 
  or maybe build it for the pi21:44.15 
  henrys: note that I am only looking at the parsing times, since their target printing times were almost the same. I use -Z: output and a little awk script to collect the parsing and rendering times separately21:46.27 
  -Z: gives the Outputpage times even for a release build21:47.02 
  henrys: I just re-ran the tests on my laptop (with HEAD -- none of my "improvements") and I get 2.99 seconds for all 50 pages with 8.71 and 5.00 seconds with HEAD.22:01.40 
  The command line args I used were: -q -Z: -sDEVICE=ppmraw -o /dev/null -r600 -dUseFastColor -dBandHeight=128 -dMaxBitmap=10000 WWTN1.pdf22:02.44 
  I have to run an errand. BBIAB22:06.27 