Log of #ghostscript at irc.freenode.net.

2018/08/02
rnissl Hi kens, I once again have a question regarding pdfwrite ;-)07:50.29 
kens OK07:51.10 
rnissl Is there a way to tell ghostscript that the same piece of PostScript code appears on every page, hence pdfwrite could reuse the object of page 1?07:51.37 
kens Yes, a PostScript form07:51.48 
rnissl is this meanwhile supported?07:51.59 
kens Note that pdfwrite won't reuse PDF form XObjects07:52.03 
  pdfwrite will embed a PostScript form as a PDF Form XObject and will use that once instead of including it each time it is used07:52.31 
  But if the input is PDF, then we don't do the same trick with PDF Form XObjects07:52.51 
  Because while PostScript programs generally use Forms sensibly, PDF files generally do not07:53.09 
rnissl the last time I read about PS form support in ghostscript, the documentation mentioned that it will simply render the content each time.07:53.44 
kens That's for rendering07:53.56 
  pdfwrite isn't a rendering device. It may also be that I added the support since you last read the docs07:54.24 
  The code to use forms has been present for more than 3 years, because I see a bug fix against it in February 2015 07:55.26 
  That commit improves performance by not rerunning the PaintProc when a form is a duplicate of an earlier one07:56.01 
  http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=17131c7a7fdf9d8537c4715e538c49b29c8945a8 07:56.24 
  That reduced an 80MB output file to a 4MB output file and reduced the time from 20 minutes to 16 seconds07:56.56 
kens fetches coffee08:04.33 
rnissl thanks kens, I'll give PS form a try :-)08:13.51 
kens OK08:14.01 
rnissl bye08:14.05 
kens bb08:14.09 
  chrisl so I'm looking at this text problem with pdfwrite13:25.55 
  It's broadly similar to the problem you fixed before13:26.04 
  The font has a FontMatrix of [0.001 0 0 0.001 0 0]13:26.17 
  And a FontBBox of [0 0 2 1]13:26.24 
chrisl Oh, the disappearing text issue - right13:26.44 
kens In this case though, the glyph is only partially drawn, because it's clipped13:26.46 
  Because the estimated size of the glyph lies completely outside the clip, we drop it13:27.02 
  Because the estimate is, of course, minuscule13:27.12 
  I see no easy solution to this13:27.20 
  We don't want to ignore the clip, because then we'd emit text that is totally invisible, inflating the size of the PDF13:27.42 
  We don't want to run the CharString because that would be expensive13:28.00 
chrisl The only full solution is to run the charstring, get the "real" bbox13:28.11 
kens Yeah but I'd rather not do that, I'm not even sure how to do it from the point in the code we're at.13:28.30 
chrisl If it's the same code I changed, we can do it, pretty easily, actually13:28.56 
kens Really ? It's in the same routine13:29.08 
chrisl My first solution was to do that13:29.09 
kens Oh, well I guess maybe that would be best if you have a way to do that.13:29.22 
  It'll be expensive but my only other solution is a hack13:29.31 
  To look at the product of the FontBBox with the FontMatrix, and if it's tiny, use the inverse of the FontMatrix for the FontBBox13:29.53 
  Getting the real width would be better, if slower.13:30.13 
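A minimal sketch of why the estimate collapses, and of the fallback hack kens describes. The type names, the `TINY_EM` threshold, and the choice to require both dimensions to be tiny are assumptions made for illustration; none of this is Ghostscript's own code.

```c
#include <assert.h>

/* Hypothetical types for illustration only; not Ghostscript structures. */
typedef struct { double xx, yy; } scale2;            /* diagonal of a FontMatrix */
typedef struct { double llx, lly, urx, ury; } rect;  /* a bounding box */

#define TINY_EM 0.01  /* assumed threshold: 1% of an em */

static double absd(double v) { return v < 0 ? -v : v; }

/* Estimated glyph box in text space: FontBBox * FontMatrix * font size.
 * Only the diagonal terms are used, which covers an unrotated
 * FontMatrix such as [0.001 0 0 0.001 0 0]. */
static rect estimate_glyph_box(rect font_bbox, scale2 fm, double font_size)
{
    rect r;
    assert(font_size > 0);
    r.llx = font_bbox.llx * fm.xx * font_size;
    r.lly = font_bbox.lly * fm.yy * font_size;
    r.urx = font_bbox.urx * fm.xx * font_size;
    r.ury = font_bbox.ury * fm.yy * font_size;
    return r;
}

/* The hack: if FontBBox * FontMatrix is far smaller than one em, treat
 * the FontBBox as bogus and use the box implied by the inverse of the
 * FontMatrix instead (one em square in glyph space). */
static rect sanitise_font_bbox(rect font_bbox, scale2 fm)
{
    double w, h;
    assert(fm.xx != 0 && fm.yy != 0);
    w = absd((font_bbox.urx - font_bbox.llx) * fm.xx);
    h = absd((font_bbox.ury - font_bbox.lly) * fm.yy);
    if (w < TINY_EM && h < TINY_EM) {
        rect one_em = { 0.0, 0.0, 1.0 / fm.xx, 1.0 / fm.yy };
        return one_em;
    }
    return font_bbox;
}
```

With the FontBBox [0 0 2 1] from this font at 12 pt, the estimate is roughly 0.024 by 0.012 units; after the substitution it is roughly 12 by 12. The hack only guesses a plausible size, which is why the real glyph bbox is the better answer.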
chrisl I take it, in this case, the glyphs are out of the top, or bottom of the clip?13:30.40 
kens Top, bottom, left right13:30.51 
  Generally 'outside'13:31.00 
chrisl Left and right, we should be using the advance width of the glyph13:31.32 
kens The left side is clipped away13:31.46 
  Which means that the code thinks the entire glyph is clipped out13:32.01 
  Because the size is too small13:32.07 
  The advance width is 0 in this case, it's an xyshow13:32.20 
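The rejection kens describes can be sketched as a plain box-overlap test. The names and the numbers in the usage note are hypothetical; the point is only that a glyph whose origin sits just outside the clip is dropped when its box is underestimated, even though its real outline reaches into the clip.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical box type for illustration; not a Ghostscript structure. */
typedef struct { double llx, lly, urx, ury; } box;

/* True when the two boxes share any area. */
static bool boxes_overlap(box a, box b)
{
    return a.llx < b.urx && b.llx < a.urx &&
           a.lly < b.ury && b.lly < a.ury;
}

/* Scale a glyph-space box and move it to the text origin (ox, oy). */
static box place_box(box glyph, double scale, double ox, double oy)
{
    box r;
    assert(scale > 0);
    r.llx = ox + glyph.llx * scale;
    r.lly = oy + glyph.lly * scale;
    r.urx = ox + glyph.urx * scale;
    r.ury = oy + glyph.ury * scale;
    return r;
}

/* Emit the glyph only if its placed box touches the clip. */
static bool glyph_is_visible(box glyph, double scale,
                             double ox, double oy, box clip)
{
    return boxes_overlap(place_box(glyph, scale, ox, oy), clip);
}
```

For example, with a clip of [100 100 200 200], an origin of (95, 110) and a scale of 0.012, the FontBBox [0 0 2 1] gives a box that ends at x = 95.024 and is rejected, while an assumed real glyph box of [0 0 600 700] reaches x = 102.2 and is kept. A zero advance width, as with xyshow here, gives no extra width to fall back on.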
chrisl Stupid, stupid.....13:32.34 
kens You should see what else the EPS does.....13:32.49 
  It draws one glyph 3 times, apparently just to make it look really ugly13:33.06 
chrisl Nice!13:33.14 
kens Every single one of the charpath or show operations for that glyph lies outside the clip path as well13:33.43 
  Because the 'size' of the glyph is so small13:33.56 
  Crap font used in a crap way13:34.08 
chrisl Of course, I need to remember how to run the charstring.... <sigh>13:37.10 
kens Well you can probably figure it out faster than me, I'm clueless.....13:37.27 
chrisl Do you have a cut down test file?13:37.47 
kens Yes, *very* much smaller, I'll mail it13:37.59 
chrisl Thanks13:38.06 
kens On its way13:39.22 
chrisl Hmm, no sign yet.....13:41.34 
kens ah it's stuck, 1 sec13:41.59 
  Gone now13:42.10 
chrisl Got it13:42.28 
kens That's a relief13:42.39 
chrisl Do I need to use -dEPSFitPage ?13:43.33 
kens -dEPSCrop 13:43.42 
  I haven't got round to translating the page13:43.50 
  -dEPSFitPage probably works too13:44.10 
  need coffee brb13:45.44 
chrisl kens: so, doing this: http://git.ghostscript.com/?p=user/chrisl/ghostpdl.git;a=commitdiff;h=c0aabc06a131def424c3e881df54162399e868c7 14:00.10 
  Means the glyph_info call runs the charstring, gets the path, and gives a real bbox for the glyph14:00.33 
kens Really ? Huh well there you go, I'd assumed it would be much harder than that14:00.46 
chrisl Obviously, the current code in process_text_estimate_bbox() doesn't *use* the bbox, so that needs adding14:01.00 
kens Right I can do that, thanks14:01.13 
chrisl The other thing is, I probably have more GLYPH_INFO_* options in there than I actually need, but the way that gs_type1_glyph_info() handles those is.... well, baffling, to me14:02.30 
kens I've no idea what most of those do14:02.49 
  I'd 'guess' the 'PIECES' ones are for compound glyphs14:03.01 
chrisl In a Type 1?14:03.18 
kens Yeah you know SEAC glyphs14:03.27 
chrisl I guess...14:03.48 
kens Couldn't remember the term at first14:03.53 
  The rest of it, I've no idea14:03.59 
chrisl The point is, with all the options, we'll run the charstring once all the way through, then again, dropping out early after the hsbw operator14:04.41 
kens That should allow me to get rid of the degenerate FontBBox test too14:04.45 
chrisl If we can work out the right combination to avoid the "width_members" conditional, but keep the "default_members", we'll only run the charstring once14:05.28 
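A toy model of the pass counting chrisl describes, assuming one full charstring run when outline-derived data such as the bbox is requested, plus one early-exit run (stopping after hsbw) when the width is requested as well. The flag names are hypothetical stand-ins, not the real GLYPH_INFO_* values, and this is not the actual logic of gs_type1_glyph_info().

```c
#include <assert.h>

/* Hypothetical request flags; stand-ins for illustration only. */
enum {
    DEMO_INFO_WIDTH = 1 << 0,  /* needs only the hsbw operands */
    DEMO_INFO_BBOX  = 1 << 1   /* needs the whole outline */
};

typedef struct {
    int full_runs;     /* charstring interpreted to the end */
    int partial_runs;  /* charstring abandoned after hsbw */
} run_count;

/* Count how many times the charstring would be interpreted
 * for a given combination of requested members. */
static run_count plan_charstring_runs(int members)
{
    run_count rc = { 0, 0 };
    assert((members & ~(DEMO_INFO_WIDTH | DEMO_INFO_BBOX)) == 0);
    if (members & DEMO_INFO_BBOX)
        rc.full_runs++;      /* the outline is needed for a real bbox */
    if (members & DEMO_INFO_WIDTH)
        rc.partial_runs++;   /* the width costs a second, short run */
    return rc;
}
```

In this model, asking for the bbox alone costs a single run, which matches the goal of keeping the "default_members" path while avoiding the "width_members" one.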
kens That would be handy, let me try it like this and then poke it with a sharp stick to see if I can get what I want14:05.53 
  I think I did work through all this a long time ago for a similar problem14:06.17 
chrisl Actually, with the real bbox, we don't care about the "width" do we?14:07.47 
kens I suspect not no14:07.58 
  There looks to be a lot of possible simplification in this code with the real glyph bbox14:08.15 
chrisl Given we just need the bbox: http://git.ghostscript.com/?p=user/chrisl/ghostpdl.git;a=commitdiff;h=dfd125010cfe5664f0164585efae2d3e91fbbcf9 14:10.36 
kens 1 second14:10.51 
  Well a quick hack 'nearly' works14:12.07 
  The top of the 'F' is still missing, that's a vertical problem14:12.28 
  Need to look at that in a minute, of course if we only get hsbw then that won't give us the height14:13.09 
chrisl One downside about this is that we'll run this for every clipped instance of a glyph, until we reach an unclipped one.... but I guess that shouldn't be too onerous14:18.27 
kens Clipped glyphs aren't really that common14:18.47 
  I'm screwing up the y value somewhere :-(14:19.08 
chrisl Well, there's a lot going on in there - but you are now getting a populated bbox?14:19.41 
kens Oh yes, it's already much better14:20.32 
chrisl So, a simple matter of programming then....14:20.47 
kens Yeah I just need to figure out what I've broken :-) Thanks for the assist!14:21.03 
chrisl NP, I kind of wish I'd just gone with this the last time - 20/20 hindsight etc14:21.36 
kens Well there's no real way to see this in advance I don't think, it never occurred to me either14:22.03 
chrisl Yeh, it's just ironic: if I had been *lazier*, this issue would already have been fixed :-)14:23.11 
kens LOL14:23.19 
  OK screwup fixed.14:23.55 
  Ugly text now appears as expected, so that's a good fix. I just want to mimic it in the /.notdef case, and see if I can optimise away some of this.14:25.09 
  Think I'll do a cluster run first......14:25.18 
  Well that's unfortunate, lots of seg faults14:56.27 
  chrisl it looks like trying to do that bbox on a type 42 font (TrueType, it's a PDF) causes a crash15:03.03 
chrisl Well, that's bad - it ought to work. Which file?15:03.43 
kens gxttfb.c, gx_ttfReader_set_font, self is NULL15:03.55 
chrisl Which test file15:04.07 
kens I was using Bug692242.pdf 15:04.09 
  and the pdfwrite device obviously15:04.17 
  My debugger is giving me nonsensical results on the back trace15:15.26 
  Ah, no it's not, it's one of those stupid if ( function() || function() things15:16.05 
  OK so the 'pair' returned from fm_pair has ttr = NULL15:18.01 
chrisl Yeh, I have a horrid, hacky solution for that - but the bbox is coming back as all zeros....15:18.53 
kens Oh....15:19.20 
  I could try applying this only when the FontType is 1 or 3 15:19.36 
chrisl Um, that might be just one glyph though, the next one looks sensible....15:19.58 
kens And fallback to the FontBBox otherwise15:19.59 
  Oh!15:20.06 
chrisl kens: So, when I said hacky.... the change in base/gstype42.c:15:21.49 
  http://git.ghostscript.com/?p=user/chrisl/ghostpdl.git;a=commitdiff;h=3d9fb7afbf6ac2d6365be6ed90a6485d95556248 15:21.53 
kens Well, I have no real clue what's going on there, it's all font'y15:22.29 
  But not crashing is good :-)15:22.39 
chrisl The gx_provide_fm_pair_attributes() creates the ttf reader, and the ttf reader font (really a hacked up Freetype 1 ttf font)15:24.04 
kens Hmm, does it return reasonable values ? I haven't actually checked the return from the info proc15:24.28 
  The PDF output file looks reasonable at a quick glance15:25.02 
chrisl It returns values.... I'm not totally sure what would constitute "reasonable" without going through it in a *lot* more detail15:25.23 
kens Hmm the glyph bbox is all 0 for me15:25.39 
chrisl In Bug692242.pdf?15:26.00 
kens Yes15:26.04 
  Page 2 15:26.06 
  First time it hits that piece of code in gstype42.c15:26.21 
  When I get back to process_text_estimate_bbox the info.bbox is all 0 15:26.45 
chrisl Right, but the second and subsequent times15:26.47 
kens Hmm, haven't checked the second hit.15:27.18 
  But regardless, I still need to check if it's all 0 for any result15:27.30 
  Unless you think that all 0 is correct ? I suppose it could be a .notdef or something15:27.45 
  Hmm yeah second and subsequent glyphs are OK15:28.36 
  maybe it's just a real stupid glyph15:28.44 
  I'll give a cluster run a bash15:28.53 
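The all-zero check and the FontBBox fallback kens mentions could be combined along these lines. This is a sketch under assumptions: the types and function names are hypothetical, and treating an all-zero glyph bbox the same way as a failed query is an illustrative choice, not something the log settles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical box type for illustration; not a Ghostscript structure. */
typedef struct { double llx, lly, urx, ury; } bbox;

/* An all-zero box, as returned here for a zero-length glyph. */
static bool bbox_is_all_zero(bbox b)
{
    return b.llx == 0 && b.lly == 0 && b.urx == 0 && b.ury == 0;
}

/* Prefer the per-glyph bbox; fall back to the FontBBox when the glyph
 * query failed (have_glyph_bbox == false) or returned an empty box. */
static bbox choose_bbox(bool have_glyph_bbox, bbox glyph_bbox, bbox font_bbox)
{
    assert(font_bbox.urx >= font_bbox.llx && font_bbox.ury >= font_bbox.lly);
    if (have_glyph_bbox && !bbox_is_all_zero(glyph_bbox))
        return glyph_bbox;
    return font_bbox;
}
```

For a genuinely empty glyph there is nothing to draw, so which box is used should not change the visible output; the fallback mainly keeps the later clip test from working with a degenerate box.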
chrisl So, the first glyph used is glyph_index 0, and the size of the glyph is 0 bytes, so.....15:29.03 
  Sorry, glyph_index 1 15:29.10 
kens NP15:29.14 
  Dumb glyphs deserve to be treated that way15:29.27 
chrisl I think we used to throw an error for that, and I changed it to treat it as an empty glyph15:30.03 
kens Oh, OK well I'm not going to worry about it too much. Just annoying we had to trip over that first!15:30.27 
chrisl I *suspect* it's a dumb way to subset a TTF15:30.43 
kens OK well the seg faults go away, which is good. 200 diffs to look at15:50.45 
  Well Altona_Technical_v20_x4.pdf comes out wrong :-(16:03.01 
  Looks like some clipping is occurring16:03.20 
  Text is certainly disappearing, a job for tomorrow I think16:03.52 