IRC Logs

Log of #ghostscript at irc.freenode.net.

20151109 
kens Hmm so a problem that doesn't exhibit with a public device, only with a custom one, but it's a bug in Ghostscript? Well maybe.....08:58.36 
  chrisl: I just finished up a bug, and was about to take a quick poke at Zoltan's one, unless you are already looking at it?10:38.24 
chrisl kens: I'm not looking at it, no....10:39.17 
kens OK I'll take a quick stab then10:39.29 
  Though it looks more like Ray's area.....10:39.37 
chrisl TBH, my first thought was that somewhere we don't protect against the band size reaching zero.....10:41.12 
kens It's possible, I'll have to start off by modifying the makefile to build in the device, and then I'll look at it. I'm a bit puzzled that a regular device does not exhibit the same problem though, suspicious to my mind, but we'll see....10:42.02 
chrisl I should have said *someone* isn't protecting against the band size reaching zero10:43.07 
kens :-)10:43.16 
  I'll make a start on it, even if it's Ray's I may be able to save him some time10:43.40 
  Well the first thing I notice is that his device includes its own create_compositor method, I wouldn't be at all surprised if that's the culprit right there.10:55.39 
  Looks like Zoltan's problem is that at a low enough resolution (2200 dpi) a pattern does not get written as a clist. At a higher resolution, it does. And if it's a clist, it (possibly) doesn't work. He could probably increase some threshold somewhere to avoid the clist.11:23.20 
chrisl But the pattern clist should work....11:24.38 
kens Agreed, but it doesn't (or if it does it takes a stupendously long time)11:25.01 
  Just an observation at the moment11:25.10 
  I'm going to reduce the PDF file, it decompresses to a mere 42MB11:25.31 
  Well... it looks like pattern clists are just abominably slow to be honest12:08.15 
chrisl Hmm, that's not good :-(12:09.27 
kens The pattern itself is quite complex, so I reduced it to a simpler fill. That now runs 'quicker' but it's still terribly slow12:09.59 
chrisl Is there anything in the pattern that requires a compositor device?12:11.13 
kens Not really, the compositor seems to be only there for overprinting. I may already have removed that12:11.37 
  I guess I need to check12:11.45 
chrisl I just wondered if the device using its own push compositor meant we didn't use an optimisation of some sort12:12.29 
kens It's entirely possible12:12.38 
  The answer appears to be that there is no longer anything calling the create_compositor, when I get done with this run I'll put a break point and check for certain12:14.07 
  It does set a custom halftone but that appears to be all12:15.50 
  OK so there are now no calls to create_compositor, that's a definite red herring. I've removed all the offending GStates12:23.27 
  Simplifying the pattern improves the speed, but it's still horrendously slow12:23.53 
  I think this is one for Ray to profile, it just seems to me that pattern clists are very, very slow12:24.15 
chrisl So, running the file with, say, psdcmyk doesn't trigger the same problem?12:25.09 
kens Haven't tried yet, I was about to, I was just diff'ing the very slow and excruciatingly slow files to make sure I hadn't done something dumb12:25.44 
  OK time to try a different device.12:26.19 
  Other devices use tile_colored_fill instead of tile_pattern_clist, I don't know why yet. However that route is (comparatively) fast. I used psdcmyk in order to get the separations, even though I don't think we're actually setting any. However tiff24nc behaved exactly the same.12:30.01 
  I need to find out where the method gets set. Going to grab some lunch quickly now though12:30.39 
  So, not exactly a Ghostscript bug. But the performance of pattern clists does look like a worry.14:22.53 
Robin_Watts kens: Did you figure out what caused it to use the pattern clist rather than the tile_colored_fill?14:24.10 
kens Yes I put it in the bug, the customer's device has a bit depth of 64 instead of the normal maximum of 3214:24.39 
  This means we need a bitmap tile twice the size to hold the rendered pattern14:24.54 
  Since they didn't increase the MaxPatternBitmap, it exceeded that and went to clists14:25.18 
  Double the size (to match the bit depth) and the problem goes away14:25.33 
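
The reasoning above (a 64-bit-deep device needs a tile twice as large as a 32-bit one, so the same pattern can cross the bitmap limit and fall back to the clist) can be sketched in C. This is only an illustration under stated assumptions: `tile_bitmap_bytes` and `pattern_uses_clist` are hypothetical names, not Ghostscript functions, and the real size calculation may include row padding or other terms not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a Ghostscript function): bytes needed to hold
 * a rendered width x height pattern tile at depth_bits bits per pixel. */
static size_t tile_bitmap_bytes(size_t width, size_t height, unsigned depth_bits)
{
    size_t raster = (width * depth_bits + 7) / 8; /* bytes per row, rounded up */
    return raster * height;
}

/* Hypothetical decision: if the tile does not fit in the configured limit,
 * the pattern is assumed to be stored as a clist instead of a bitmap. */
static bool pattern_uses_clist(size_t width, size_t height,
                               unsigned depth_bits, size_t max_pattern_bitmap)
{
    return tile_bitmap_bytes(width, height, depth_bits) > max_pattern_bitmap;
}
```

With a made-up 1500 x 1500 tile and a 10,000,000 byte limit, `pattern_uses_clist(1500, 1500, 32, 10000000)` is false (9,000,000 bytes), while the same tile at depth 64 needs 18,000,000 bytes and is true; doubling the limit to 20,000,000 makes it false again.
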
Robin_Watts kens: Ah, sorry, hadn't seen the bug.14:26.59 
kens Yeah it can take time for the mail to arrive14:27.19 
  I'm just testing a standard device (tiff24nc) with -dMaxPatternBitmap=3000000 and I see the same problem. The normal size is 10,000,00014:28.09 
  So I'm pretty sure this is the root of the performance drop14:28.32 
  Whether it's reasonable I'm less sure. We are ending up drawing a lot of pattern tiles by laboriously executing a (quite complex) pattern over and over, but it does seem slow all the same, which is why I haven't closed it, but given it to Ray14:29.19 
  Tiff turned out to be a bad choice, can't write a file that big :-)14:30.33 
  I guess the fact that the customer has so many planes may slow it down some as well14:30.58 
henrys funny when the code crashes in the typical ghostscript macro, in gdb you do mac expand to see what the macro does and then say to yourself, well okay I need a different approach to debugging this ...15:14.27 
  mac expand WRITE_UNALIGNED(WRITE_OR, WRITE_OR_MASKED)15:14.30 
  expands to: bits = (case_right ? ((skew) < 8 ? (((((const bits16 *)(bptr))[0]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr))[0]) << (cskew)) : ((bits16)*(const byte *)(bptr) << (cskew)) & 0xff00) : ((cskew) < 8 ? (((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) >> (skew)) + (((bits16)(((const byte *)(bptr -15:14.30 
  ((int)(sizeof(bits16)))))[2]) << (cskew)) & 0xff00) : ((((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) & 0xff00) >> (skew)) & 0xff) + (((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[1]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[1]) << (cskew)))); ((bits16 *)dbptr)[0] |= (((bits) ^ invert) & mask); while ( count >= (((int)(sizeof(bits16)))*8) ) { bits = ((cskew) < 8 ?15:14.30 
  (((((const bits16 *)(bptr))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr))[0]) >> (skew)) + (((bits16)(((const byte *)(bptr))[2]) << (cskew)) & 0xff00) : ((((((const bits16 *)(bptr))[0]) & 0xff00) >> (skew)) & 0xff) + (((((const bits16 *)(bptr))[1]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr))[1]) << (cskew))); bptr += ((int)(sizeof(bits16))); dbptr += ((int)(sizeof(bits16))); *((bits16 *)dbptr) |=15:14.33 
  ((bits) ^ invert); count -= (((int)(sizeof(bits16)))*8); } if ( count > 0 ) { bits = ((cskew) < 8 ? (((((const bits16 *)(bptr))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr))[0]) >> (skew)) : (((((const bits16 *)(bptr))[0]) & 0xff00) >> (skew)) & 0xff); if ( count > skew ) bits += ((skew) < 8 ? (((((const bits16 *)(bptr + ((int)(sizeof(bits16)))))[0]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr +15:14.33 
  ((int)(sizeof(bits16)))))[0]) << (cskew)) : ((bits16)*(const byte *)(bptr + ((int)(sizeof(bits16)))) << (cskew)) & 0xff00); ((bits16 *)dbptr)[1] |= (((bits) ^ invert) & rmask); }15:14.36 
sebras henrys: beautiful! :)15:16.48 
henrys that bug is clear as day now...15:17.38 
Robin_Watts but, to be fair, it probably compiles down to 3 instructions :)15:27.51 
henrys Robin_Watts: right I should just generate assembly and debug that.15:28.21 
Robin_Watts You want something that only partially expands the macros.15:28.57 
  Or allows for the simplifications allowed by the constant values at that point.15:29.25 
chrisl Or sane macros.....15:29.47 
henrys I studied this ages ago and there was some reason it didn't fit nicely into an inline function.15:30.02 
  but I don't remember what the reason was15:30.20 
chrisl Possibly because the macro takes macros as parameters15:30.46 
  Actually, if you do macro expand WRITE_UNALIGNED(a, b) it should expand just the first level15:38.26 
henrys chrisl: no it seems to be expanding everything for me.15:42.38 
kens manual expansion in the code15:43.11 
chrisl Sorry, what I meant was it'll not expand WRITE_OR and WRITE_OR_MASKED15:44.13 
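
To illustrate the point about macros that take macros as parameters, here is a minimal C sketch. All names (`STORE_OR`, `STORE_OR_MASKED`, `WRITE_ROW`, `write_row`) are hypothetical stand-ins and are far simpler than the real WRITE_UNALIGNED; the inline version passes the operations as function pointers, which is one possible rewrite and not necessarily what the Ghostscript code could use.

```c
#include <stdint.h>

/* Hypothetical "operation" macros, loosely in the spirit of WRITE_OR and
 * WRITE_OR_MASKED; they are not the real Ghostscript definitions. */
#define STORE_OR(dst, bits)              (*(dst) |= (bits))
#define STORE_OR_MASKED(dst, bits, mask) (*(dst) |= ((bits) & (mask)))

/* A macro that takes other macros as parameters: 'store' and 'store_masked'
 * are pasted in by name and only expanded on rescan. Requires n >= 1. */
#define WRITE_ROW(store, store_masked, dst, src, n, mask)        \
    do {                                                         \
        int i_;                                                  \
        for (i_ = 0; i_ < (n) - 1; i_++)                         \
            store(&(dst)[i_], (src)[i_]);                        \
        store_masked(&(dst)[(n) - 1], (src)[(n) - 1], (mask));   \
    } while (0)

static void or_row_macro(uint16_t *dst, const uint16_t *src, int n, uint16_t mask)
{
    WRITE_ROW(STORE_OR, STORE_OR_MASKED, dst, src, n, mask);
}

/* Function-based equivalent: the operations become function pointers,
 * because a function cannot accept a macro name as an argument. */
typedef void (*store_fn)(uint16_t *dst, uint16_t bits);
typedef void (*store_masked_fn)(uint16_t *dst, uint16_t bits, uint16_t mask);

static void store_or(uint16_t *dst, uint16_t bits) { *dst |= bits; }
static void store_or_masked(uint16_t *dst, uint16_t bits, uint16_t mask)
{
    *dst |= (uint16_t)(bits & mask);
}

/* Requires n >= 1, same as the macro version. */
static inline void write_row(store_fn store, store_masked_fn store_masked,
                             uint16_t *dst, const uint16_t *src, int n,
                             uint16_t mask)
{
    int i;
    for (i = 0; i < n - 1; i++)
        store(&dst[i], src[i]);
    store_masked(&dst[n - 1], src[n - 1], mask);
}

static void or_row_inline(uint16_t *dst, const uint16_t *src, int n, uint16_t mask)
{
    write_row(store_or, store_or_masked, dst, src, n, mask);
}
```

Both `or_row_macro` and `or_row_inline` OR every source word into the destination and apply the mask only to the last word. The function version is much easier to step through in a debugger, at the cost of relying on the compiler to inline the calls through the pointers.
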
henrys chrisl: With the work I was doing it was easy to get the old language switch working with a separate instance for ps/pdf and I'm tripping over this which is really stinking of a global somewhere. Like the font cache maybe?15:48.14 
chrisl The font cache isn't in a global15:49.17 
henrys chrisl: no device sharing, everything is separate. But I might have screwed up somewhere.15:49.22 
chrisl henrys: does it happen with all/most devices?15:53.06 
henrys chrisl: btw I'm just babbling, I can work on it. No, display device works, ljet4 does not ...15:53.47 
  chrisl: ppmraw is wrong - any mono device seems to trip it up15:55.41 
  sorry ppmraw is good ...15:55.50 
chrisl Well, I'd have to guess that's in the rendering code - nothing in the font caching code should be different for contone vs mono15:57.20 
henrys can we write PS and PCL front ends for mupdf, add support to MuPDF for high level devices, and move on? ;-) I guess not.15:57.59 
chrisl Given that we can't even tidy up the ghostscript apis......15:59.45 
henrys chrisl: a lot of it is to do with PS, I get it... but ghostscript kicks you at every turn, I can't believe jaws was as difficult to change.16:02.33 
chrisl henrys: no it wasn't. But Jaws also got a *hefty* revision internally and APIs every five years or so - tossing out the crap, fixing the hacks16:03.44 
  Jaws made no pretence about maintaining compatibility back to 1989......16:05.28 
mvrhel_laptop I am going to be out most of this morning. Need to pick up my father at the airport16:44.30 