Ghostscript IRC logs

Log of #ghostscript at irc.freenode.net.

	<<<Back 1 day (to 2015/11/08)	20151109
kens	Hmm so a problem that doesn't exhibit with a public device, only with a custom one, but it's a bug in Ghostscript? Well maybe.....	08:58.36
	chrisl I just finished up a bug, and was about to take a quick poke at Zoltan's one, unless you are already looking at it?	10:38.24
chrisl	kens: I'm not looking at it, no....	10:39.17
kens	OK I'll take a quick stab then	10:39.29
	Though it looks more like Ray's area.....	10:39.37
chrisl	TBH, my first thought was that somewhere we don't protect against the band size reaching zero.....	10:41.12
kens	It's possible, I'll have to start off by modifying the makefile to build in the device, and then I'll look at it. I'm a bit puzzled that a regular device does not exhibit the same problem though, suspicious to my mind, but we'll see....	10:42.02
chrisl	I should have said someone isn't protecting against the band size reaching zero	10:43.07
kens	:-)	10:43.16
	I'll make a start on it, even if it's Ray's I may be able to save him some time	10:43.40
	Well the first thing I notice is that his device includes its own create_compositor method, I wouldn't be at all surprised if that's the culprit right there.	10:55.39
	Looks like Zoltan's problem is that at a low enough resolution (2200 dpi) a pattern does not get written as a clist. At a higher resolution, it does. And if it's a clist, it (possibly) doesn't work. He could probably increase some threshold somewhere to avoid the clist.	11:23.20
chrisl	But the pattern clist should work....	11:24.38
kens	Agreed, but it doesn't (or if it does it takes a stupendously long time)	11:25.01
	Just an observation at the moment	11:25.10
	I'm going to reduce the PDF file, it decompresses to a mere 42MB	11:25.31
	Well... it looks like pattern clists are just abominably slow to be honest	12:08.15
chrisl	Hmm, that's not good :-(	12:09.27
kens	The pattern itself is quite complex, so I reduced it to a simpler fill. That now runs 'quicker' but it's still terribly slow	12:09.59
chrisl	Is there anything in the pattern that requires a compositor device?	12:11.13
kens	Not really, the compositor seems to be only there for overprinting. I may already have removed that	12:11.37
	I guess I need to check	12:11.45
chrisl	I just wondered if the device using its own push compositor meant we didn't use an optimisation of some sort	12:12.29
kens	It's entirely possible	12:12.38
	The answer appears to be that there is no longer anything calling the create_compositor, when I get done with this run I'll put a break point and check for certain	12:14.07
	It does set a custom halftone but that appears to be all	12:15.50
	OK so there are now no calls to create_compositor, that's a definite red herring. I've removed all the offending GStates	12:23.27
	Simplifying the pattern improves the speed, but it's still horrendously slow	12:23.53
	I think this is one for Ray to profile, it just seems to me that pattern clists are very, very slow	12:24.15
chrisl	So, running the file with, say, psdcmyk doesn't trigger the same problem?	12:25.09
kens	Haven't tried yet, I was about to, I was just diff'ing the very slow and excruciatingly slow files to make sure I hadn't done something dumb	12:25.44
	OK time to try a different device.	12:26.19
	Other devices use tile_colored_fill instead of tile_pattern_clist, I don't know why yet. However that route is (comparatively) fast. I used psdcmyk in order to get the separations, even though I don't think we're actually setting any. However tiff24nc behaved exactly the same.	12:30.01
	I need to find out where the method gets set. Going to grab some lunch quickly now though	12:30.39
	So, not exactly a Ghostscript bug. But the performance of pattern clists does look like a worry.	14:22.53
Robin_Watts	kens: Did you figure out what caused it to use the pattern clist rather than the tile_colored_fill?	14:24.10
kens	Yes I put it in the bug, the customer's device has a bit depth of 64 instead of the normal maximum of 32	14:24.39
	This means we need a bitmap tile twice the size to hold the rendered pattern	14:24.54
	Since they didn't increase the MaxPatternBitmapSize, it exceeded that and went to clists	14:25.18
	Double the size (to match the bit depth) and the problem goes away	14:25.33
Robin_Watts	kens: Ah, sorry, hadn't seen the bug.	14:26.59
kens	Yeah it can take time for the mail to arrive	14:27.19
	I'm just testing a standard device (tiff24nc) with -dMaxPatternBitmap=3000000 and I see the same problem. The normal size is 10,000,000	14:28.09
	So I'm pretty sure this is the root of the performance drop	14:28.32
	Whether it's reasonable I'm less sure. We are ending up drawing a lot of pattern tiles by laboriously executing a (quite complex) pattern over and over, but it does seem slow all the same, which is why I haven't closed it, but given it to Ray	14:29.19
	Tiff turned out to be a bad choice, can't write a file that big :-)	14:30.33
	I guess the fact that the customer has so many planes may slow it down some as well	14:30.58
henrys	funny when the code crashes in the typical ghostscript macro, in gdb you do mac expand to see what the macro does and then say to yourself, well okay I need a different approach to debugging this ...	15:14.27
	mac expand WRITE_UNALIGNED(WRITE_OR, WRITE_OR_MASKED)	15:14.30
	expands to: bits = (case_right ? ((skew) < 8 ? (((((const bits16 *)(bptr))[0]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr))[0]) << (cskew)) : ((bits16)*(const byte *)(bptr) << (cskew)) & 0xff00) : ((cskew) < 8 ? (((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) >> (skew)) + (((bits16)(((const byte *)(bptr -	15:14.30
	((int)(sizeof(bits16)))))[2]) << (cskew)) & 0xff00) : ((((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[0]) & 0xff00) >> (skew)) & 0xff) + (((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[1]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr - ((int)(sizeof(bits16)))))[1]) << (cskew)))); ((bits16 *)dbptr)[0] |= (((bits) ^ invert) & mask); while ( count >= (((int)(sizeof(bits16)))*8) ) { bits = ((cskew) < 8 ?	15:14.30
	(((((const bits16 *)(bptr))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr))[0]) >> (skew)) + (((bits16)(((const byte *)(bptr))[2]) << (cskew)) & 0xff00) : ((((((const bits16 *)(bptr))[0]) & 0xff00) >> (skew)) & 0xff) + (((((const bits16 *)(bptr))[1]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr))[1]) << (cskew))); bptr += ((int)(sizeof(bits16))); dbptr += ((int)(sizeof(bits16))); *((bits16 *)dbptr) |=	15:14.33
	((bits) ^ invert); count -= (((int)(sizeof(bits16)))*8); } if ( count > 0 ) { bits = ((cskew) < 8 ? (((((const bits16 *)(bptr))[0]) << (cskew)) & left_masks2[cskew]) + ((((const bits16 *)(bptr))[0]) >> (skew)) : (((((const bits16 *)(bptr))[0]) & 0xff00) >> (skew)) & 0xff); if ( count > skew ) bits += ((skew) < 8 ? (((((const bits16 *)(bptr + ((int)(sizeof(bits16)))))[0]) >> (skew)) & right_masks2[skew]) + ((((const bits16 *)(bptr +	15:14.33
	((int)(sizeof(bits16)))))[0]) << (cskew)) : ((bits16)*(const byte *)(bptr + ((int)(sizeof(bits16)))) << (cskew)) & 0xff00); ((bits16 *)dbptr)[1] |= (((bits) ^ invert) & rmask); }	15:14.36
sebras	henrys: beautiful! :)	15:16.48
henrys	that bug is clear as day now...	15:17.38
Robin_Watts	but, to be fair, it probably compiles down to 3 instructions :)	15:27.51
henrys	Robin_Watts: right I should just generate assembly and debug that.	15:28.21
Robin_Watts	You want something that only partially expands the macros.	15:28.57
	Or allows for the simplifications allowed by the constant values at that point.	15:29.25
chrisl	Or sane macros.....	15:29.47
henrys	I studied this ages ago and there was some reason it didn't fit nicely into an inline function.	15:30.02
	but I don't remember what the reason was	15:30.20
chrisl	Possibly because the macro takes macros as parameters	15:30.46
	Actually, if you do macro expand WRITE_UNALIGNED(a, b) it should expand just the first level	15:38.26
henrys	chrisl: no it seems to be expanding everything for me.	15:42.38
kens	manual expansion in the code	15:43.11
chrisl	Sorry, what I meant was it'll not expand WRITE_OR and WRITE_OR_MASKED	15:44.13
henrys	chrisl: With the work I was doing it was easy to get the old language switch working with a separate instance for ps/pdf and I'm tripping over this which is really stinking of a global somewhere. Like the font cache maybe?	15:48.14
chrisl	The font cache isn't in a global	15:49.17
henrys	chrisl: no device sharing everything is separate. But I might have screwed up somewhere.	15:49.22
chrisl	henrys: does it happen with all/most devices?	15:53.06
henrys	chrisl: btw I'm just babbling I can work on it. No display device works ljet4 does not ...	15:53.47
	chrisl: ppmraw is wrong - any mono device seems to trip it up	15:55.41
	sorry ppmraw is good ...	15:55.50
chrisl	Well, I'd have to guess that's in the rendering code - nothing in the font caching code should be different for contone vs mono	15:57.20
henrys	can we write PS and PCL front ends for mupdf add support to MuPDF for high level devices and move on? ;-) I guess not.	15:57.59
chrisl	Given that we can't even tidy up the ghostscript apis......	15:59.45
henrys	chrisl: a lot of it is to do with PS, I get it... but ghostscript kicks you at every turn, I can't believe jaws was as difficult to change.	16:02.33
chrisl	henrys: no it wasn't. But Jaws also got a hefty revision internally and APIs every five years or so - tossing out the crap, then fixing the hacks	16:03.44
	Jaws made no pretence about maintaining compatibility back to 1989......	16:05.28
mvrhel_laptop	I am going to be out most of this morning. Need to pick up my father at the airport	16:44.30
