Ghostscript IRC logs

		20180418
JasonSilver	I have a question that is ghostscript related, but may not be relevant. This one-liner isn't inserting new lines... what am I doing wrong?	00:43.01
	If I wanted to insert a newline within a string sent to gs, how would I do that? This doesn't work: gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=out.pdf <(enscript <<<"First Line\nNextLine\nLast Line" -p - --no-header --font=Times25 --margins=20:20:200:0 )	00:43.06
chrisl	JasonSilver: (for the logs) That sounds like an enscript question rather than a Ghostscript one - enscript isn't us....	06:31.28
	enscript expects an actual newline character (not "\n") - so something like:	06:45.14
	echo -e "First Line\nNextLine\nLast Line" | enscript -p - --no-header --font=Times25 --margins=20:20:200:0 | gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=out.pdf -	06:45.22
	The "echo -e" will convert the "\n" into a "real" newline	06:45.47
Robin_Watts	chrisl: Did you see the mail from me/shelly?	09:39.42
chrisl_laptop	Robin_Watts: I did see it - I was confused by it, but realised as I drove to the squash club why I was confused - I'll reply in a second	11:49.03
Robin_Watts	Cool, thanks.	11:49.30
chrisl_laptop	I was confused about why we'd ever free a struct we were unsharing.....	11:50.06
Robin_Watts	chrisl: It can only happen if the struct we are unsharing has a single reference.	11:52.47
	but then if it had a single reference, why would it be "shared" ?	11:53.21
chrisl_laptop	Yeh, it's because it also accounts for the allocator	11:53.39
Robin_Watts	I found myself disquieted by it in a way that I couldn't put my finger on.	11:53.57
	If we unshare an object, surely the point is that if it's shared, we should copy the contents of the old object into the new one?	11:54.28
	so our new object still contains the old object's contents.	11:54.54
	But I can't see that happening in the code, so maybe my expectation of what "unshare" does is wrong.	11:55.17
chrisl_laptop	The truth is that code is hopelessly inadequate, and I'd be very unhappy about relying on it anywhere (unrelated to Shelly's change)	11:56.27
Robin_Watts	chrisl_laptop: Right, but AIUI, we could take on Shelly's change now safely, and open a bug to remind us to fix the wider issues later?	11:58.20
	That would leave us no worse off (well, better off in that a leak is fixed, and we have a note to tell us to fix it)	11:59.00
chrisl_laptop	Um, well, having said okay, I reckon I've spotted a problem :-(	11:59.26
Robin_Watts	oops.	11:59.36
chrisl_laptop	Oh, no - sorry, confused by it being a macro rather than a function	12:00.04
	So, here's what I see as the problem with that code: if we unshare a struct with ref_cnt == 1 but with a different allocator, we'll decrement the reference count, free the original structure, and possibly all its children.	12:02.11
	That's unrelated to Shelly's change	12:02.30
Robin_Watts	chrisl_laptop: Why is that a problem?	12:03.33
	The very moment you unshare something, you can't rely on the old pointer to it working anymore.	12:03.54
chrisl_laptop	Well, given that we don't actually copy the contents over, it doesn't matter now	12:04.09
Robin_Watts	all you can know is that the new pointer is valid (but the contents are undefined)	12:04.12
	Yeah.	12:04.16
chrisl_laptop	Which feels wron	12:04.24
	g	12:04.26
Robin_Watts	unshare isn't doing what I would expect from the name, but what it is doing, feels safe.	12:04.41
chrisl_laptop	Yeh, it's either not doing what it should, or it's badly named	12:05.12
Robin_Watts	or both :)	12:05.22
chrisl_laptop	:-)	12:05.31
	But none of that impacts the proposed change	12:05.40
	gsht.c: "the rc_unshare_struct macro only ensures that a unique instance of the top-level structure is created, not that any substructure references are updated."	12:14.04
	So it seems that's how it's supposed to work.....	12:14.13
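A minimal C sketch of the behaviour as it is described in the exchange above. All names here (`rc_obj`, `rc_obj_new`, `rc_obj_unshare`, `allocator_id`) are hypothetical and are not the real `rc_unshare_struct` macro. It only illustrates the semantics the conversation settles on: the caller ends up with a unique top-level instance, the old contents are not copied, and the old object is freed if its last reference is dropped.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object, for illustration only. */
typedef struct {
    long ref_count;
    int  allocator_id;  /* stands in for "which allocator owns this" */
    int  payload;
} rc_obj;

rc_obj *rc_obj_new(int allocator_id)
{
    rc_obj *o = malloc(sizeof(*o));
    if (o != NULL) {
        o->ref_count = 1;
        o->allocator_id = allocator_id;
        o->payload = 0;  /* zeroed only to keep the sketch deterministic */
    }
    return o;
}

/* Return an object uniquely owned by the caller under allocator_id.
 * The contents of the old object are NOT copied into the new one. */
rc_obj *rc_obj_unshare(rc_obj *old, int allocator_id)
{
    rc_obj *fresh;

    if (old != NULL && old->ref_count == 1 &&
        old->allocator_id == allocator_id)
        return old;          /* already unique: nothing to do */

    fresh = rc_obj_new(allocator_id);
    if (fresh == NULL)
        return NULL;         /* old reference is left untouched */
    if (old != NULL && --old->ref_count == 0)
        free(old);           /* single reference, different allocator */
    return fresh;
}
```

After a successful call the old pointer must be treated as dead, which matches the point made above that only the new pointer can be relied on.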
kens	chrisl that was quick work	12:19.50
	both commits look fine to me, don't know if you want to wait for a go ahead from Henry	12:20.04
chrisl_laptop	kens: It wasn't terribly hard... actually removing the text_enum references is rather more (tedious) work	12:20.28
kens	Yeah I can imagine	12:20.37
	I like the change to the TEXT_NO_CACHE too, much better way to do the job	12:20.53
chrisl_laptop	Well, as I said, better to fix the API so it does what we need, rather than just go around it	12:21.36
kens	Definitely, the more that happens the more of these kinds of problems will occur	12:21.55
chrisl_laptop	I wonder if the uses of text_enum are regular and sane enough to just do some awk silliness to replace them all.....	12:22.47
kens	Be reasonable, this is Ghostscript we're talking about	12:23.03
chrisl_laptop	I'm trying out optimism.... it seems to be causing a lot of confusion to those around me	12:23.27
kens	chrisl has been kidnapped by aliens!	12:24.19
chrisl_laptop	Yep. that's the kind of confusion I'm getting a lot of	12:24.53
Robin_Watts	chrisl: Visual Studio: Ctrl-Shift-F, global replace.	12:27.20
	:)	12:27.22
	kens, chrisl (For the logs), and anyone else interested: Following up from the conversation yesterday, here are some scribblings about the new device method I'm pondering.	14:11.52
	https://twiki.ghostscript.com/do/view/Ghostscript/PixelPatchDeviceMethod	14:11.56
	Any comments before I spend ages diving down a blind alley would be much appreciated.	14:12.15
kens	Well, apart from the fact that I'm against further extending the device interface :-)	14:12.39
	How is the graphics library going to know that the pixels it has are in the correct format for the device ?	14:14.17
	I hate the idea of using an enumerator, the text and image ones cause immense problems	14:14.55
Robin_Watts	by "in the correct format for the device" I mean are "packed colour values".	14:14.56
kens	OK I misunderstood	14:15.06
Robin_Watts	I was unclear. I will fix that.	14:15.13
	The basic thing is that if I'm writing to an 8 bit device, I want 8 bit color values in an array. A 24 bit color device would have 24bit pixels in the array etc.	14:16.23
kens	OK, but how is the caller going to know what the device expects ?	14:16.43
Robin_Watts	dev->colorinfo	14:16.49
kens	Hmm, didn't know we had that	14:16.59
	How is this going to work with high level devices ?	14:17.12
Robin_Watts	color_info.	14:17.34
kens	Or are we never going to call this for those ?	14:17.37
Robin_Watts	kens: In exactly the way it does now.	14:17.45
kens	I don't see how color_info helps for high level devices	14:17.53
Robin_Watts	This will get called and will get broken down to rectangles.	14:17.55
kens	Hmm, well currently that ought never to happen for pdfwrite and co.	14:18.20
Robin_Watts	right.	14:18.29
	I'm open to ideas about how to do this without an enumerator.	14:18.58
kens	I'd have to think about it, I've not been paying attention, but the enumerators we do have cause endless trouble.	14:19.26
	For me, anyway	14:19.35
Robin_Watts	I take your point about "it would be nice not to have to extend the device interface", but unless we extend it, we'll never manage to do this.	14:19.53
kens	I'm not entirely unwilling to extend it, but I'd like to know there's a decent benefit to be had by doing so. If this is rarely going to be used, or doesn't provide much benefit when it is used, I'm unkeen on going down that road	14:20.47
Robin_Watts	For the thing that originally drove this idea (that we can't talk about in this channel), there is an "init", "data", "end" structure to it, which fits into the way we've done enumerators.	14:21.10
kens	Because I believe our existing interface is a huge barrier to anyone (including us, IMO) writing devices	14:21.16
	I'd rather add a single device method with a 'call type' of init, data and end	14:21.50
Robin_Watts	kens: Yes, this would have to show a definite improvement to justify it.	14:21.55
kens	ie pixel_patch (dev, void*data, enum reason)	14:22.38
Robin_Watts	Or 3 separate device procs, equivalently.	14:22.45
kens	Or something like that.	14:22.48
	Yeah but again I'd prefer to have one device proc and a 'reason' flag or something	14:23.04
Robin_Watts	But the problem with that is there is nowhere (visible) to stash other information.	14:23.05
kens	opaque pointer to a structure	14:23.19
Robin_Watts	Though I bet the "visibility" of stuff in the enumerator is your issue ?	14:23.28
kens	pass it as a void * and it can change depending on the 'reason' code	14:23.30
	No my problem with the enumerator is just that it's a royal pain if you have intermediate 'devices'	14:23.57
	I prefer a single device method because its less confusing to people	14:24.21
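A rough C sketch of the single-method shape kens describes: one entry point, a 'reason' code, and an opaque pointer whose meaning depends on the reason. Everything named here (`toy_device`, `toy_pixel_patch`, `patch_reason`, the two argument structs) is invented for illustration and is not an existing or proposed Ghostscript interface.

```c
#include <stddef.h>

/* Hypothetical reason codes for the one entry point. */
typedef enum { PATCH_INIT, PATCH_DATA, PATCH_END } patch_reason;

/* What the void * points at depends on the reason (assumed layouts). */
typedef struct { int width; int height; } patch_init_args;
typedef struct { const unsigned char *row; size_t len; } patch_data_args;

/* Toy device: only tracks whether a patch is open and how much data came in. */
typedef struct { int active; size_t bytes_seen; } toy_device;

int toy_pixel_patch(toy_device *dev, void *data, patch_reason reason)
{
    switch (reason) {
    case PATCH_INIT: {
        const patch_init_args *a = data;
        if (a == NULL || a->width <= 0 || a->height <= 0)
            return -1;
        dev->active = 1;
        dev->bytes_seen = 0;
        return 0;
    }
    case PATCH_DATA: {
        const patch_data_args *a = data;
        if (!dev->active || a == NULL)
            return -1;
        dev->bytes_seen += a->len;
        return 0;
    }
    case PATCH_END:
        if (!dev->active)
            return -1;
        dev->active = 0;
        return 0;
    }
    return -1;  /* unknown reason */
}
```

The sketch also shows the cost Robin_Watts raises: the compiler cannot check that the right struct is passed for each reason, so the arguments are less clear than with three separately typed procs.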
Robin_Watts	because the 'forward' is stored in the enumerator.	14:24.24
kens	Not exactly	14:24.33
Robin_Watts	single device methods are less clear in terms of the arguments they take.	14:24.49
kens	Yes indeed, but much easier to ignore 1 method than 3 if I don't want to implement them	14:25.05
	and the interface looks less scary	14:25.11
	If we add 3 methods I think we'll break the 100 device methods barrier	14:25.23
	Certainly be close	14:25.29
Robin_Watts	Woo Hoo! Bonus time!	14:25.35
	(that's the way it works, right? :) )	14:25.45
kens	I'd rather it didn't :-(	14:25.53
HenryStiles	I do wonder if you'd get most of the speedup you'll get from this proposal simply using copy_color with no device interface changes. Depends on how much time is used making rects.	14:26.40
Robin_Watts	I found myself staring at gxdevcli.h this morning thinking "why can't we remove all the unused crap, and get rid of all the macros that make structure contents etc".	14:26.58
kens	I guess to find out we'd have to implement it both ways, which would be tedious and wasteful :-(	14:27.07
Robin_Watts	HenryStiles: We'd lose the ability to avoid the copy.	14:27.27
kens	Robin_Watts : at least some of that has been on the table for some time	14:27.31
	One day we might even find the time to do some of it	14:27.38
HenryStiles	Robin_Watts: right I don't know how expensive the copy actually is	14:27.44
Robin_Watts	kens: Yeah, the answer is "cos it'd break every device out there".	14:27.47
kens	I know, we'd have to fix them	14:27.56
Robin_Watts	kens: For commercial customers it'd be a pain.	14:28.24
kens	Yes, but its not like we do it often	14:28.32
	And is an argument for only doing it once	14:28.42
HenryStiles	Robin_Watts: and special banding commands to implement this? Or it will only work with high level images?	14:29.03
Robin_Watts	The clist would need to be extended to work with this, but I don't think that's hard.	14:29.25
kens	O.O	14:29.31
Robin_Watts	Currently, for high level images, the data goes into the clist, and only gets rendered/converted to rectangles in the render phase. So we'd be fine with high level images.	14:31.47
HenryStiles	right	14:31.59
Robin_Watts	For non-high level images, the data goes into the renderer and instead of producing umpteen million "change color, fill_rect" pairs, we'd put the data blocks into the clist.	14:32.36
HenryStiles	and I think we get most of the speedup we want using high level images so, maybe we could skip band commands in a first round of this	14:32.54
Robin_Watts	The first version of this would have the clist call the default one, so it'd behave exactly as before, yes.	14:33.41
	My thought was that I'd try to extract a default implementation of this by refactoring code that we already have, and hook that up.	14:36.27
	Hopefully it should be only insignificantly slower.	14:36.41
	Then I could try to optimise some common cases (in particular the 8 bit grey, 24bit rgb cases used by PCL).	14:37.13
	And that's when we'd see if this was a win or not.	14:37.53
HenryStiles	Robin_Watts: so the image code would do this inspecting the current ctm and color model? Not each language client.	14:42.35
Robin_Watts	yes.	14:42.45
	the interpreters would be unchanged.	14:42.51
HenryStiles	I'm sold :-)	14:42.57
Robin_Watts	kens: https://twiki.ghostscript.com/do/view/Ghostscript/PixelPatchDeviceMethod	15:40.05
	kens: Fourth version hopefully addresses your comments.	15:40.32
kens	I'd prefer that, but I understand others may not. Anything that avoids enumerators gets my vote, even if it means more device methods.	15:40.49
	But personally I'd vote for #4	15:41.27
Robin_Watts	It's a single device method, and it's less clear as a result, but it should be doable.	15:41.32
*kens*	wonders if Chrisl is squashing	15:41.39
Robin_Watts	MOTing.	15:41.49
kens	Oh really ? I obviously didn't read the logs	15:42.00
*kens*	smacks wrist	15:42.13
chrisl_laptop	I'm here - garage has wifi	15:42.22
kens	:-D	15:42.29
Robin_Watts	kens: it was in #mupdf :)	15:42.42
chrisl_laptop	And as the car is still 6 feet in the air, I'm guessing the MOT isn't done yet	15:42.57
kens	I forgot to read all the logs when I got back, I'm catching up now.	15:42.57
	I was too focussed on the security report....	15:43.12
chrisl_laptop	The only thought I had was further to something Robin_Watts said yesterday - have a spec_op that returns a list of callbacks to do the direct pixel writing. That would avoid a new device method for something I feel is not going to be that widely used	15:45.26
kens	But it has the same disadvantage as an enumerator, stuff happens behind our back	15:45.57
Robin_Watts	chrisl_laptop: That would break kens' "no enumerators" wish.	15:45.58
	And (if this works) it'll be used a lot.	15:46.11
kens	I doubt it really matters in this case, but I'd like to stop doing enumerator or enumerator-like things	15:46.25
Robin_Watts	(for every image)	15:46.28
chrisl_laptop	Every image?	15:46.39
Robin_Watts	pretty much, I think, if I do it right.	15:47.02
	(every image to a raster device)	15:47.22
chrisl_laptop	Couldn't we just fix the existing image path to handle it, then?	15:47.50
Robin_Watts	I considered that.	15:48.17
chrisl_laptop	I thought this was a special case of integer pixel duplication largely specific to PCL	15:48.39
Robin_Watts	chrisl_laptop: That was the driving force, but it's not the general case of the problem.	15:49.03
chrisl_laptop	So, we're going to have two completely separate code paths to render images - great :-(	15:49.57
Robin_Watts	The general case of the problem is that we get image data in in various different forms, color convert etc into scanlines of color data.	15:49.57
	chrisl_laptop: No, not at all.	15:50.08
	Continuing my explanation... We then take those scanlines of color data and convert them down to rectangles to fill. And that's what passes across the device interface.	15:50.46
	What I want to do is to allow us to avoid that conversion to rectangles in as many cases as possible.	15:51.06
chrisl_laptop	I assumed that the caller in "Why not get the caller to do any scaling" was the interpreter	15:52.02
Robin_Watts	No. It's the Image rendering code.	15:52.18
	So the way stuff works currently is that we get a device call that tells us we have an image of type 'n', and we pick a routine to handle it.	15:52.25
chrisl_laptop	Right, in that case, I now understand.....	15:52.30
	I confess, I thought that was the purpose of copy_color and co	15:52.56
Robin_Watts	these routines are different depending on formats etc, and they all end up calling fill_rectangle. I'd rather they called my routine.	15:53.20
	copy_color can't cope with non 1:1 copies.	15:53.29
	which is why those routines have to call fill_rect.	15:53.46
	Those routines could do the bresenham expansion into line buffers (for portrait stuff at least) and then call copy_color.	15:54.21
	but that's 2 copies. (1 to do the scaling, 1 in copy_color).	15:54.38
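To make the "2 copies" point concrete, here is a small C sketch of the first copy only: a Bresenham-style (integer error term) stretch of one 8-bit scanline into a line buffer. The function name `expand_scanline` and the 8-bit, single-component layout are assumptions for illustration. In the scheme described above, the resulting buffer would then be handed to copy_color, which performs the second copy into the device.

```c
#include <stddef.h>

/* Nearest-neighbour stretch of one 8-bit scanline from src_w to dst_w
 * pixels, stepping the source index with an integer error term. */
void expand_scanline(const unsigned char *src, size_t src_w,
                     unsigned char *dst, size_t dst_w)
{
    size_t sx = 0;   /* current source pixel */
    size_t err = 0;  /* accumulated error, in units of 1/dst_w */
    size_t dx;

    if (src_w == 0)
        return;
    for (dx = 0; dx < dst_w; dx++) {
        dst[dx] = src[sx];
        err += src_w;
        while (err >= dst_w) {  /* advance the source when error overflows */
            err -= dst_w;
            sx++;
        }
    }
}
```

For example, stretching the three pixels {10, 20, 30} to five gives {10, 10, 20, 20, 30}.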
chrisl_laptop	Again, that's not what I took from the discussion yesterday, but what you're saying makes sense	15:55.42
Robin_Watts	chrisl_laptop: I've spent the day thinking about it, and this is what I've landed on. This is developed from yesterday's discussion. hence the twiki page.	15:56.20
chrisl_laptop	Scaling/Plotting directly onto the output raster makes perfect sense	15:56.50
Robin_Watts	Morning ray_laptop.	15:56.56
	ray_laptop: Following on from yesterday's discussion: https://twiki.ghostscript.com/do/view/Ghostscript/PixelPatchDeviceMethod	15:57.25
	any thoughts etc.	15:57.27
	chrisl_laptop: Thanks.	15:57.33
ray_laptop	Robin_Watts: morning. I looked over the logs of the discussion.	15:58.14
kens	Ah, stealth mode, reading the logs before login :-)	15:58.37
*kens*	thinks maybe I should do that	15:58.47
ray_laptop	1) I think we need to determine how often this could help (how much time the extra copy_color costs)	15:58.49
chrisl_laptop	Robin_Watts: The only thing to be wary of is the support for devices that don't use "conventional" sample representation - the tags devices spring to mind	15:58.58
kens	Does that mean they need to return a 'don't do this' error message ?	15:59.22
Robin_Watts	chrisl_laptop: A 24bit + tags device would appear in color_info as a 32bit device.	15:59.30
	and I believe we'd work just fine with that.	15:59.47
chrisl_laptop	Okay, a device that uses a different number of bits for the different plates	16:00.07
	Like 8 bit RGB	16:00.18
ray_laptop	2) I think if we have the begin_type_image call detect the cases (an existing device proc), then all we really need is to let the "smart, optimized" image code know where the raster buffer is (and its pertinent geometry)	16:00.24
Robin_Watts	but yes, there will be devices where it doesn't work, and where there is no quick "rejig the bits while we copy them" option, we'll fall back to rectangles.	16:00.24
chrisl_laptop	Yeh, that makes sense	16:00.47
Robin_Watts	ray_laptop: I timed J10.pcl last night, and at least 15% of runtime is spent in the "convert back to rectangles" code.	16:01.12
ray_laptop	not all devices benefit, so those that can could let the caller (from begin_image) know they can handle it, and provide the info (spec_op)	16:01.30
	Robin_Watts: convert back to rectangles ? from what?	16:01.52
	Robin_Watts: what is it converting from -- surely not a copy_color call	16:02.22
Robin_Watts	Ok, let me take another run at explaining (my view of) the current version, and what I propose to change.	16:02.35
	apologies if this is all obvious to people already.	16:02.43
kens	More explanation won't be a problem :-)	16:02.53
ray_laptop	Robin_Watts: first, I'd like to understand what you saw with J10	16:02.56
Robin_Watts	ray_laptop: That will fall out of the explanation.	16:03.14
ray_laptop	ok.	16:03.21
Robin_Watts	We get a device call to say "begin an image, of type n".	16:03.27
*ray_laptop*	shuts up and waits...	16:03.29
Robin_Watts	The device call runs around trying to find one of its internal render routines that will cope with that type of image.	16:03.51
	Then we get more device calls (well, actually enumerator calls) that give us the data.	16:04.18
	That data is then passed into the render routine selected earlier.	16:04.35
	That render routine decodes the data it is given, does color conversion etc.	16:04.54
	Typically at this point we have a scanline's colour data in an array ready to go.	16:05.16
	But that scanline data is not 1:1 with the output device, so we can't copy_color it.	16:05.38
ray_laptop	still at source resolution	16:05.41
Robin_Watts	(unless it's been interpolated, yet)	16:05.54
	So instead, we run through that data converting it to rectangles and calling fill_rect for each one.	16:06.15
	Before I go any further, anyone want to disagree with that characterisation of the way things work?	16:06.54
ray_laptop	no, fine so far.	16:07.15
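The "convert to rectangles" step just described can be sketched in C roughly as follows. This is a simplified stand-in, not the actual image rendering code: it assumes an integer scale factor instead of the DDAs mentioned later in the log, and `scanline_to_rects`, `fill_rect_fn` and `sum_area` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's fill_rectangle entry point. */
typedef void (*fill_rect_fn)(void *ctx, int x, int y, int w, int h,
                             uint32_t color);

/* Walk one scanline of already colour-converted pixels, merge runs of
 * equal colour, and issue one fill call per run. Returns the number of
 * fill calls made. */
size_t scanline_to_rects(const uint32_t *colors, int src_w, int scale,
                         int y, fill_rect_fn fill, void *ctx)
{
    size_t calls = 0;
    int start = 0;

    while (start < src_w) {
        int end = start + 1;
        while (end < src_w && colors[end] == colors[start])
            end++;  /* extend the run of like-coloured pixels */
        fill(ctx, start * scale, y * scale,
             (end - start) * scale, scale, colors[start]);
        calls++;
        start = end;
    }
    return calls;
}

/* Example callback: only totals the device area covered. */
void sum_area(void *ctx, int x, int y, int w, int h, uint32_t color)
{
    (void)x; (void)y; (void)color;
    *(long *)ctx += (long)w * h;
}
```

A flat row of four equal pixels costs one call, while four distinct pixels (the photo case) cost four calls to cover the same area.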
Robin_Watts	So my proposal is to add a new device call that takes blocks of ready color converted scanline data and scales/plots them onto the output device.	16:07.58
	The naive implementation of this will be exactly the "convert to rectangles" routines we have now.	16:08.15
	But it opens scope for devices to have optimised versions that can do it direct.	16:09.00
	And further optimisations that know about the special 1:2, 1:4 cases etc.	16:09.30
ray_laptop	once you have the color converted image data, if you know the device buffer address and geometry, you don't need a new call	16:09.44
Robin_Watts	ray_laptop: We need a device call to get the buffer address and geometry :)	16:10.08
ray_laptop	or a spec_op that mem devices (and any others that have buffers) implement	16:10.43
Robin_Watts	The key thing is that for someone to implement such an optimised plotting device, we really do not want them all to have to reimplement the image rendering.	16:11.07
	I'm proposing that we just give them the chance of reimplementing the smaller piece I describe in the twiki page.	16:12.19
ray_laptop	right. so if the spec_op call says "I can't do that" it falls through to existing method. And the checking in the begin_image can decide if it has an optimized plotter available	16:12.32
Robin_Watts	(and that reimplementation will typically be: "is this the special case I can handle? yes,do it, no, call default"	16:12.44
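A hedged C sketch of the "is this the special case I can handle? yes, do it; no, call default" pattern, using the 1:2 case as the special case. The device struct, both function names and the 8-bit single-plane layout are assumptions made for this example, and the default path here is a simple block fill standing in for the existing rectangle code.

```c
#include <stddef.h>
#include <string.h>

/* Toy 8-bit raster device (hypothetical, for illustration only). */
typedef struct {
    unsigned char *base;   /* start of the raster buffer */
    size_t raster;         /* bytes per row */
    int width, height;     /* in pixels */
    long default_calls;    /* counters so the paths taken can be observed */
    long fast_calls;
} toy_raster_dev;

/* Default: any integer scale, one small block per source pixel. */
int default_plot_row(toy_raster_dev *dev, const unsigned char *src,
                     int src_w, int y, int scale)
{
    int sx, dy;

    if (scale < 1 || y < 0 || src_w < 0 ||
        src_w * scale > dev->width || (y + 1) * scale > dev->height)
        return -1;
    dev->default_calls++;
    for (sx = 0; sx < src_w; sx++)
        for (dy = 0; dy < scale; dy++)
            memset(dev->base + (size_t)(y * scale + dy) * dev->raster +
                   (size_t)sx * scale, src[sx], (size_t)scale);
    return 0;
}

/* Optimised override: handles exact 1:2 doubling, else defers. */
int doubling_plot_row(toy_raster_dev *dev, const unsigned char *src,
                      int src_w, int y, int scale)
{
    unsigned char *row0, *row1;
    int sx;

    if (scale != 2 || y < 0 || src_w < 0 ||
        src_w * 2 > dev->width || (y + 1) * 2 > dev->height)
        return default_plot_row(dev, src, src_w, y, scale);
    dev->fast_calls++;
    row0 = dev->base + (size_t)(y * 2) * dev->raster;
    row1 = row0 + dev->raster;
    for (sx = 0; sx < src_w; sx++)
        row0[2 * sx] = row0[2 * sx + 1] = src[sx];  /* double in x */
    memcpy(row1, row0, (size_t)src_w * 2);          /* double in y */
    return 0;
}
```

A device author would only need to write the second function; every case it does not recognise still goes through the shared default.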
ray_laptop	I don't think we need a new device method if spec_op can handle it	16:13.14
Robin_Watts	Again, you're suggesting that we put the smarts into begin_image, and I don't want to do that.	16:13.15
ray_laptop	well, some of us (kens and I, at least) don't want a new device proc	16:13.46
Robin_Watts	Take the 8bit RGB case that someone (kens, chrisl?) talked about earlier.	16:14.34
ray_laptop	but I still don't understand what the 15% in J10 comes from, and how much the new device call can save	16:15.13
Robin_Watts	OK, hold on....	16:15.18
	You're just saying "instead of adding this device method, add a dso to do the same".	16:15.51
ray_laptop	Robin_Watts: yes, since the plumbing to forward spec_op calls is already in place	16:16.17
kens	Plumber has arrived, brb	16:16.26
ray_laptop	speaking of plumbing ;-)	16:16.36
Robin_Watts	I don't like that aesthetically, but yes, it's possible.	16:16.41
	So, the 15%....	16:16.51
	In J10, 15% of the time in the profile is spent in or below the image_render_color_icc_portrait function.	16:17.25
	The data that comes in is in RGB anyway, and it's got to output RGB into the device.	16:17.50
ray_laptop	Robin_Watts: so how much of THAT time is in fill_rectangle (i.e. time spent writing into the buffer)?	16:19.14
kens	Personally I think I'd vote for a new device method rather than another spec_op (yeah I know...). But I wouldn't feel wildly offended either way.	16:19.18
Robin_Watts	So a lot of the time there is going to be "detect rectangle from array data, call fill_rect"	16:19.21
HenryStiles	every single color transition in a source image scanline results in a function call to draw a rectangle, I can't imagine how this wouldn't be a significant speedup for a lot of files, at least it warrants spending the time to do the experiment.	16:20.24
ray_laptop	do you know what the scaling is ? note that fill_rectangle does multiple output lines if it is scaling up in y	16:20.39
chrisl_laptop	I'm also rather sensitive about new device methods, but I do feel there's a good case in this instance	16:20.40
*chrisl_laptop*	's car is ready, so calling it quits for the day......	16:21.42
ray_laptop	I'm just thinking that having a method that gives us the device buffer allows for other functions that want to "direct write" (shaded fills, trapezoids, anything we might want to optimize in the future)	16:22.54
Robin_Watts	ray_laptop: OK, looking at the profile from yesterday.	16:23.15
	14.57% is in image_render_color_icc_portrait.	16:23.29
	2.29% is in the function body.	16:23.40
	0.86% is in gx_default_rgb_map_rgb_color	16:23.53
	0.71% is in gx_forward_encode_color	16:24.02
	10.71% is in gx_dc_pure_fill_rectangle.	16:24.13
ray_laptop	yeah, the color mapping stuff is hideous	16:24.22
Robin_Watts	So the 10.71% + some of the 2.29% is what we are playing for.	16:24.34
kens	Good grief, it's much later than I thought. I'm heading off, goodnight all.	16:24.52
Robin_Watts	Night kens.	16:24.57
ray_laptop	a lot of overhead to take a chunky RGB and return a chunky RGB	16:24.59
Robin_Watts	A better file to test would probably be a simple full page portrait RGB image.	16:25.34
ray_laptop	the 2.29% is what is using the dda's to compute rectangle size and collect the rectangle (width of like colored pixels)	16:26.09
Robin_Watts	yes.	16:26.25
ray_laptop	I'm not sure how much of the 10.7% you will be able to recover -- fill_rectangle is pretty efficient	16:27.11
	and if it isn't, speeding up fill_rectangle for common devices (e.g. image24) has a BIG payoff	16:27.45
Robin_Watts	ray_laptop: Currently, for every pixel in a photo image (cos pixels are rarely the same), we do a device call.	16:28.34
ray_laptop	how many calls to fill_rect from the image_render path ?	16:28.42
Robin_Watts	Then we "fit" that rectangle to the device.	16:28.57
	Then we figure out the offset, then we loop to fill them.	16:29.10
HenryStiles	Robin_Watts: I have a 300 dpi version of your lion picture on casper, 300lion.pcl, used it for the doubler	16:29.14
Robin_Watts	HenryStiles: Ta. (and this is not #artifex).	16:29.40
	ray_laptop: I do not have that information in the profile I had.	16:31.23
ray_laptop	Robin_Watts: I'm not arguing that having the image code be able to write the buffer directly wouldn't be better (getting rid of the fill_rect call overhead) since the buffer pointers can be maintained	16:31.42
Robin_Watts	You are proposing that we get a buffer pointer, and a raster, right?	16:32.09
ray_laptop	Robin_Watts: oh, I use gprof typically and it provides time AND call counts	16:32.14
	and a buffer height, since we might be in a band	16:32.44
Robin_Watts	gprof works by instrumenting the calls, which can affect the timings.	16:32.45
	ray_laptop: Right.	16:32.53
	but then that really relies on the color_index being the same format as the actual device.	16:33.11
	where this proposed solution does not.	16:33.39
ray_laptop	so you will still have the map color + encode color call	16:34.05
Robin_Watts	ray_laptop: Yes.	16:34.20
	(I don't think we can realistically avoid that)	16:34.36
ray_laptop	right, as long as it's AFTER the detection of like colored runs (don't want to call map/encode more often than needed)	16:35.17
Robin_Watts	(cos any ICC/decode array/colorkey stuff needs to happen in the image code).	16:35.19
	ray_laptop: hmm, yes.	16:35.53
ray_laptop	or at least skip the call, using the previous encoded color if the source color is the same (a'la ICC single element cache)	16:37.13
Robin_Watts	Gah. I hate the VS profiler at times.	16:39.29
	Some days it just decides that it won't see any symbols in your program.	16:39.44
*ray_laptop*	hates it most of the time	16:39.48
Robin_Watts	THIS IS THE SAME BINARY YOU RAN YESTERDAY!	16:39.54
	ok, so with 300lion.pcl 53.45% is in image_render_color_icc_portrait.	17:11.14
	28% is in gx_pure_fill_rectangle	17:11.24
	(13% in mem_true24_fill_rectangle)	17:12.01
	and 0.02% in memset.	17:12.06
	so the potential benefits there are reasonable.	17:12.45
HenryStiles	yeah that's consistent with what I saw profiling here.	17:14.21
Robin_Watts	ray_laptop: So, for 300lion.pcl, running the page 20 times I get 84Million calls to gx_dc_pure_fill_rectangle	17:34.21
ray_laptop	Robin_Watts: I thought it sounded too high until I noticed it was 20 times :-)	18:18.40
	Robin_Watts: are you running a profile build? I've noticed that (at least on linux pg builds) it uses gs_memset instead of system memset	18:25.01
Robin_Watts	I am running a profile build, yes.	18:25.17
ray_laptop	for some lame-o reason	18:25.23
	I have to remember to disable that hack when doing profiles	18:25.42
Robin_Watts	ray_laptop: probably because some profiler/compiler pairing was missing out detecting memset or something.	18:25.55
ray_laptop	Robin_Watts: it's in base/memory_.h -- #ifdef PROFILE	18:26.53
	I mention it because it really messes up my profiles	18:28.09
	but I don't know if the Windows profile build #defines PROFILE	18:29.27
	bbiaw. running an errand	18:29.56
mvrhel_laptop	bbiaw	19:26.35

Log of #ghostscript at irc.freenode.net.