Ghostscript IRC logs

	<<<Back 1 day (to 2016/10/12)	20161013
sebras	Robin_Watts: so w is the u plane and you << a single step each time through the loop and compares this to the largest w for any component.	10:40.04
	Robin_Watts: and i is how many times you shifted.	10:40.34
	Robin_Watts: what is size?	10:40.35
Robin_Watts	sebras: yes.	10:40.36
	size allows for the fact that when you shift up you don't get exactly what you expect.	10:41.02
	i.e. if I have a U that's 129 wide, you'd naively expect the Y to be 129<<1 wide.	10:41.41
	but it might actually be (129<<1)-1 wide.	10:41.59
	or it might be (129<<2) wide, or (129<<2)-{1,2,3} wide.	10:42.32
sebras	ok, so the computation is never further from the correct max_w than the value in size.	10:42.38
Robin_Watts	Though I think my test should be w <= max_w - size	10:42.39
	or something. hmm.	10:42.39
sebras	Robin_Watts: let me put it like this. we have already computed i, and this is returned.	10:42.40
	Robin_Watts: why is it important that we throw if w < max_w - size?	10:42.41
Robin_Watts	sebras: Suppose the largest component is 256 wide.	10:42.42
sebras	Robin_Watts: are we risking stepping outside the destination samples buffer?	10:42.42
	ok, 256 samples wide.	10:42.42
Robin_Watts	we can accept smaller components of 128, 64, 32 size.	10:42.42
	The intention is that I whinge if we get one of (say) 96 wide.	10:42.42
sebras	Robin_Watts: right.	10:42.42
Robin_Watts	ok, that second test is just broken.	10:42.42
	if (w >= max_w + size)	10:42.42
	that might work better.	10:42.43
sebras	Robin_Watts: can we do while (w < max_w) and do the check and fz_throw() after the loop?	10:42.49
	right now there is no need for the trailing return i;, we can never end up there.	10:42.50
Robin_Watts	yes, nicer.	10:42.50
sebras	Robin_Watts: so if I understand this correctly: if the widest component is 256 samples, and one of the subsampled components is 64 samples wide w == 256, size == 4 and i == 2	10:42.52
	but we would also accept: a sub sampled component of 65, where w == 260, size = 4 and i == 2..?	10:42.52
Robin_Watts	Updated version with new code that's hopefully easier to follow.	10:44.21
sebras	Robin_Watts: I'm still not confident I truely understand the if-statement.	10:56.05
	so basically you determine the number of shifts necessary to get w up to at least max_w size (possibly bigger)	10:56.26
Robin_Watts	And then make sure it's not too much bigger.	10:56.52
sebras	then you say, ok. take the width of the component that is subsampled, subtract one and shift it equally far. this may not be at least a big or bigger than the widest component.	10:57.09
	why -1?	10:57.45
	note that the code might be perfectly fine, this might just be me needting to fold my brain paths correctly to understand it.	10:58.25
Robin_Watts	Ah, wait, I have a new formulation.	10:58.52
	Try now.	11:00.13
sebras	Robin_Watts: the reason for me asking for this is because I want to implement subsampling support for tiff, and I had trouble doing so.	11:00.50
	Robin_Watts: so I'm going to steal your code later.	11:01.00
Robin_Watts	sebras: I may already have that implemented for SOT.	11:01.09
	I moved the tiff stuff over to SOT.	11:01.21
	and in so doing, I had to rewrite various bits of it to avoid the need for decode_tile etc.	11:01.36
	I moved a lot of stuff to working on streams.	11:01.55
	I plan to port that back at some stage.	11:02.09
sebras	ok, so now you are saying: take the widest component, pretend to subsample that, if the that width matche that of the relevant component then we konw how many steps to subsample. but if we cannot find a subsample depth that corresponds with the widest component, error out.	11:06.09
	now I get it. :)	11:06.16
Robin_Watts	cool.	11:06.20
sebras	that probably means that the code now is inefficient. ;)	11:06.28
	given the last few days of discussions. ;)	11:06.45
	Robin_Watts: why is jpx->comps[k].data indexed by yy? this is the location where we read the data from.	11:13.00
	Robin_Watts: isn't the component data packed more tightly for the subsampled components?	11:13.18
Robin_Watts	The old code indexes using v = jpx->comps[k].data[y * w + x];	11:16.14
	The new code therefore needs to use v = jpx->comps[k].data[(y>>sw) * w + (x>>sh)];	11:16.39
	I avoid redoing lots of that calculation, by setting yy = (s<<sw) and data = &jpx->comps[k].data[yy];	11:17.24
	then I can just do: v = data[s>>sh]; for each pixel.	11:17.39
sebras	so the subsampled component values are not packed tightly inside data?	11:17.52
Robin_Watts	sebras: I don't understand.	11:18.18
sebras	Robin_Watts: I.e. it looks something like: sample nothing nothing nothing sample nothing nothing nothing sample if that component is subsampled 2 times..?	11:18.35
Robin_Watts	Noooo.	11:18.42
sebras	Robin_Watts: right, if it did then I'd be surprised.	11:18.56
Robin_Watts	x and y run to the unsampled width.	11:19.12
	(and height0	11:19.17
	so (x>>sw) runs from 0 to the subsampled width.	11:19.32
sebras	Robin_Watts: but I can't see how the contents of the subsampled component can be: sample0 sample1 sample2 and we index in the array using the sample indicies as for the widest component.	11:19.41
Robin_Watts	data[x] would be indexing as if we used the widest component.	11:20.02
	data[x>>sw] is indexing as if we use the subsampled component.	11:20.23
	Note that the old code nested the loops y, x, k.	11:20.41
	The new code nests them y, k, x.	11:20.50
sebras	yes, I know.	11:21.06
	oh! I'm so silly. >> divides, << multiplies. :)	11:21.37
	now it all makes sense.	11:21.42
Robin_Watts	:)	11:21.45
sebras	I has looking at the l2subfactor() and was blindly thinking that >> was shifting left...	11:22.14
	Robin_Watts: ok, so now it truely LGTM. :)	11:27.49
Robin_Watts	Thanks.	11:29.38
	Does the openjpeg code in gs cope with subsampled things?	12:25.27
kens	No clue	12:29.10
	Do you have an exmple ?	12:29.19
Robin_Watts	I do, I'll run a test in a mo, but I figured sebras might know.	12:33.54
	evidently it does.	12:35.02
sebras	Robin_Watts: but not in the luratech case.	12:37.05
Robin_Watts	The results in gs are awful for tests_private/pdf/sumatra/jpx_image_components_have_different_dimensions.pdf	12:38.43
	I wonder if it's subsampling all the planes to match rather than upsampling the subsampled ones?	12:38.59
sebras	Robin_Watts: using luratech or openjpeg?	12:39.25
Robin_Watts	openjpeg	12:39.36
	So the jpx image is 480x640.	12:42.33
	If I use -g480x640 -dFitToPage I should see it 1:1, right? I must be getting that wrong, cos it's cropped.	12:43.00
	-dPDFFitPage, d'oh.	12:44.12
	ok, that looks horribly subsampled.	12:44.19
sebras	Robin_Watts: can you help me an look at the log from my latest cluster push.	14:05.40
	Robin_Watts: seems like quite a number of .jp2-files are different from trunk.	14:05.59
Robin_Watts	ok...	14:06.01
	Have you pulled in my changes?	14:06.14
sebras	Robin_Watts: not if their are not on master.	14:06.26
Robin_Watts	They are on golden master	14:06.32
sebras	Robin_Watts: ok. then they are in my run.	14:06.44
Robin_Watts	No, they aren't :)	14:07.15
sebras	Robin_Watts: so then why could my clusterpush (consisting of pnm-changes and a leak-fix) claim differences in the errors from j2k-files?	14:07.17
Robin_Watts	If you look in the diffs shown at the bottom of the log, all my subsampling and ycc->rgb changes are showing as missing in your cluster code.	14:07.48
sebras	Robin_Watts: oh.	14:12.46
	Robin_Watts: tor8: up for grabs: a number of pnm commits and a leak fix on sebras/master	14:14.06
Robin_Watts	sebras: *dp++ = v ? 0 : 0xff; ?	14:15.49
	if (cond) dp++ = foo; else dp++ = bar; might be better expressed as: *dp++ = cond ? foo : bar;	14:16.58
	dunno if all compilers are smart enough to optimise the two the same.	14:17.18
	The ==255 case is by far the most common, so I don't necessarily see the aesthetic argument for not having it as the first one in the if chain.	14:18.45
	if (pnm->maxval == 255) memcpy(dp, p, pnm->width * pnm->height * img->n); ?	14:20.23
	(I know that some of that code predates you)	14:22.02
	if (pnm->maxval < 0 \|\| pnm->maxval >= 65536)	14:22.59
	I submit that pnm->maxval == 0 presents problems too :)	14:23.12
	If we get multiple TUPLTTYPE lines, should we not at least warn?	14:25.42
	Why are we strdupping tupltype rather than using an enum ?	14:26.34
*sebras*	is still working on the memcpy...	14:28.02
Robin_Watts	You've seen cmyk pnm files, with alpha?	14:28.56
sebras	Robin_Watts: I didn't see any cmyk files at all.	14:29.27
	also creating files using alpha is quite difficult.	14:29.41
	but doable.	14:29.43
Robin_Watts	Then I don't understand 32eeb29c2091	14:29.49
sebras	Robin_Watts: I haven't been looking for them but they are in the spec.	14:30.13
	so I created one from scratch.	14:30.31
Robin_Watts	This code all suffers from a pet peeve of mine in that we use foo->bar as an upper loop bound, and then do pointer stuff in the loop.	14:32.15
	int n = img->n; for (k =0; k < n; k++) .... is better IMHO than for (k=0; k < img->n; k++) ...	14:33.30
	cos the moment you hit any memory in the .... bit, the compiler has to assume that img->n may have changed, and so loads it again.	14:34.02
	and in the last commit I disapprove of the changes to fz_drop_glyph_cache_context	14:37.13
sebras	TUPLTYPE is specified weirdly in PAM. the spec says you should concatenate all TUPLTYPE lines, but doesn't really say what happens with the samples if you have TUPLTYPE RGB\nTUPLTYPE GRAYSCALE\n. in what order do the samples come? R G B gr? maybe only gr (i.e. the 1st line is ignored)? maybe only R G B (i.e. the 2nd line is ignored)	14:38.55
Robin_Watts	sebras: Wow, broken specs R us.	14:39.38
sebras	also netpbm itself checks for TUPLTYPE == "RGB" e.g. so...	14:39.59
Robin_Watts	I can only imagine:	14:40.18
sebras	maybe we should just error out if we see those..?	14:40.22
Robin_Watts	TUPLETYPE R	14:40.44
	TUPLETYPE G	14:40.45
	TUPLETYPE B	14:40.47
	to get 'RGB' maybe?	14:40.56
	but in the absence of examples I'd error out, I think.	14:41.06
sebras	Robin_Watts: if I also make tupltype an enum the risk of leaks vanish.	14:42.46
Robin_Watts	sebras: yeah.	14:43.07
	and it's not like we can actually cope with any types other than the fixed (enummed) set we know about.	14:44.00
	i.e. we can't hold onto the TUPLTYPE in order to parrot it out again later.	14:44.27
sebras	Robin_Watts: sebras/master updated according to review.	16:33.18
	Robin_Watts: including fz_drop_glyph_cache() which may now be to your liking..?	16:33.29
	you never really said what was wrong, but I took a guess.	16:33.41
Robin_Watts	unnecessary locking.	16:33.51
sebras	Robin_Watts: I also converted the header token parsing to use the enum stuff.	16:36.07
	I anticipated you would comment on that next.	16:36.13
	and I think all the -> in for loops are gond.	16:36.23
Robin_Watts	sebras: Will look in a mo, just in the middle of talking to mvrhel.	16:36.34
sebras	np.	16:36.48
Robin_Watts	right.	16:53.42
	sebras: First commit looks lovely now.	16:55.00
	We could improve it further by having loops that count down rather than up, but let's not hold it up for that.	16:55.30
	second commit is good too.	16:56.47
	The use of memcmp in the third worries me a bit.	16:57.41
	Why memcmp, not strcmp ?	16:57.46
	Suppose s points to "A" (i.e. a 1 char string).	16:58.03
	memcmp("A", "WIDTH", 5) will overrun the "A" buffer	16:58.39
	strncmp would be better ?	16:58.49
	strncmp(s, tokens[i].str, len) == 0 && isspace(s[len]) ?	16:59.36
sebras	because the input buffer is not necessarily nul-terminated.	17:00.14
Robin_Watts	ok, so strncmp, not strcmp.	17:00.39
sebras	oh.	17:00.47
	ok.	17:00.48
Robin_Watts	strncmp stops on a 0, or after the maximum length, I think.	17:00.59
sebras	yes.	17:01.11
Robin_Watts	and we should check that the char after the matching token is whitespace.	17:01.26
sebras	makes sense.	17:02.00
Robin_Watts	(but otherwise, a nice looking commit)	17:02.23
	same comment for the next commit.	17:02.30
sebras	hm.. wait. we already know that the char after the token is white, eol or eof.	17:03.24
Robin_Watts	We do?	17:03.36
sebras	because I while on iswhiteeol() to find the end of the token.	17:03.36
Robin_Watts	sebras: Ah. If you use strncmp you don't need that.	17:04.10
	You skip any whitespace you find, then strncmp looking for matches. If you match, you know the length from the match.	17:04.51
	You miss the nice optimisation about checking len == len though that way...	17:05.20
	yeah, either way is fine.	17:05.40
	I still don't quite understand the premultiply alpha one.	17:05.56
	but all the others look good.	17:06.30
sebras	hm.. isn't it a problem to use strncmp() when the strings I'm comparing the input to are prefixes of each other?	17:06.39
	i.e. BLACKANDWHITE is a prefix of BLACKANDWHITE_ALPHA	17:07.04
Robin_Watts	not if you continue to check your lengths first.	17:07.05
	so your counting of the len and then checking len == len solves that.	17:07.27
	so your way wins :)	17:07.37
sebras	it's rare, but it happens!	17:08.00
Robin_Watts	The last commit looks like it changes what happens in fz_drop_font_context.	17:09.02
sebras	Robin_Watts: why are we dropping the ctx->font->symbol before the &ctx->font->ctx_refs == 0 ?	17:09.57
Robin_Watts	I think your new formulation may be correct though :)	17:10.13
sebras	ok.	17:10.18
	* Only drop font context objects when dropping the last reference.	17:10.27
	this is in the commit message.	17:10.30
Robin_Watts	I wish I understood why it was written the other way.	17:10.41
sebras	mistake?	17:10.58
Robin_Watts	Ah, it was a commit of tors, and you know how you can't trust his code :)	17:11.51
	Yeah, I suspect it was a slip.	17:12.02
sebras	Robin_Watts: commit 52ed5832db9ea4bcc9dcfd2f0391af89f5aadd95 introduces the dropping of things before the call to fz_drop_imp()	17:12.02
Robin_Watts	So lgtm.	17:12.14
sebras	/* CMYK is a subtractive colorspace, we want additive for premul alpha */	17:12.57
	this is what load-tiff says before doing premultiply.	17:13.07
	this is the reason that many image loaders convert cmyk to rgb before calling premultiply.	17:13.51
Robin_Watts	Ah!	17:14.02
	Copy that same comment into your commit, and I'm happy.	17:14.36
sebras	Robin_Watts: new version up (and being clustered)	17:15.43
Robin_Watts	All lgtm now.	17:17.30
sebras	Robin_Watts: one more fetch please.	17:20.03
	Robin_Watts: didn't notice gcc complaint about types.	17:21.29
	we have too many warnings when recompling... :-(	17:21.38
Robin_Watts	sebras: We don't have that many do we? Mainly in thirdparty stuff?	17:22.35
	MSVC is warning free in the non third-party stuff (except for 2 in memento)	17:22.53
	what changed? :)	17:23.18
sebras	Robin_Watts: (char *) when passing buffer to strncmp()	17:23.32
Robin_Watts	oic. OK.	17:23.54
	lgtm.	17:23.59
sebras	I get two warnings in load-jpx (presumably because of the new code drop), one in gl-input.c and 107 warnings in in pdf-cmap-table.c and one in curl (thirdparty, but not compiled with the thirdparty stuff for reasons unknown).	17:26.25
	Robin_Watts: I think 726.j2k and p0_13.j2k are indeterminisms with openjpeg.	17:27.14
Robin_Watts	curl is outside our control.	17:27.19
	sebras: ooh.	17:27.27
sebras	Robin_Watts: what did you find?	17:27.57
Robin_Watts	was commenting on indeterminisms.	17:28.28
sebras	ah.	17:28.52
	Robin_Watts: ok, then I push the pnm-stuff and the context dropping.	17:32.15
Robin_Watts	cool.	17:34.18
sebras	Robin_Watts: tor8: why do we not have OPENSSL in Makethird?	18:01.51
	are we conciously deciding not to make it easy to compile mupdf using OPENSSL?	18:02.23
Robin_Watts	sebras: the openssl stuff is too large and hairy to have as a thirdparty dependency.	18:05.56
	Having the build rules in the Makefile in case someone DOES import it would be good though.	18:06.19
	paul did the fiddling with that though.	18:07.10
sebras	Robin_Watts: yeah, and I was looking at a bug report concerning pauls code.	18:07.27
	and discovered that I have never built mupdf using openssl despite having it installed on my system.	18:07.45
	maybe we want to avoid using the system's openssl too..?	18:07.54
	Robin_Watts: http://git.ghostscript.com/?p=user/sebras/mupdf.git;a=commitdiff;h=65059bf99be52e80f282b75cd43c0108d8f91616 this makes me compile using my system's openssl all the time.	18:08.59
	Robin_Watts: but maybe we don't want to make that the default?	18:09.25
Robin_Watts	I'm going to defer to tor8 on this.	18:14.06
tor8	sebras: Robin_Watts: because openssl has a hairy build system and cannot be built by simply compiling the source files	22:59.43
	openssl does not exist on macosx	23:00.29
	Forward 1 day (to 2016/10/14)>>>

Log of #ghostscript at irc.freenode.net.