MuPDF IRC logs

	<<<Back 1 day (to 2018/07/05)	20180706
tor8	Robin_Watts: if you've got time today, tor/master is ready for review	11:16.05
Robin_Watts	tor8: Sure. Let me look now.	11:16.17
	I might have picked: fz_rect_contains_point etc, but...	11:17.33
tor8	Robin_Watts: I pick the odder wording because then the words line up with the arguments :)	11:19.33
	but I gues sthe same could be made true of rect_contains_point too	11:19.53
Robin_Watts	I'd have reversed the arguments :)	11:19.56
tor8	oh. I added the functinos but I think I forgot to actually use them :)	11:21.17
Robin_Watts	tor8: ha!	11:21.25
tor8	code archaeology: this point is what triggered me, when I decided to revisit the pass rects/matrices by value or pointer	11:22.05
Robin_Watts	It's not important at all. I'm not even sure which one is really better (other than thinking _is_point_in_ is slightly awkward)	11:22.09
tor8	and that was a long enough distraction to forget why I started doing that commit :)	11:22.21
	I think I prefer rect_contains_point, that's similar to other functions we have	11:22.35
	if rect contains point, vs if is point inside rect with 'is' verb added to signify it returns a boolean	11:23.12
	but the 'is' verb is a nice signifier for returning and testing a boolean condition	11:23.35
	if fz_point_is_inside_rect maybe?	11:23.54
	no, then I prefer is_point_inside oddly enough	11:24.14
	it reads more question like	11:24.33
Robin_Watts	In the "Add fz_snap_selection function" one... it seems to be relying on their being \n or " " in the chars.	11:26.34
	(maybe not "\n", maybe that's just the init condition?)	11:26.46
tor8	Robin_Watts: the init condition when starting the for loop over a line's characters	11:27.19
Robin_Watts	Are we sure we actually get " " in the text? I guess we add that in to text according to spacing?	11:27.25
tor8	yes, we add spaces according to spacing	11:27.40
Robin_Watts	OK.	11:29.35
	In "Cleanup incomprehensible code", you have an in if (scissor) { ...} else { ...} in which both sides of the alternation end in the same command. That could be pulled out.	11:30.15
tor8	Robin_Watts: that gets cleaned up further later on too.	11:32.14
Robin_Watts	ok.	11:32.20
tor8	in the pass by value commit	11:32.45
Robin_Watts	The first and last chunks of http://git.ghostscript.com/?p=user/tor/mupdf.git;a=commitdiff;h=50c9fd363dfd2f0e169541a91b2074444adafa28 are very untorlike :)	11:33.08
tor8	Robin_Watts: that they are! :)	11:34.50
	it's just they look so unbalanced if the other blocks are a whole screenful	11:35.13
Robin_Watts	tor8: http://git.ghostscript.com/?p=user/tor/mupdf.git;a=commitdiff;h=4cda8ae62880a807a3b19d68b5407eafeea6e338	11:38.03
	It would be nice if you said what the return value was.	11:38.15
	http://git.ghostscript.com/?p=user/tor/mupdf.git;a=commitdiff;h=e792804dcfa3c8d868d58a28c421feb17dfc0b7e	11:40.40
	annot->has_new_ap = 0;	11:40.46
	It feels bad to me to have to have stuff outside the interface poke around inside the structure.	11:41.11
tor8	Robin_Watts: Yeah, I'm not entirely pleased with these.	11:41.27
Robin_Watts	Can we hide the pdf_annot structure and have accessors?	11:41.29
tor8	Robin_Watts: one thought I have is to remove the pdf_update_page altogether, and have pdf_update_annot return true if it needs rerendering (and immediately clear the flag, so it only returns true once)	11:42.06
Robin_Watts	I think the only reason we have fz_annot exposed is for pdf_annot	11:42.11
	tor8: That would be an ecumenical question. (And one for discussion with paulgardiner)	11:42.37
tor8	and then the users won't need to look at or touch has_new_ap, but will need to loop over annots after doing stuff with events or edits looking for changes	11:43.13
Robin_Watts	I fear that would lead to apps needing to keep their own list of annots that need updating, because they might not be able to update them immediately, but... I lack the overall vision of this to judge well.	11:43.40
tor8	this is why I've left it a bit open here, with the flag needing poking manually	11:44.00
	you'd do the loop when and where you're able to update immediately, I think	11:44.45
Robin_Watts	tor8: Right, but can't we have a pdf_annot_updated() function that sets the flag to 0 ?	11:45.23
tor8	or when and where you'll be stowing away and remembering stuff anyway	11:45.28
Robin_Watts	That would avoid us having to have access to the internals to poke it away.	11:45.42
paulgardiner	tor8: do you mean, rather than the app call pdf_update_page and then enumerating the annots looking at and resetting the flag; instead, just enumerate the annots calling pdf_update_annot looking at it's return result?	11:46.35
tor8	I think the big question is whether to return "yes we've changed, but I'll only tell you this once, so deal with it now" or do those as separate steps	11:46.40
	paulgardiner: yes, that's pretty much what I'm thinking.	11:47.02
paulgardiner	"pretty much" rather than "exactly" might matter.	11:47.25
tor8	paulgardiner: I'm being unintentionally vague. exactly is correct.	11:48.00
	the viewer loops over the annots calling pdf_update_annot. the function returns true if it has changed since the last call to pdf_update_annot on the same annotation.	11:48.34
	would that work for you?	11:49.12
paulgardiner	That approach, at first sight, sounds wholly better. The one worry I have is that calling pdf_update_annot on one annotation might cause a change in another. Maybe that isn't possible. If it were, then the current approach that has two enumerations might handle it whereas the new approach not.	11:49.50
tor8	paulgardiner: the javascript field changing other field cascade happens when we process the events	11:50.15
paulgardiner	tor8: okay, that was the worry. Can't say I'm certain, but it does seem like an improvement.	11:50.50
tor8	pdf_update_annot only recreates the appearance streams or toggles which ones are active in the case of up/down state changes	11:51.00
	then I'll make those changes, but I'll do them after this batch of changes or I'll end up in rebase hell :)	11:51.24
Robin_Watts	http://git.ghostscript.com/?p=user/tor/mupdf.git;a=commitdiff;h=80ab43b448c0efb2925feb6dfb90b70b62c55b91	11:52.30
	it's -> its	11:52.36
	not in state -> not in a stae	11:52.55
	not in state -> not in a state, even	11:53.00
tor8	also s/it's/its/ :)	11:53.20
Robin_Watts	?	11:54.52
tor8	I found more typos in the same commit message	11:55.04
Robin_Watts	I said the it's -> its thing already. I thought you were saying there was more than one instance :)	11:55.34
tor8	no, I'm just blind and didn't see it scroll past :)	11:55.49
Robin_Watts	Anyway, that's all I got for that set.	11:55.51
	On the whole lgtm.	11:56.00
*Robin_Watts*	lunches, bbs.	11:56.10
tor8	Robin_Watts: nothing objectionable in the pass by value commits? I'm ... surprised :)	11:56.21
	Robin_Watts: thanks!	11:56.34
Robin_Watts	tor8: I largely skimmed them.	11:56.35
tor8	they were very much mechanical, and I've clustered and fixed the bugs the cluster found, so I'm pretty confident in them	11:57.00
Robin_Watts	If it were done when âtis done, then âtwere well Iit were done quickly.	11:58.08
	tor8: Various commits on robin/master. Ignore the top 5.	14:29.12
	tor8: looks like some stray consts in your changes, and various things that are breaking the windows builds.	14:37.07
	Does muraster build for you?	14:38.09
	or mu-office-test ?	14:40.36
	and the signing stuff means anything without the appropriate libs (i.e. windows) fails too.	14:41.15
tor8	Robin_Watts: oops. looks like I broke muraster.	14:49.36
	I'm not sure we build mu-office-test with the unix makefiles	14:52.29
Robin_Watts	I have fixes incoming.	14:52.34
	OK, hopefully the stuff on robin/master should work now.	14:56.44
	(All but the last 5)	14:56.54
tor8	Robin_Watts: is the -"../MyTests/pdf_reference17.pdf",	14:59.25
	+"../MyTests/pdf.pdf",	14:59.25
	change intentional?	14:59.25
	Robin_Watts: the first three are LGTM	14:59.57
Robin_Watts	tor8: bugger. Will fix.	15:00.47
	still failing to build :(	15:00.55
tor8	mujstest also needs a fix (I have a commit for that on tor/master)	15:02.17
Robin_Watts	The mujstest fix contains a muraster one.	15:05.47
	Ok, tor8: what is on robin/master builds OK for me now.	15:09.11
	bugger. need to fix a commit message.	15:09.25
	OK, that should be good.	15:10.22
	Which of those were you happy with again?	15:10.29
tor8	still mulling over the endstream one, the others are good to go	15:13.05
Robin_Watts	The endstream one does solve the original bug (bug699308)	15:15.55
	My belief is that for stuff like flate and lzw streams (and others that have an end stream marker) the extra code will never be called.	15:16.32
	And in other cases with a valid file, the extra code will read 32 bytes from the file, spot that those are {\r}{\n}endstream and exit immediately.	15:17.47
	So I'm hoping that the overhead for valid files should be minimal.	15:17.56
tor8	I wonder if there can be a simpler implementation	15:20.37
	as I see it, if you've reached the end of the supposed stream, we always end up goto the maybe_ended where it scans for the endstream keyword	15:21.17
Robin_Watts	Is that not what happens?	15:23.26
tor8	my brain is melting trying to understand how it works...	15:24.02
Robin_Watts	if (state->remain == 0) goto maybe_ended;	15:24.28
	That's the 'if we've got to the end of the supposed stream"	15:24.38
tor8	stm->pos is not being updated in the maybe_ended code	15:24.53
Robin_Watts	That is true, will fix.	15:25.34
	if (state->term ==0) return EOF;	15:25.56
	That's "unless we've been told that this stream should be terminated with endstream etc, do the same as we always have"	15:26.18
tor8	'term' could use a more descriptive name.	15:28.01
Robin_Watts	tor8: terminated? is_pdf_stream ?	15:28.41
tor8	look_for_endstream?	15:28.51
Robin_Watts	look_for_endstream_or_at_a_push_endobj ?	15:29.05
	I could live with look_for_endstream	15:29.30
tor8	'term' reads like 'terminated'	15:29.32
	it threw me off a bit at first	15:29.40
Robin_Watts	terminator ?	15:29.52
tor8	trust_the_length	15:30.03
Robin_Watts	distrust_the_length :)	15:30.38
tor8	I also don't quite understand how it copes with the case when the 'endstream' token splits across read boundaries	15:31.00
Robin_Watts	the bottom loop will exit with n being the number of bytes up to the point at which the (possible) endtoken started.	15:31.54
tor8	why do you double the state->size each time we read data?	15:32.01
Robin_Watts	tor8: I want to start off only reading a small amount (cos for legal streams, there is no point in reading too much).	15:32.32
	but if the token doesn't turn up, I don't want to be stuck reading just a few bytes at a time, cos it'll be slow.	15:32.54
	so I gradually ramp up.	15:33.00
tor8	right, so it just ramps up until it asks to fill the whole state->buffer	15:33.40
Robin_Watts	It's an attempt to be efficient in both legal and illegal cases, yes.	15:33.47
tor8	memcmp rather than strncmp?	15:34.40
	or you use the 'be friendly to strcmp' terminator to not look at uninitialized data?	15:35.10
Robin_Watts	could do.	15:35.11
tor8	but you always make sure we have 9 bytes in there anyway, so memcmp would be better I think	15:35.37
Robin_Watts	no, memcmp would be fine, cos I am sure we won't overrun.	15:35.37
	yeah, will fix.	15:35.43
tor8	Robin_Watts: hm, memmem isn't a standard C function	15:36.10
Robin_Watts	memmem ?	15:36.21
tor8	like strstr but for arbitrary memory	15:36.30
	should state->offset really be updated to nbytes_in_buffer - extras?	15:39.46
	you always seek to the state->offset at the top	15:39.58
	so won't that be reading the 'extra' bytes twice?	15:41.41
Robin_Watts	we seek to state->offset.	15:42.00
	We have state->extras bytes in the buffer already.	15:42.11
tor8	you shift the extra bytes to the start of the buffer and continue appending bytes read from offset into the buffer	15:42.49
Robin_Watts	We end up with a total of nbytes_in_buffer bytes in the buffer.	15:42.51
	Which means we've actually read nbytes_in_buffer - state->extras bytes since state->offset	15:43.20
	so that's how far we move state->offset.	15:43.31
tor8	right.	15:44.03
	I see now what you're doing.	15:44.07
	it's ... too complicated for me :)	15:44.12
Robin_Watts	I wrote this on a plane a couple of months ago.	15:44.22
	I need to be slightly dehydrated and overly tired to get myself into the right frame of mind for it :)	15:44.46
tor8	I think the code is good to go, but I'm not going to trust my judgement :)	15:44.48
Robin_Watts	I'll run another cluster test.	15:44.56
tor8	I'm not too happy about how complicated it looks though, but that's just my usual griping	15:45.16
Robin_Watts	tor8: You know me - if it can be simplified, I'll gratefully take it.	15:45.45
tor8	musl has a nice memmem implementation we could copy	15:48.41
	that would likely be faster than our linear scan	15:50.01
throstur	no windows installer, right?	15:50.22
tor8	throstur: the mupdf windows release is a portable (no installation needed) EXE file	15:50.54
	just drop it where you want and run it	15:51.08
throstur	yes, I realize this, I was just wondering if there existed an installer anyway (because I am lazy), but the point is moot now as I've already set up a directory structure for portable apps	15:51.28
tor8	it does install a few registry keys to register itself as being able to open PDF files	15:51.53
	but no, no installer exists for it	15:52.27
throstur	what's the difference between mupdf and mupdf-gl	15:52.39
	the registry keys are installed at runtime?	15:52.45
	opengl... right, not important	15:53.16
tor8	throstur: mupdf-gl has more features, but may not run on all systems so we still ship the old GDI-based mupdf viewer as well.	15:54.02
throstur	cool	15:54.11
	is there a way to auto-refresh documents withuot sending SIGHUP?	15:54.19
tor8	if mupdf-gl works, it's the better choice	15:54.24
	'r' key reloads the document in mupdf-gl	15:55.00
throstur	OK, so in short I need to use a FileSystemWatcher wrapper if I want auto refresh	15:55.36
	fair enough	15:55.43
	can't get vi keybindings and autorefresh in one package on windows, such a shame	15:55.55
tor8	throstur: it shouldn't be too difficult to hack in a file stamp watcher if you desperately want it	15:57.03
	throstur: but it could lead to nasty surprises if you're using the (new and upcoming) annotation editing and form filling features	15:57.28
throstur	I can't justify the work for the time saved sadly	15:57.29
	relevant xkcd exists	15:57.38
tor8	yeah. hitting 'r' has worked well enough for me :)	15:57.48
Robin_Watts	tor8: If you want to fiddle with a memmem version then go for it.	16:01.20
	tor8: ok, so that clusterpush had some failures.	16:14.20
	tor8: Ok, cluster pushes with no SEGVs now.	17:00.33
	The bitmap diffs are progressions.	17:00.40
	Updated commit on robin/master	17:00.51
	I figure we can add the use of memmem as a followup if you think it's important.	17:01.10
tor8	Robin_Watts: yeah, it's not crucial now	17:04.24
	I started pulling it out, we can move over to use it later.	17:04.36
Robin_Watts	tor8: If you're happy with the latest iteration, I'll push it once the current clusterpush (which adds a cast to stop a warning) finishes.	17:05.25
tor8	unsigned int extras should maybe be size_t	17:06.02
Robin_Watts	extras <= 11 :)	17:06.18
	a size_t would be overkill.	17:06.26
tor8	fair enough	17:06.29
	state->size = END_CHECK_SIZE>>1 why right shift?	17:06.49
	oh, cause you double it before using	17:07.29
Robin_Watts	that.	17:07.35
tor8	go ahead and push that, I'll maybe tinker with it a bit next week	17:08.03
Robin_Watts	tor8: Cool.	17:08.28
	tor8: Should this be closed now? https://bugs.ghostscript.com/show_bug.cgi?id=698746	17:29.41
	it looks to me like pdf-write.c calls fmt_obj with appropriate crypt stuff to write objects.	17:30.41
	I'm thinking it was fixed by commit fd0bf575229	17:32.12
tor8	it's halfway done I think; we can preserve but not create encrypted files	17:39.38
	which is good enough for me	17:39.42
Robin_Watts	That's enough for the bug.	17:41.01
	so I've closed it.	17:41.08
	sebras: You awake?	18:58.27
*Robin_Watts*	moves to #artifex	18:58.40
	Forward 1 day (to 2018/07/07)>>>

Log of #mupdf at irc.freenode.net.