MuPDF IRC logs

20201203
vtorri	hello		10:47.15
mubot	Welcome to #mupdf, the channel for MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line.		10:47.15
vtorri	does a pdf have a background color? and if yes, is it possible to get it with mupdf?		10:47.50
artifexirc-bot	<KenSharp> Only if the PDF file starts by colouring in the whole page		10:48.16
ator	vtorri: no, there's no such concept in PDF.		10:48.18
vtorri	thank you		10:48.26
artifexirc-bot	<KenSharp> For transparency calculations the page is assumed to be white		10:48.27
	<Robin_Watts> Urm... no.		11:48.24
	<Robin_Watts> For transparency calculations the page is assumed to have zero alpha 🙂		11:49.08
	<Robin_Watts> @ator ping!		11:49.14
	<Robin_Watts> I'm looking into why various annotation types aren't appearing after editing.		11:49.36
	<Robin_Watts> When we do pdf_update_appearance, we write the updated stuff into "local" xrefs unless the object is marked as dirty.		11:50.18
	<ator> @Robin_Watts so many incomplete edit operations that throw "Can't alter..." exceptions it's hard to try different things		11:51.32
	<Robin_Watts> And we don't set the object to be dirty on editing.		11:52.00
	<ator> we don't set 'dirty' we set annot->needs_new_ap		11:52.24
	<ator> dirty is a lower level thing for pdfwrite		11:52.37
	<Robin_Watts> We call "pdf_dirty_annot"		11:53.04
	<Robin_Watts> and that, as you say, sets needs_new_ap.		11:53.20
	<Robin_Watts> and it dirties the doc.		11:53.24
	<ator> yup. and that doesn't set the dirty flag (I was just checking this function)		11:53.27
	<Robin_Watts> but it doesn't dirty the annot->obj		11:53.38
	<ator> the annot->obj dirty flag is used to detect JS changes		11:54.06
	<Robin_Watts> We really need clear definitions in the code of what these different dirty flags do.		11:54.45
	<Robin_Watts> cos I know I'll never remember precisely.		11:54.53
	<ator> I'm in complete agreement there!		11:55.03
	<ator> see tor/undo for a tweak that fixes the ink stuff		11:56.05
	<Robin_Watts> Can we make annot->obj be set to be dirty when edited?		11:56.07
	<Robin_Watts> Ah, ok.		11:56.56
	<ator> I don't remember all the details, but that didn't work last time (which is why I had to add a needs_new_ap flag)		11:57.01
	<Robin_Watts> OK.		11:57.05
	<ator> I think we mixed up all kinds of things in the obj->dirty flag and that didn't work out too well		11:57.25
	<ator> too many things were setting it and it was not good if we clear it too early		11:57.55
	<ator> synthesizing the appearance would set it again, etc.		11:58.09
	<ator> just general fragility		11:58.19
	<Robin_Watts> right. that seems to work for me.		11:58.27
	<ator> the comment should be moved right along with the change		11:58.49
	<ator> since we need an explicit start operation for every change, I think we're going to have a big job of adding those to every annot and doc editing/property changing function		11:59.34
	<Robin_Watts> Yes.		11:59.55
	<ator> or we're gonna get random "Sorry, you can't do that Dave!" popping up every now and then :D		12:00.06
	<Robin_Watts> The other problem this throws up is pdf_get_annotation_obj		12:00.32
	<ator> I do like making the undo operations be automatic, and I see now why you make them nested because of the 'create new annot' and maybe later 'copy & paste' of annots changing multiple properties that should just be one undo step		12:01.04
	<Robin_Watts> With the local xref in, if we're going to fiddle with an annotation, we need to ensure the local xref is pushed/popped around such fiddling.		12:01.07
	<Robin_Watts> And that goes out the window if we let callers access the annotation object.		12:01.42
	<ator> Why? Do we want to use the local xref for more than just the "synthesize the appearance when it's missing" case?		12:01.53
	<Robin_Watts> No, we only want to use it in that case.		12:02.16
	<Robin_Watts> but if an app can say "give me the object for this annotation", and that annotation happens to be one with a local xref in force, then the object they get back will be problematic unless that local_xref is pushed at the time they try to access it.		12:02.57
	<ator> oh yeah, because we update the Rect as well as the AP		12:03.51
	<ator> pdf_obj_ref has a 'doc' entry		12:04.38
	<Robin_Watts> ANYTHING accessed through that obj will end up in the local xref.		12:04.51
	<ator> but we can't really make that hold the local_xref too trivially		12:05.16
	<Robin_Watts> (I can't remember WHY we added the facility to get object)		12:05.41
	<Robin_Watts> It's something we did since 1.18		12:05.52
	<ator> you mean for the java bindings?		12:06.00
	<Robin_Watts> either for the java bindings or for C.		12:06.16
	<ator> the C stuff uses the 'obj' internally and everywhere basically for the annot property accessors		12:07.01
	<ator> we can make those accessors behave properly with the local_xref		12:07.12
	<ator> but raw use of the annot->obj could be problematic if the user wants to resolve a reference to a local object		12:07.43
	<Robin_Watts> Right.		12:08.13
	<ator> I'm trying to think of a way we can associate the local_xref with the dictionary or array or indirect reference so that resolving would be automatic without needing to push		12:08.25
	<Robin_Watts> I don't mind us internally using annot->obj		12:08.31
	<Robin_Watts> cos we can allow for that in the code.		12:08.38
	<ator> we have a 'doc' reference to them, which is used to mark the document as dirty and locate stream data		12:08.46
	<Robin_Watts> I do mind there being an external API for people to use to get it.		12:08.50
	<ator> I think the SO folks asked for it, but I can't remember exactly why. maybe @sebras knows?		12:09.35
	<Robin_Watts> I'll hunt back through git in a bit.		12:09.49
	<ator> if it's just to have a unique identifier, we could return the object number as an int/ID instead		12:09.53
	<Robin_Watts> I'm slightly confused by your patch...		12:10.03
	<ator> if that's all they use it for		12:10.05
	<Robin_Watts> Now, if needs_new_ap is clear, but js has altered stuff we no longer set local_appearance to be 0.		12:10.47
	<ator> I can't find any callers of getObject in any of our projects		12:10.51
	<ator> if JS has altered stuff, then we've triggered changes in the document which should be reflected		12:11.16
	<ator> if so, then we should set the local_appearance to 1 as well yeah		12:11.44
	<Robin_Watts> to 0.		12:11.56
	<ator> yes, right.		12:12.08
	<Robin_Watts> So I'll change the ordering of those if's.		12:12.20
	<ator> yeah. that should work.		12:12.31
	<Robin_Watts> Returning the object num is not so good.		12:13.21
	<Robin_Watts> If an object moves to the local_xref the number will change.		12:13.47
	<Robin_Watts> And if 2 annots both have local_xref objects, they'll have the same number.		12:13.59
	<ator> the annot dictionary should never live in the local_xref		12:14.24
	<ator> only stuff it holds (such as the AP and Rect properties)		12:14.33
	<ator> or do we move the entire annot to the local_xref or make a copy of it there so we can change it? I'm confusing myself.		12:15.25
	<Robin_Watts> The latter, I think.		12:15.55
	<ator> we push_local_xref when updating the appearance		12:16.12
	<ator> that means any new objects or edits to the current object move it into a new incremental section right?		12:16.27
	<ator> that means any new objects or edits to existing objects move it into a new incremental section right?		12:16.39
	<Robin_Watts> <thinking>		12:17.12
	<ator> if so, any accesses to the annot properties (or at least the ones affected by appearance synthesis, such as Rect and Matrix and AP) should be preceded with a push_local_xref		12:17.37
	<Robin_Watts> Yes.		12:18.07
	<Robin_Watts> I attempt to do that.		12:18.13
	<Robin_Watts> (whenever we run_annot, for instance)		12:18.20
	<Robin_Watts> While we have a local_xref, any new objects certainly go into it.		12:20.14
	<Robin_Watts> When we go to edit any object, we call prepare_object_for_alteration.		12:20.48
	<Robin_Watts> That previously ensured that the whole object was in the latest xref.		12:21.01
	<Robin_Watts> That previously ensured that the whole object was in the latest xref, by calling pdf_xref_ensure_incremental_object		12:21.17
	<Robin_Watts> I'm thinking that if we have a local_xref, we want the whole object to be in the local_xref, NOT in the incremental_xref.		12:22.28
	<Robin_Watts> but I can't see where that is achieved at the moment (maybe it isn't, and I should fix that)		12:22.43
	<ator> When we manually edit the annot->obj by setting a property, we want to move the whole object out of the local_xref.		12:27.57
	<Robin_Watts> We do.		12:28.16
	<Robin_Watts> and I think prepare_object_for_alteration is the right place to do that.		12:28.49
	<ator> Maybe we just shouldn't worry about it too much, the appearance related stuff should be regenerated in the global xref at the next appearance update		12:29.02
	<Robin_Watts> We do want to do that, yes.		12:29.05
	<ator> I'm thinking about the case when we have an annotation without an AP, we generate a local AP for it, then after the user edits it, we want to generate a global AP		12:29.44
	<Robin_Watts> In that case, we discard the local_xref and build the global one, I believe.		12:30.17
	<ator> shouldn't there be a delete_local_xref in the else to if (local_appearance) in pdf_update_appearance?		12:55.24
	<ator> why do you guard the dict_put and update_xobject with if (!local_appearance)?		12:56.24
	<Robin_Watts> ator: Yes, there should. It's in the version I'm editing at the moment.		12:56.49
	<Robin_Watts> let me look.		12:57.09
	<Robin_Watts> I'm sure that made sense at the time. Probably not now that I'm deep copying the object.		12:58.25
	<Robin_Watts> Give me a bit to bash on this and prepare you a new version.		12:58.41
	<ator> Ok!		12:58.50
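
The overlay behaviour discussed above (objects written while a local xref is pushed are only visible while it is pushed, and a user edit should discard the local copy and write to the real document) can be illustrated with a toy model. This is only a sketch: ToyDoc and all of its methods are hypothetical names invented for illustration, not MuPDF APIs, and the real xref handling is considerably more involved.

```python
# Toy model only: none of these names are MuPDF functions or structures.
class ToyDoc:
    def __init__(self, objects):
        self.base = dict(objects)   # object number -> value (the real document)
        self.local = None           # the overlay standing in for a "local xref"
        self.local_active = False   # True between push_local() and pop_local()

    def push_local(self):
        # Create the overlay on first use and make it the active target.
        if self.local is None:
            self.local = {}
        self.local_active = True

    def pop_local(self):
        # Keep the overlay, but stop consulting it.
        self.local_active = False

    def drop_local(self):
        # Throw the overlay away, e.g. once the user edits the annotation.
        self.local = None
        self.local_active = False

    def load(self, num):
        # The overlay wins, but only while it is pushed.
        if self.local_active and num in self.local:
            return self.local[num]
        return self.base[num]

    def store(self, num, value):
        # Writes land in the overlay while it is pushed, else in the document.
        target = self.local if self.local_active else self.base
        target[num] = value


doc = ToyDoc({7: {"Subtype": "Ink"}})

# Synthesize a missing appearance without touching the real document.
doc.push_local()
doc.store(7, {"Subtype": "Ink", "AP": "synthesized"})
doc.pop_local()

outside = doc.load(7)   # a caller holding the object without the push
doc.push_local()
inside = doc.load(7)    # the same lookup with the overlay pushed
doc.pop_local()

# A user edit: discard the overlay and write a "global" appearance.
doc.drop_local()
doc.store(7, {"Subtype": "Ink", "AP": "global"})
edited = doc.load(7)
```

In this model a caller that obtains object 7 outside the push/pop pair sees no AP at all, which is the kind of mismatch that makes handing out the raw annotation object risky.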
*clam*	also bashed (yesterday) - https://repo.or.cz/llpp.git/blob/master:/ahbs - beauty that is fully functional and useful		13:18.13
malc_	ator: after getting rid of "native" target in llpp (https://repo.or.cz/llpp.git/commit/fb9db17eabfd561ad4e39180d4c27754d6beb06a) the question arose what is the build=native purported Raison d’Être in mupdf?		13:46.32
*malc_*	and now that i have shown off my awesome french skillz, it's time to go outside and buy some candy... brb		13:47.39
*malc_*	well, if not an answer, i now have, sweets		14:50.04
artifexirc-bot	<Robin_Watts> @ator we can try/catch within an always block without buggering up any existing error, can't we?		15:02.06
	<Robin_Watts> yeah. phew.		15:03.57
ator	we clobber the error code and message though		15:21.09
	see commit 8d6d78246 which adds a warning when that happens		15:21.23
malc_	ator: anything?		15:28.30
ator	malc_: it's been there for as long as I can remember		15:37.54
	it may have been for your benefit?		15:38.37
malc_	ator: my? as in		15:39.45
	let m		15:39.59
	damn HHKB i keep missing keys		15:40.08
	let me brush up my git archaeological skills		15:40.42
	ator: i disclaim all responsibility, it was Barzini(you) all along 42a328bc17dbbfa1f53e069e77b5b9fec793a32a		15:43.56
ator	yes, but why did I do it? :)		15:44.39
malc_	do you realize that asking random ruskies on the internet will not shed any light on this mystery :)		15:46.17
ator	it was a rhetorical question!		15:46.29
	no answer required, nor expected		15:46.40
	the mystery will remain mysterious		15:46.46
malc_	rhetorical questions along with sarcasm rarely survive IRC		15:47.07
	but, blaming me!		15:47.36
	that's low		15:47.43
	ator: you probably have mistaken me for polish Sumatra authors or something		15:49.32
ator	or maybe it was an experiment that lived too long		15:55.34
malc_	in which case the blame put on me had no excuse whatsoever		15:56.28
ator	you were using it, so I haven't purged it!		15:56.41
malc_	touche		15:57.51
*malc_*	gee.. do i suddenly feel important...		15:58.28
sebras	when we're writing an encrypted PDF we used to have some heuristics or something somewhere to skip over certain objects. did this include the signature object itself? or what?		17:34.12
ator	sebras: the num == opts->crypt_object_number argument to writeobject in dowriteobject in pdf-write.c		17:58.04
sebras	@ator ah!		18:00.50
	@ator I found something interesting.		18:01.08
	I added signature creation to debug paul's issue.		18:02.21
	signature creation in mupdf-gl.		18:04.05
	when I'm using that to add a new signature widget that works fine.		18:04.15
	at that point the signature dictionary contains no value.		18:04.41
	later on I click the widget and sign it with a certificate I generated myself.		18:04.59
	at this point the signature dictionary has a value added where we have /Contents<00000000000...>		18:05.28
	that's ok because when we save the file later we compute the digest and patch it into /Contents.		18:07.06
	but first we need to save all the objects. when we do so we're encrypting the 000000000000.... value from above, even though we know we will later overwrite this.		18:08.08
	and the encrypted /Contents is longer than the original 000000... value.		18:09.05
	I'm surprised that the encryption can _extend_ strings..?		18:15.18
	oh, it might be the initial random string when doing AES encryption....		18:17.36
artifexirc-bot	<ator> yeah, the block headers in AES encryption		18:47.33
sebras	no the iv string.		18:49.50
	@ator this caused a later problem because when we complete the signatures I compared the length of /Contents in memory with the length of the string in the saved file.		18:50.45
	and of course they differed.		18:50.48
	because of the iv string.		18:50.52
	so now I'm updating pdf_write_digest() such that it writes 0x00 over any bytes in /Contents<string> not occupied by the digest itself.		18:51.31
	if we were to not encrypt /Contents<string> for signature objects that might be another approach.		18:52.02
	I'm not sure whether that is correct or not yet.		18:52.15
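
The length growth described above follows from how PDF's AES string encryption works: CBC mode with a 16-byte initialization vector stored in front of the ciphertext, and padding that always adds between 1 and 16 bytes. A minimal sketch of the arithmetic follows; the function name and the placeholder size are made up for illustration and are not taken from MuPDF.

```python
AES_BLOCK = 16  # AES block size in bytes; the IV has the same size


def aes_cbc_encrypted_length(plain_len: int) -> int:
    """Length of a PDF string after AES-CBC encryption.

    The padding always adds 1..16 bytes, so the padded length is the next
    multiple of 16 strictly greater than plain_len. The 16-byte IV is then
    stored in front of the ciphertext.
    """
    padded = (plain_len // AES_BLOCK + 1) * AES_BLOCK
    return AES_BLOCK + padded


placeholder_len = 8192  # hypothetical size of the all-zero /Contents placeholder
encrypted_len = aes_cbc_encrypted_length(placeholder_len)
growth = encrypted_len - placeholder_len
```

Because the encrypted string is always longer than the plaintext, comparing the in-memory length of /Contents with the length found in the saved file will report a mismatch whenever the placeholder was encrypted.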
artifexirc-bot	<ator> I don't know the right solution either. will padding with 0 be correct? I would have assumed we should close the string and scan for > (end of string) and overwrite with space or something		19:01.34
sebras	that's another option.		19:02.55
artifexirc-bot	<ator> we could also do something similar to is_xml_metadata to detect signature objects and not encrypt the Contents key		19:03.11
sebras	but in that case we need to move the delimiting > at the end to an earlier offset.		19:03.21
artifexirc-bot	<ator> wouldn't that be better than filling the signature contents with "garbage" at the end?		19:04.17
sebras	I'm inclined to agree, but @paulgardiner might not agree.		19:05.04
artifexirc-bot	<ator> I feel it's probably better to fix it when writing the object the first time		19:06.22
	<ator> by not encrypting it		19:06.26
	<paulgardiner> I'm not against moving the delimiter		19:06.35
	<ator> (or should it be encrypted?)		19:06.36
sebras	no, it shouldn't be encrypted.		19:06.46
artifexirc-bot	<paulgardiner> I'm not sure why I didn't do that.		19:06.56
sebras	we know we're overwriting the digest anyhow.		19:06.59
artifexirc-bot	<ator> @sebras then I'd say fix it in writeobject :)		19:07.20
	<ator> but now, dinner and no more work for the day!		19:07.30
	<paulgardiner> I believe it's ASN1 and the reader won't read past the end		19:07.43
	<paulgardiner> A simple fix for now might be good to get the release out.		19:08.47
sebras	@paulgardiner if we're leaving a hole of unnecessary bytes, whether they are 00 or spaces, between the necessary digest bytes and the start of the second byte range covered by the digest, I believe we might be opening up for the file to be structurally modified even though the digest still matches.		19:11.57
	essentially we have bytes in the middle that can be anything without the digest failing.		19:12.40
	and those bytes could be a > to terminate the string and a >> to terminate the dictionary and possibly starting a new object.		19:13.13
	that would be a problem, right?		19:13.23
	@paulgardiner I did implement the zeroing solution just now, but I was unable to get acroread to open that file for me.		19:14.01
	could be because I'm running a stone age acroread, or the windows emulation layer or that this approach is not acceptable to acroread.		19:14.43
	more investigation tomorrow.		19:14.50
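
The concern raised above, that bytes in the hole between the two signed byte ranges can change without the digest noticing, can be demonstrated in a few lines. This is a simplified sketch under stated assumptions: the file layout, the use of SHA-256 and the helper name are illustrative only and do not reflect how pdf_write_digest or a real signature container works.

```python
import hashlib


def digest_over_ranges(data: bytes, byte_ranges) -> str:
    """Hash only the listed (offset, length) ranges, skipping everything else."""
    h = hashlib.sha256()
    for offset, length in byte_ranges:
        h.update(data[offset:offset + length])
    return h.hexdigest()


before = b"1 0 obj << /Type /Sig /Contents "
after = b" >> endobj"

# The hole: a short digest padded with zeros up to the reserved size.
hole = b"<AABBCC0000000000>"
# Same size, but the string is closed early and extra syntax follows it.
tampered_hole = b"<AABBCC> /Evil 1".ljust(len(hole), b" ")

original = before + hole + after
tampered = before + tampered_hole + after

# Two ranges: everything before the hole and everything after it.
ranges = [(0, len(before)), (len(before) + len(hole), len(after))]

same_digest = digest_over_ranges(original, ranges) == digest_over_ranges(tampered, ranges)
```

Here same_digest is true even though the two files differ, so any slack left inside the hole is space a modified file can use without invalidating the digest.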

Log of #mupdf at irc.freenode.net.