| <<<Back 1 day (to 2020/12/02) | Fwd 1 day (to 2020/12/04)>>> | 20201203 |
vtorri | hello | 10:47.15 |
mubot | Welcome to #mupdf, the channel for MuPDF. If you have a question, please ask it, don't ask to ask it. Do be prepared to wait for a reply as devs will check the logs and reply when they come on line. | 10:47.15 |
vtorri | does a pdf have a background color ? and if yes, is it possible to get it with mupdf ? | 10:47.50 |
artifexirc-bot | <KenSharp> Only if the PDF file starts by colouring in the whole page | 10:48.16 |
ator | vtorri: no, there's no such concept in PDF. | 10:48.18 |
vtorri | thank you | 10:48.26 |
artifexirc-bot | <KenSharp> For transparency calculations the page is assumed to be white | 10:48.27 |
| <Robin_Watts> Urm... no. | 11:48.24 |
| <Robin_Watts> For transparency calculations the page is assumed to have zero alpha 🙂 | 11:49.08 |
| <Robin_Watts> @ator ping! | 11:49.14 |
| <Robin_Watts> I'm looking into why various annotation types aren't appearing after editing. | 11:49.36 |
| <Robin_Watts> When we do pdf_update_appearance, we write the updated stuff into "local" xrefs unless the object is marked as dirty. | 11:50.18 |
| <ator> @Robin_Watts so many incomplete edit operations that throw "Can't alter..." exceptions it's hard to try different things | 11:51.32 |
| <Robin_Watts> And we don't set the object to be dirty on editing. | 11:52.00 |
| <ator> we don't set 'dirty' we set annot->needs_new_ap | 11:52.24 |
| <ator> dirty is a lower level thing for pdfwrite | 11:52.37 |
| <Robin_Watts> We call "pdf_dirty_annot" | 11:53.04 |
| <Robin_Watts> and that, as you say, sets needs_new_ap. | 11:53.20 |
| <Robin_Watts> and it dirties the doc. | 11:53.24 |
| <ator> yup. and that doesn't set the dirty flag (I was just checking this function) | 11:53.27 |
| <Robin_Watts> but it doesn't dirty the annot->obj | 11:53.38 |
| <ator> the annot->obj dirty flag is used to detect JS changes | 11:54.06 |
| <Robin_Watts> We really need clear definitions in the code of what these different dirty flags do. | 11:54.45 |
| <Robin_Watts> cos I know I'll never remember precisely. | 11:54.53 |
| <ator> I'm in complete agreement there! | 11:55.03 |
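[Editor's sketch] The flags discussed above can be summarised in a toy model. This is not MuPDF's code: the `toy_*` types and `toy_dirty_annot` are hypothetical names, and the comments only restate what the chat says about `pdf_dirty_annot`, `needs_new_ap`, and the object-level dirty flag.

```c
#include <stdbool.h>

/* Toy model only: these are NOT MuPDF's real structs. */
typedef struct {
    bool dirty;          /* object-level flag: per the chat, a lower-level
                            thing for pdfwrite, also used to detect JS changes */
} toy_obj;

typedef struct {
    bool dirty;          /* document-level flag: something in the doc changed */
} toy_doc;

typedef struct {
    toy_doc *doc;
    toy_obj *obj;
    bool needs_new_ap;   /* the appearance stream must be regenerated */
} toy_annot;

/* Mirrors the behaviour described for pdf_dirty_annot in the chat:
 * it sets needs_new_ap and dirties the document, but it does not
 * touch the dirty flag on annot->obj. */
static void toy_dirty_annot(toy_annot *annot)
{
    annot->needs_new_ap = true;
    annot->doc->dirty = true;
}
```

Keeping the three flags as separate fields, each with a one-line comment, is the kind of "clear definition in the code" the participants ask for.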
| <ator> see tor/undo for a tweak that fixes the ink stuff | 11:56.05 |
| <Robin_Watts> Can we make annot->obj be set to be dirty when edited? | 11:56.07 |
| <Robin_Watts> Ah, ok. | 11:56.56 |
| <ator> I don't remember all the details, but that didn't work last time (which is why I had to add a needs_new_ap flag) | 11:57.01 |
| <Robin_Watts> OK. | 11:57.05 |
| <ator> I think we mixed up all kinds of things in the obj->dirty flag and that didn't work out too well | 11:57.25 |
| <ator> too many things were setting it and it was not good if we clear it too early | 11:57.55 |
| <ator> synthesizing the appearance would set it again, etc. | 11:58.09 |
| <ator> just general fragility | 11:58.19 |
| <Robin_Watts> right. that seems to work for me. | 11:58.27 |
| <ator> the comment should be moved right along with the change | 11:58.49 |
| <ator> since we need an explicit start operation for every change, I think we're going to have a big job of adding those to every annot and doc editing/property changing function | 11:59.34 |
| <Robin_Watts> Yes. | 11:59.55 |
| <ator> or we're gonna get random "Sorry, you can't do that Dave!" popping up every now and then :D | 12:00.06 |
| <Robin_Watts> The other problem this throws up is pdf_get_annotation_obj | 12:00.32 |
| <ator> I do like making the undo operations be automatic, and I see now why you make them nested because of the 'create new annot' and maybe later 'copy & paste' of annots changing multiple properties that should just be one undo step | 12:01.04 |
| <Robin_Watts> With the local xref in, if we're going to fiddle with an annotation, we need to ensure the local xref is pushed/popped around such fiddling. | 12:01.07 |
| <Robin_Watts> And that goes out the window if we let callers access the annotation object. | 12:01.42 |
| <ator> Why? Do we want to use the local xref for more than just the appearance synthesis when it's missing case? | 12:01.53 |
| <Robin_Watts> No, we only want to use it in that case. | 12:02.16 |
| <Robin_Watts> but if an app can say "give me the object for this annotation", and that annotation happens to be one with a local xref in force, then the object they get back will be problematic unless that local_xref is pushed at the time they try to access it. | 12:02.57 |
| <ator> oh yeah, because we update the Rect as well as the AP | 12:03.51 |
| <ator> pdf_obj_ref has a 'doc' entry | 12:04.38 |
| <Robin_Watts> ANYTHING accessed through that obj will end up in the local xref. | 12:04.51 |
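[Editor's sketch] The push/pop concern can be illustrated with a toy lookup. It assumes that, while a local xref is pushed, it is consulted before the document's main xref; all `lx_*` names are hypothetical and the tables are stand-ins, not MuPDF's data structures.

```c
#include <stddef.h>

#define LX_XREF_LEN 8

/* Toy xref: object number -> object (here just a string), NULL if absent. */
typedef struct {
    const char *entries[LX_XREF_LEN];
} lx_xref;

typedef struct {
    lx_xref main_xref;
    lx_xref *local_xref;   /* non-NULL only between push and pop */
} lx_doc;

static void lx_push_local_xref(lx_doc *doc, lx_xref *local)
{
    doc->local_xref = local;
}

static void lx_pop_local_xref(lx_doc *doc)
{
    doc->local_xref = NULL;
}

/* Resolve an object number: the pushed local xref wins, then the main xref. */
static const char *lx_resolve(const lx_doc *doc, int num)
{
    if (num < 0 || num >= LX_XREF_LEN)
        return NULL;
    if (doc->local_xref && doc->local_xref->entries[num])
        return doc->local_xref->entries[num];
    return doc->main_xref.entries[num];
}
```

In this model, a caller holding a reference to an object that exists only in the local table gets NULL unless the matching push is in force, which is the hazard of handing the raw annotation object to external code.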
| <ator> but we can't really make that hold the local_xref too trivially | 12:05.16 |
| <Robin_Watts> (I can't remember WHY we added the facility to get object) | 12:05.41 |
| <Robin_Watts> It's something we did since 1.18 | 12:05.52 |
| <ator> you mean for the java bindings? | 12:06.00 |
| <Robin_Watts> either for the java bindings or for C. | 12:06.16 |
| <ator> the C stuff uses the 'obj' internally and everywhere basically for the annot property accessors | 12:07.01 |
| <ator> we can make those accessors behave properly with the local_xref | 12:07.12 |
| <ator> but raw use of the annot->obj could be problematic if the user wants to resolve a reference to a local object | 12:07.43 |
| <Robin_Watts> Right. | 12:08.13 |
| <ator> I'm trying to think of a way we can associate the local_xref with the dictionary or array or indirect reference so that resolving would be automatic without needing to push | 12:08.25 |
| <Robin_Watts> I don't mind us internally using annot->obj | 12:08.31 |
| <Robin_Watts> cos we can allow for that in the code. | 12:08.38 |
| <ator> we have a 'doc' reference to them, which is used to mark the document as dirty and locate stream data | 12:08.46 |
| <Robin_Watts> I do mind there being an external API for people to use to get it. | 12:08.50 |
| <ator> I think the SO folks asked for it, but I can't remember exactly why. maybe @sebras knows? | 12:09.35 |
| <Robin_Watts> I'll hunt back through git in a bit. | 12:09.49 |
| <ator> if it's just to have a unique identifier, we could return the object number as an int/ID instead | 12:09.53 |
| <Robin_Watts> I'm slightly confused by your patch... | 12:10.03 |
| <ator> if that's all they use it for | 12:10.05 |
| <Robin_Watts> Now, if needs_new_ap is clear, but js has altered stuff we no longer set local_appearance to be 0. | 12:10.47 |
| <ator> I can't find any callers of getObject in any of our projects | 12:10.51 |
| <ator> if JS has altered stuff, then we've triggered changes in the document which should be reflected | 12:11.16 |
| <ator> if so, then we should set the local_appearance to 1 as well yeah | 12:11.44 |
| <Robin_Watts> to 0. | 12:11.56 |
| <ator> yes, right. | 12:12.08 |
| <Robin_Watts> So I'll change the ordering of those if's. | 12:12.20 |
| <ator> yeah. that should work. | 12:12.31 |
| <Robin_Watts> Returning the object num is not so good. | 12:13.21 |
| <Robin_Watts> If an object moves to the local_xref the number will change. | 12:13.47 |
| <Robin_Watts> And if 2 annots both have local_xref objects, they'll have the same number. | 12:13.59 |
| <ator> the annot dictionary should never live in the local_xref | 12:14.24 |
| <ator> only stuff it holds (such as the AP and Rect properties) | 12:14.33 |
| <ator> or do we move the entire annot to the local_xref or make a copy of it there so we can change it? I'm confusing myself. | 12:15.25 |
| <Robin_Watts> The latter, I think. | 12:15.55 |
| <ator> we push_local_xref when updating the appearance | 12:16.12 |
| <ator> that means any new objects or edits to existing objects move it into a new incremental section right? | 12:16.39 |
| <Robin_Watts> <thinking> | 12:17.12 |
| <ator> if so, any accesses to the annot properties (or at least the ones affected by appearance synthesis, such as Rect and Matrix and AP) should be preceded with a push_local_xref | 12:17.37 |
| <Robin_Watts> Yes. | 12:18.07 |
| <Robin_Watts> I attempt to do that. | 12:18.13 |
| <Robin_Watts> (whenever we run_annot, for instance) | 12:18.20 |
| <Robin_Watts> While we have a local_xref, any new objects certainly go into it. | 12:20.14 |
| <Robin_Watts> When we go to edit any object, we call prepare_object_for_alteration. | 12:20.48 |
| <Robin_Watts> That previously ensured that the whole object was in the latest xref, by calling pdf_xref_ensure_incremental_object | 12:21.17 |
| <Robin_Watts> I'm thinking that if we have a local_xref, we want the whole object to be in the local_xref, NOT in the incremental_xref. | 12:22.28 |
| <Robin_Watts> but I can't see where that is achieved at the moment (maybe it isn't, and I should fix that) | 12:22.43 |
| <ator> When we manually edit the annot->obj by setting a property, we want to move the whole object out of the local_xref. | 12:27.57 |
| <Robin_Watts> We do. | 12:28.16 |
| <Robin_Watts> and I think prepare_object_for_alteration is the right place to do that. | 12:28.49 |
| <ator> Maybe we just shouldn't worry about it too much, the appearance related stuff should be regenerated in the global xref at the next appearance update | 12:29.02 |
| <Robin_Watts> We do want to do that, yes. | 12:29.05 |
| <ator> I'm thinking about the case when we have an annotation without an AP, we generate a local AP for it, then after the user edits it, we want to generate a global AP | 12:29.44 |
| <Robin_Watts> In that case, we discard the local_xref and build the global one, I believe. | 12:30.17 |
| <ator> shouldn't there be a delete_local_xref in the else to if (local_appearance) in pdf_update_appearance? | 12:55.24 |
| <ator> why do you guard the dict_put and update_xobject with if (!local_appearance)? | 12:56.24 |
| <Robin_Watts> ator: Yes, there should. It's in the version I'm editing at the moment. | 12:56.49 |
| <Robin_Watts> let me look. | 12:57.09 |
| <Robin_Watts> I'm sure that made sense at the time. Probably not now that I'm deep copying the object. | 12:58.25 |
| <Robin_Watts> Give me a bit to bash on this and prepare you a new version. | 12:58.41 |
| <ator> Ok! | 12:58.50 |
clam | also bashed (yesterday) - https://repo.or.cz/llpp.git/blob/master:/ahbs - beauty that is fully functional and useful | 13:18.13 |
malc_ | ator: after getting rid of "native" target in llpp (https://repo.or.cz/llpp.git/commit/fb9db17eabfd561ad4e39180d4c27754d6beb06a) the question arose what is the build=native purported Raison d’Être in mupdf? | 13:46.32 |
malc_ | and now that i have shown off my awesome french skillz, it's time to go outside and buy some candy... brb | 13:47.39 |
malc_ | well, if not an answer, i now have, sweets | 14:50.04 |
artifexirc-bot | <Robin_Watts> @ator we can try/catch within an always block without buggering up any existing error, can't we? | 15:02.06 |
| <Robin_Watts> yeah. phew. | 15:03.57 |
ator | we clobber the error code and message though | 15:21.09 |
| see commit 8d6d78246 which adds a warning when that happens | 15:21.23 |
malc_ | ator: anything? | 15:28.30 |
ator | malc_: it's been there for as long as I can remember | 15:37.54 |
| it may have been for your benefit? | 15:38.37 |
malc_ | ator: my? as in | 15:39.45 |
| let m | 15:39.59 |
| damn HHKB i keep missing keys | 15:40.08 |
| let me brush up my git archaeological skills | 15:40.42 |
| ator: i disclaim all responsibility, it was Barzini(you) all along 42a328bc17dbbfa1f53e069e77b5b9fec793a32a | 15:43.56 |
ator | yes, but *why* did I do it? :) | 15:44.39 |
malc_ | do *you* realize that asking random ruskies on the internet will not shed any light on this mystery :) | 15:46.17 |
ator | it was a rhetorical question! | 15:46.29 |
| no answer required, nor expected | 15:46.40 |
| the mystery will remain mysterious | 15:46.46 |
malc_ | rhetorical questions along with sarcasm rarely survive IRC | 15:47.07 |
| but, blaming me! | 15:47.36 |
| that's low | 15:47.43 |
| ator: you probably have mistaken me for polish Sumatra authors or something | 15:49.32 |
ator | or maybe it was an experiment that lived too long | 15:55.34 |
malc_ | in which case the blame put on me had no excuse whatsoever | 15:56.28 |
ator | you were using it, so I haven't purged it! | 15:56.41 |
malc_ | touche | 15:57.51 |
malc_ | gee.. do i suddenly feel important... | 15:58.28 |
sebras | when we're writing an encrypted PDF we used to have some heuristics or something somewhere to skip over certain objects. did this include the signature object itself? or what? | 17:34.12 |
ator | sebras: the num == opts->crypt_object_number argument to writeobject in dowriteobject in pdf-write.c | 17:58.04 |
sebras | @ator ah! | 18:00.50 |
| @ator I found something interesting. | 18:01.08 |
| I added signature creation to debug pauls issue. | 18:02.21 |
| signature creation in mupdf-gl. | 18:04.05 |
| when I'm using that to add a new signature widget that works fine. | 18:04.15 |
| at that point the signature dictionary contains no value. | 18:04.41 |
| later on I click the widget and sign it with a certificate I generated myself. | 18:04.59 |
| at this point the signature dictionary has a value added where we have /Contents<00000000000...> | 18:05.28 |
| that's ok because when we save the file later we compute the digest and patch it into /Contents. | 18:07.06 |
| but first we need to save all the objects. when we do so we're encrypting the 000000000000.... value from above, even though we know we will later overwrite this. | 18:08.08 |
| and the encrypted /Contents is longer than the original 000000... value. | 18:09.05 |
| I'm surprised that the encryption can _extend_ strings..? | 18:15.18 |
| oh, it might be the initial random string when doing AES encryption.... | 18:17.36 |
artifexirc-bot | <ator> yeah, the block headers in AES encryption | 18:47.33 |
sebras | no the iv string. | 18:49.50 |
| @ator this caused a later problem because when we complete the signatures I compared the length of /Contents in memory with the length of the string in the saved file. | 18:50.45 |
| and of course they differed. | 18:50.48 |
| because of the iv string. | 18:50.52 |
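[Editor's sketch] The growth sebras observes follows from how PDF applies AES to strings: a 16-byte initialization vector is stored in front of the ciphertext, and the plaintext is always padded up to the next 16-byte block boundary. The helper below is a hypothetical name that only does the arithmetic; it is not a MuPDF function.

```c
#include <stddef.h>

#define AES_BLOCK 16

/* Length in bytes of a PDF string after AES (CBC) encryption:
 * one block for the IV, plus the plaintext padded to a whole number
 * of blocks. Padding is always added (1 to 16 bytes), so an input that
 * is already block-aligned still gains a full extra block. */
static size_t aes_encrypted_len(size_t plain_len)
{
    size_t padded = (plain_len / AES_BLOCK + 1) * AES_BLOCK;
    return AES_BLOCK + padded;
}
```

So a placeholder of n zero bytes is written out as at least n + 17 and at most n + 32 bytes, and its hex-string form in the file is twice that, which is why the in-memory and on-disk lengths of /Contents disagree.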
| so now I'm updating pdf_write_digest() such that it writes 0x00 over any bytes in /Contents<string> not occupied by the digest itself. | 18:51.31 |
| if we were to not encrypt /Contents<string> for signature objects that might be another approach. | 18:52.02 |
| I'm not sure whether that is correct or not yet. | 18:52.15 |
artifexirc-bot | <ator> I don't know the right solution either. will padding with 0 be correct? I would have assumed we should close the string and scan for > (end of string) and overwrite with space or something | 19:01.34 |
sebras | that's another option. | 19:02.55 |
artifexirc-bot | <ator> we could also do something similar to is_xml_metadata to detect signature objects and not encrypt the Contents key | 19:03.11 |
sebras | but in that case we need to move the delimiting > at the end to an earlier offset. | 19:03.21 |
artifexirc-bot | <ator> wouldn't that be better than filling the signature contents with "garbage" at the end? | 19:04.17 |
sebras | I'm inclined to agree, but @paulgardiner might not agree. | 19:05.04 |
artifexirc-bot | <ator> I feel it's probably better to fix it when writing the object the first time | 19:06.22 |
| <ator> by not encrypting it | 19:06.26 |
| <paulgardiner> I'm not against moving the delimiter | 19:06.35 |
| <ator> (or should it be encrypted?) | 19:06.36 |
sebras | no, it shouldn't be encrypted. | 19:06.46 |
artifexirc-bot | <paulgardiner> I'm not sure why I didn't do that. | 19:06.56 |
sebras | we know we're overwriting the digest anyhow. | 19:06.59 |
artifexirc-bot | <ator> @sebras then I'd say fix it in writeobject :) | 19:07.20 |
| <ator> but now, dinner and no more work for the day! | 19:07.30 |
| <paulgardiner> I believe it's ASN1 and the reader won't read past the end | 19:07.43 |
| <paulgardiner> A simple fix for now might be good to get the release out. | 19:08.47 |
sebras | @paulgardiner if we're leaving a hole of unnecessary bytes, whether they are 00 or spaces, between the necessary digest bytes and the start of the second byte range covered by the digest, I believe we might be opening up for the file to be structurally modified even though the digest still matches. | 19:11.57 |
| essentially we have bytes in the middle that can be anything without the digest failing. | 19:12.40 |
| and those bytes could be a > to terminate the string and a >> to terminate the dictionary and possibly starting a new object. | 19:13.13 |
| that would be a problem, right? | 19:13.23 |
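[Editor's sketch] The "hole" is the gap between the two (offset, length) pairs of a signature's /ByteRange: bytes in that gap are not covered by the digest. The sketch below shows how the ranges relate to the `<` and `>` of /Contents, plus one check a verifier could apply to the gap. Both function names are hypothetical, and the chat does not say MuPDF or any other reader performs this check.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Two (offset, length) pairs; everything between them is left unsigned. */
typedef struct { size_t off1, len1, off2, len2; } byte_range;

/* lt is the file offset of the '<' opening /Contents, gt the offset of
 * its closing '>'. Assumes lt < gt < file_len. */
static byte_range byte_range_for_contents(size_t file_len, size_t lt, size_t gt)
{
    byte_range br = { 0, lt, gt + 1, file_len - (gt + 1) };
    return br;
}

/* True only if the unsigned gap is exactly one hex string: '<', hex
 * digits, '>'. A stray '>' or any other byte inside the gap fails. */
static bool gap_is_single_hex_string(const unsigned char *file, byte_range br)
{
    size_t start = br.off1 + br.len1;
    size_t end = br.off2;
    size_t i;

    if (end < start + 2 || file[start] != '<' || file[end - 1] != '>')
        return false;
    for (i = start + 1; i < end - 1; i++)
        if (!isxdigit(file[i]))
            return false;
    return true;
}
```

Zero-padding with hex digits passes this check, but the padding bytes themselves remain unsigned, so the check narrows the concern raised above rather than removing it.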
| @paulgardiner I did implement the zeroing solution just now, but I was unable to get acroread to open that file for me. | 19:14.01 |
| could be because I'm running a stone age acroread, or the windows emulation layer or that this approach is not acceptable to acroread. | 19:14.43 |
| more investigation tomorrow. | 19:14.50 |