| 20180711 |
paulgardiner | tor8: seeing just a couple of remaining problems with the forms/annotations rework. I'm not seeing display updates (has_new_ap flag) when signing a document, and bug 698994 is still apparent. | 09:59.46 |
| I like the new signature ap icon, btw | 10:00.12 |
tor8 | paulgardiner: the flower sigil thing? :) | 10:00.42 |
paulgardiner | Yeah | 10:04.24 |
tor8 | I found a nice symbol in the Zapf Dingbats font, that was nice and easy. | 10:08.25 |
paulgardiner | Oh I see. Smart | 10:17.21 |
tor8 | I'm seeing updates when signing the smpapp2009-test.pdf document | 10:19.25 |
paulgardiner | re, what we were discussing yesterday: I realised that when verifying signatures, we have to read the whole file and send it to whatever cryptographic library is in use, so reading the whole file a second time wouldn't necessarily be out of the question, in which case, we could locate the startxref <offset> %%EOF corresponding to any xref section just by scanning. | 10:20.27 |
tor8 | paulgardiner: but you want smart scanning, in case you find "%%EOF" inside stream contents | 10:25.02 |
paulgardiner | Urg! | 10:26.03 |
| I guess so | 10:26.08 |
tor8 | paulgardiner: I'm going to look into the rotated annotations next. | 10:26.33 |
| I think it's because we don't respect the 'NoRotate' flag | 10:26.55 |
paulgardiner | An incorrect match for the whole string startxref <offset> %%EOF with the correct offset would be unlikely, but I guess possible | 10:27.03 |
tor8 | (nor do we set it when creating certain annotation types) | 10:27.05 |
| paulgardiner: a maliciously constructed file could have them; very unlikely in the case of "good" files. | 10:27.40 |
paulgardiner | Let me mull that over. I'm not sure any security breakage can be achieved that way. | 10:29.41 |
tor8 | paulgardiner: can we use the ByteRange in each xref section to guide the search? | 10:31.00 |
paulgardiner | or at least, if it can, then so can just adding an extra startxref <offset> %%EOF to the file outside of any object. | 10:31.05 |
tor8 | paulgardiner: yeah, I guess that's true too | 10:31.19 |
paulgardiner | Oh silly me. Of course we can use the byte range. | 10:31.42 |
| I'm just trying to check the byterange, so I can just walk back from the end | 10:32.06 |
| Thanks tor8. Now you've pointed that out, it seems totally obvious and I don't know how I missed it. | 10:32.56 |
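
The "walk back from the end" idea could be sketched roughly as below. This is only an illustration in Python, not MuPDF code: `startxref_before` and its `window` parameter are hypothetical names, and `end` is assumed to be the end of the signed byte range (offset plus length of the last ByteRange pair). Because the match is anchored to the end of the signed region, a stray "%%EOF" inside stream contents earlier in the file is not picked up.

```python
import re

# Hypothetical helper for illustration; not part of MuPDF.
# Matches "startxref <offset> %%EOF" only when it sits at the very end
# of the bytes being examined (trailing whitespace allowed).
_TAIL = re.compile(rb"startxref\s+(\d+)\s+%%EOF\s*$")

def startxref_before(data: bytes, end: int, window: int = 1024):
    """Return the startxref offset whose %%EOF closes data[:end], else None."""
    tail = data[max(0, end - window):end]
    match = _TAIL.search(tail)
    return int(match.group(1)) if match else None
```

Calling it with `end` set to the end of each signature's byte range would give the xref offset for that revision without scanning the whole file.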
| I'm still not sure what can and cannot be achieved maliciously and how to spot that. | 10:34.57 |
| I suppose what we should really be checking is that, for the xref section containing the signature, all other objects it references are within the byte range. | 10:37.36 |
| ... plus the xref itself is in the byte range | 10:38.12 |
| ... and the trailer. | 10:38.26 |
| Is that the same as "all objects start in the byte range and the one with the highest offset also ends in the byte range"? | 10:40.30 |
tor8 | paulgardiner: where it ends could be a problem... | 10:40.57 |
paulgardiner | ... or can you maliciously create objects that overlap | 10:41.04 |
tor8 | you could create an object with an offset that's inside the content stream of another | 10:41.25 |
paulgardiner | So we'd need to check start and end of every object | 10:41.55 |
| That may be not that much overhead above having to send the whole file to the cryptographic library. | 10:42.49 |
| objects do have a length? You don't have to parse them to find the length. | 10:43.16 |
tor8 | paulgardiner: are you trying to verify that an old signed version's byteranges cover the whole file? | 10:43.33 |
| paulgardiner: they do not, they only have a start offset | 10:43.45 |
| if you take the min and max of the byterange ranges, and make a new PDF from those bytes, you could verify that? | 10:44.19 |
Robin_Watts | Isn't the condition "the byteranges have to cover the entire file except for the range where the checksum is written" ? | 10:44.54 |
tor8 | having signed xref sections that point outside the file, well, they're still signed and you still have to trust the signer in the first place | 10:45.12 |
paulgardiner | That is the condition, but the file may have grown since the signing. | 10:45.24 |
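
The condition being discussed here (the byte ranges cover everything except the gap where the signature is written, while allowing for the file having grown since) might be classified along these lines. A minimal sketch, assuming the usual four-number ByteRange `[o1, l1, o2, l2]`; the function name and the three result labels are made up for illustration, and the sketch does not check that the gap is exactly the signature's Contents value.

```python
def byte_range_coverage(byte_range, file_size):
    """Classify a signature ByteRange of the form [o1, l1, o2, l2].

    Hypothetical helper; the returned labels are illustrative only.
    """
    if len(byte_range) != 4:
        return "invalid"
    o1, l1, o2, l2 = byte_range
    # The first range must start at byte 0 and the second must not
    # overlap it; the space between them is the signature gap.
    if o1 != 0 or l1 < 0 or l2 < 0 or o2 < o1 + l1:
        return "invalid"
    end = o2 + l2
    if end > file_size:
        return "invalid"
    if end == file_size:
        return "covers-whole-file"
    # The file has grown since signing (an incremental update followed).
    return "covers-earlier-revision"
```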
tor8 | paulgardiner: could you restate the nature of your problem, what are you trying to accomplish with this? | 10:46.37 |
| is it the "this signature matches an old version" vs "this signature matches the entire file as of now" | 10:47.26 |
paulgardiner | Depends what you mean by matches | 10:47.57 |
tor8 | I guess that's what I'm asking :) | 10:48.08 |
paulgardiner | We currently can check whether the digest for a signature matches the hash of the bytes from the byte range. We can do that whether or not the signature was added in the most recent incremental update or not. Further, I wish to check that the byterange was not constructed maliciously not to cover all the data it should have. | 10:49.57 |
| In some way, I'm not sure why this is needed, but AR seems to be doing it. | 10:50.23 |
Robin_Watts | Suppose I have a PDF with a contract in it. | 10:50.47 |
tor8 | paulgardiner: I don't think you can do that reliably for anything but the latest version, not without making a copy of the byteranges and checking that file for validity | 10:50.49 |
| and checking for validity is something we're intentionally very lax about, wanting to cope with lots and lots of broken files | 10:51.02 |
paulgardiner | If the users ensure that they only use their signatures to sign documents with trusted software, they should be fine. | 10:51.03 |
Robin_Watts | I could "sign" that in a way that actually just guaranteed that the top 100 bytes of the file weren't changed. | 10:51.09 |
| Then I could go around changing the contents after the first 100 bytes (hence the actual contents of the contract), and the file would still appear to be signed. | 10:51.46 |
paulgardiner | Robin_Watts: yes. It would rely on someone exposing their key to malicious software though. | 10:52.14 |
tor8 | paulgardiner: I think saying "this used to match, do you want to view what was actually signed?" and saving a copy using the byteranges that were signed should be good enough | 10:52.37 |
Robin_Watts | But I guess the intent is that when *other* people sign a document, what *they* sign can't be changed, so they'd need to be signing with duff software. | 10:52.42 |
paulgardiner | tor8: this is independent of showing the user the old version of the doc. | 10:53.02 |
| Robin_Watts: yes duff or malicious software. | 10:53.29 |
Robin_Watts | paulgardiner: How do you know AR is doing it ? | 10:53.37 |
paulgardiner | Robin_Watts: AR complains about the byte range of a document signed by mupdf... seemingly for no good reason as it happens. | 10:54.13 |
| tor8: we can already validate old signatures in terms of cryptographic correctness. I guess in a way we are copying the byte range in that we are piping the bytes to the crypto lib | 10:55.15 |
tor8 | paulgardiner: I was just thinking of showing it without any subsequent edits, since that would be very useful when you want to be sure to view what was actually signed | 10:56.09 |
paulgardiner | tor8: I don't see why we cannot also check that everything referred to by the xref section in question lies within the byterange, albeit perhaps only in an inefficient way | 10:56.15 |
Robin_Watts | It seems to me that if I sign a document A.pdf, to get A_signed.pdf, and then someone extends that document to get B.pdf, and then checks to see if B.pdf is signed, the answer should be "no, but a previous version was"? | 10:56.15 |
paulgardiner | tor8: yes that would be useful, but not what I'm looking at just now. | 10:56.36 |
tor8 | "no, but a previous version was. would you like to see it?" would be the useful question. | 10:56.47 |
| paulgardiner: then what are you looking at just now? I've confused myself enough to have lost track of the original question. | 10:57.18 |
| (or you can see my answer posted at 11:50 your time zone) | 10:58.19 |
Robin_Watts | The problem with what I've just described though is that if person A signs A.pdf, to get A_signed.pdf, and then person B signs it too to get A_signed2.pdf. when you check signatures you'd be told that person A signed an old version, which isn't helpful. | 10:58.21 |
paulgardiner | Robin_Watts: yes, that is another thing that we need to show is that the signature relates to an early version of the document... although we may want to special case when the only changes are addition of more signatures | 10:58.34 |
| I have a grocery delivery to put away. biab | 10:59.32 |
| tor8: my post 10:49.57 states what I'm trying to achieve. | 11:10.13 |
tor8 | paulgardiner: right. I think the easiest way to do that is to check that the %%EOF that's actually used for that version is at the end of the byterange (and the file has no more at the end) | 11:14.25 |
| the only way to do that now, I think, is to save a copy of the file from the byteranges and open that and verify it there | 11:14.48 |
| (in order to check anything other than the most recent version, that is) | 11:15.06 |
| since reopening from the byteranges will scan the old trailer and startxref, etc. and if maliciously created, then that should fail. | 11:15.40 |
| though I suppose it's possible to create a malicious file that has object xref offsets outside the file (i.e. it's broken but anyone will just ignore the bad bits), that can be signed, and then when viewed with an incremental update later will now have the data for those objects | 11:17.07 |
| which could change the appearance, and still be valid when signed | 11:17.28 |
| this is just such a security nightmare. you really need to trust every step in the chain due to how flexible PDF is... | 11:18.04 |
paulgardiner | Yes. That is why I was proposing checking that the objects lie within the byterange | 11:18.13 |
tor8 | they should've just stuck to PGP signing the whole file, *outside* the file | 11:18.49 |
| embedding signatures in the file itself, with an editable file format, that edits itself when signing ... that's just crazy design. | 11:19.09 |
| but it's too late for that now... we have to live with adobe's mess. | 11:19.31 |
| paulgardiner: it might be enough to check that the xref and the object offsets lie within the byterange | 11:20.28 |
paulgardiner | Yeah, that's what I was suggesting | 11:20.41 |
tor8 | and bugger them to hell if they expect anything *actually* secure from PDF :) | 11:20.57 |
| I mean, you gotta draw the line of trust somewhere. if you trust their signature, do you also trust that they used decent software and are not trying to trick you? | 11:21.35 |
| I mean, you trust their signature in the first place... | 11:21.45 |
| we're hopefully only trying to detect man in the middle attacks, not malicious first parties. | 11:22.12 |
paulgardiner | ... but then, as you mentioned I think, an object may lie within the byterange, but have an indirect reference to an invalidly large numbered object that gets filled in by an addition to the file | 11:22.49 |
tor8 | paulgardiner: that too. | 11:23.04 |
| there's just so much flexibility to do crooked things if you're the author | 11:23.24 |
paulgardiner | I guess we could spot that. When viewing normally, we could disallow use of indirect references outside the xref for the object in question. | 11:24.22 |
tor8 | paulgardiner: disallow how? you mean for viewing an old version? | 11:25.01 |
paulgardiner | No, I mean in normal use | 11:25.14 |
tor8 | then I don't understand... if it's outside the xref we can't find it (and it resolves to null) | 11:25.55 |
paulgardiner | Whenever we look up an indirect reference, we start the search down our array of xref sections from the one that defined the object | 11:25.58 |
tor8 | how would 'disallowing' differ? | 11:26.02 |
paulgardiner | At the moment we search for indirect objects within the whole list of xrefs, independently of where in the hierarchy the object we are processing occurred | 11:28.01 |
| the containing object I mean | 11:28.19 |
| At verification time we could check that the set of objects defined by the xref is closed under reference, but that would require parsing all of them | 11:30.03 |
| In the absence of that, I'm suggesting that the viewer could refuse to display any objects not satisfying the closure condition. | 11:30.52 |
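
The "closed under reference" condition can be pictured with a small sketch. It assumes the objects have already been parsed (the expensive part mentioned above) into a map from object number to the object numbers it references; `defined` is the set of object numbers defined by the xref section in question together with its Prev chain. Both names are hypothetical.

```python
def unclosed_references(defined, refs_by_object):
    """Return (holder, target) pairs where an object in `defined`
    refers to an object number that `defined` does not contain.

    Hypothetical helper: `refs_by_object` maps an object number to the
    set of object numbers it references indirectly.
    """
    return sorted(
        (holder, target)
        for holder in defined
        for target in refs_by_object.get(holder, ())
        if target not in defined
    )
```

An empty result means the set is closed; any pair returned is a reference that a later incremental update could fill in.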
| Perhaps it's reasonable to expect users not to expose their signing keys to malicious software, but now I'm worried that we might be open to problems with maliciously created files pre-signing. | 11:36.33 |
tor8 | paulgardiner: "refuse to display" ... then you must be talking about showing older versions, otherwise there's nothing to refuse to display because it doesn't exist in the file, right? | 11:37.45 |
| does acrobat allow you to sign a maliciously created file, with invalid indirect references? | 11:38.53 |
| because I think that's the problem we might be trying to solve here -- allowing a malicious file to be unwittingly signed | 11:39.36 |
| the signing code writes the byterange arrays to cover the whole file at the point of signing | 11:39.55 |
| IIRC | 11:39.59 |
paulgardiner | tor8: not talking about display older versions. Talking about display the current version, but parts of it that were defined as part of an earlier version | 11:41.44 |
| I'm not sure where the "ing"s went in that sentence | 11:42.15 |
tor8 | oh right! I see what you mean. detecting invalid forward references. | 11:42.21 |
paulgardiner | tor8: yeah. That's a much better term for it | 11:42.37 |
tor8 | that would still mean we need to detect the %%EOF correctly for all versions :( | 11:43.01 |
paulgardiner | I don't think that's true. The type of forward we need to disallow is reference to an object numbered above the end of the xref. I don't think it matters where in the file because we check that at verify time. | 11:45.35 |
tor8 | paulgardiner: there are two possible problems to look for. | 11:46.50 |
| one is as you say, the indirect reference object number itself | 11:47.02 |
| the second is where the xref has an offset that points beyond the (previous) %%EOF | 11:47.19 |
| the object number is valid, but it exists beyond the end of the original file | 11:47.39 |
paulgardiner | But that second one we have been discussing checking at verify time. | 11:47.52 |
tor8 | and then a future update adds in the object data at the correct file offset | 11:47.58 |
paulgardiner | Checking it is within the byterange that is | 11:48.33 |
tor8 | personally, this whole idea of verifying stuff worries me | 11:49.05 |
| there will be cases we don't cover... I'd be more comfortable saying "we don't do that at all" than try to instill some sense of safety that just isn't there | 11:49.26 |
paulgardiner | But I think we may be getting close to something feasible. | 11:49.49 |
| ... not that we necessarily choose to do it even if it is, but still... | 11:50.13 |
| At verify time, for a signature referenced from a specific xref, we check that the byterange contains the xref itself and all the objects to which it refers (recursively down the prev list too). And while viewing we avoid displaying forward references in terms of object number. | 11:52.17 |
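
A rough sketch of the verify-time check described above, under stated assumptions: `object_offsets` is a hypothetical map from object number to the file offset recorded in the xref (including entries reached through the Prev chain), and `xref_offset` is where the xref section itself starts. Only start offsets are tested, since, as noted earlier in the discussion, xref entries carry no length.

```python
def offsets_outside_byte_range(byte_range, xref_offset, object_offsets):
    """List what falls outside the signed ranges: "xref" and/or object numbers.

    Hypothetical helper; byte_range is a flat list of (offset, length) pairs.
    """
    ranges = [
        (byte_range[i], byte_range[i] + byte_range[i + 1])
        for i in range(0, len(byte_range) - 1, 2)
    ]

    def covered(offset):
        return any(start <= offset < end for start, end in ranges)

    bad = []
    if not covered(xref_offset):
        bad.append("xref")
    bad.extend(num for num, off in sorted(object_offsets.items()) if not covered(off))
    return bad
```

An object whose offset lies beyond the signed bytes (or inside the signature gap) shows up in the result, which is the case where a later update could supply the object data.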
fredross-perry | tor8, sebras - I am working on displaying alert messages. When I try to create a Java string from the alert message using (*env)->NewStringUTF(env, alert->message), it crashes, telling me "input is not valid Modified UTF-8". The text of the message looks like this: | 18:36.19 |
| The date/time entered (jjjjjj) does not match the format (mm/dd/yyyy) of the field [ BIRTHMMDDYYYY\x03\x03\x035��\x8a��#\xb2\xae�Ï\x91\x01,O ] | 18:36.19 |
| BIRTHMMDDYYYY... is coming from event.target.name, seems like, looking at util.js. | 18:36.20 |
| Not sure if the document's at fault, or what. | 18:36.21 |
| maybe a string is not getting terminated properly somewhere along the line. | 18:36.56 |
| The field name coming from pdf_field_name() is broken. If I run the doc through 'mutool clean' so I can see things uncompressed, the name looks ok and also does not cause the error. So I'm suspecting the document, or our handling of the field. when the doc is loaded. | 19:36.24 |
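
For reference, the kind of input that `NewStringUTF` rejects can be illustrated with a conservative check. This is a sketch in Python rather than JNI code, and the sample bytes in the usage note are made up (the real bytes are not recoverable from the log). It accepts only strings that are strict UTF-8 with no NUL and no characters above U+FFFF, since Modified UTF-8 encodes those two cases differently from standard UTF-8.

```python
def safe_for_new_string_utf(raw: bytes) -> bool:
    """Conservative test for bytes that are valid as Modified UTF-8.

    Hypothetical helper: it may reject some valid Modified UTF-8
    (e.g. the two-byte NUL form), but never accepts invalid input.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # U+0000 and characters above U+FFFF are encoded differently in
    # Modified UTF-8, so their standard UTF-8 forms are not accepted.
    return all(0 < ord(ch) <= 0xFFFF for ch in text)
```

With this, `safe_for_new_string_utf(b"BIRTHMMDDYYYY")` is true, while a name followed by stray bytes such as `b"BIRTHMMDDYYYY\x03\x03\x035\xff\x8a"` is not, which is consistent with an unterminated or over-long field name being the cause.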