MuPDF IRC logs

	<<<Back 1 day (to 2018/01/31)	20180201
paulgardiner	Anyone had a chance to review my signature support reorganisation?	11:35.34
tor8	paulgardiner: sorry, no. I'll look now.	12:22.41
paulgardiner	Ta	12:22.55
tor8	paulgardiner: the "SignatureError" type and enums aren't prefixed (and use weird camel case)	12:25.08
paulgardiner	No problem. What would you prefer?	12:25.46
tor8	Robin_Watts likes typedeffing enums, I'm not so sure I like that (I prefer an explicit "enum pdf_signature_error" when using the types to clarify that it is actually an enum	12:25.58
	and the verbose and uppercasey PDF_SIGNATURE_ERROR_OKAY, PDF_SIGNATURE_ERROR_NO_SIGNATURES, etc	12:26.15
Robin_Watts	Hmm. Not sure I'd characterise my thoughts entirely like that. Using #defines when you should be using enums is bad.	12:27.28
	constants should probably be all in upper case as tor8 says.	12:27.45
tor8	Robin_Watts: I mean you like "typedef enum { } my_enum_type;" and then "my_enum_type x;" where I prefer using explicit enum keyword "enum my_enum_type x"	12:28.15
Robin_Watts	And I guess I do prefer typedef enum blah, purely cos it means I don't have to type 'enum' everywhere else.	12:28.19
tor8	I find myself wondering whether a typedef is an enum or a struct...	12:28.45
Robin_Watts	tor8: shouldn't matter. It's a type.	12:28.57
	A type is a type is a type.	12:29.01
tor8	then again, in my non-mupdf code I generally don't even typedef structs :)	12:29.06
	Robin_Watts: I find knowing whether it's a composite type, a pointer type, or a value type to be useful	12:29.36
	I hate when people typedef away the * in a pointer type	12:29.45
Robin_Watts	tor8: Never typedef pointers.	12:29.48
tor8	and typedeffing away enum and struct is a lesser offense of the same nature	12:30.01
Robin_Watts	(except possibly function pointers)	12:30.33
tor8	now, if typedefs had actually been considered separate types, I would be of a different opinion, but when they're just unchecked aliases, bah	12:30.34
Robin_Watts	(actually, looks like we avoid typedeffing funttion pointers too. good)	12:31.31
tor8	yes, our function pointer typedefs don't include the star.	12:32.44
Robin_Watts	If you never typedef pointers, then it's easy to tell pointer types.	12:33.07
tor8	Robin_Watts: too bad freetype typedefs pointers	12:33.23
Robin_Watts	And composite types ARE value types.	12:33.23
	tor8: yeah :(	12:33.28
paulgardiner	I think this will have to be a fixup commit. I don't think it'll rebase back through the branch nicely. Besides that, no problem.	12:33.45
tor8	paulgardiner: a fixup commit is fine	12:33.59
	paulgardiner: my second question whether to change the namespace for the helper pkcs7 functions to 'mu' or 'fz' still stands	12:35.12
	ah, it takes a pdf_document so the 'pdf' namespace is probably good	12:35.41
paulgardiner	I'm easy. Whatever you prefer.	12:35.45
tor8	pkcs7_opensll_drop_designated_name shouldn't that be openssl in the middle?	12:36.20
sebras	three commits on sebras/master	12:36.20
paulgardiner	As a helper, I wonder whether to drop it completely and have pkcs7, but then I guess we're worried about symbol conflict	12:36.28
	Oh yeah: typo. Oops	12:36.53
tor8	you seem to have gone halfway there in pkcs7-openssl.h	12:38.00
	some have pdf_pkcs7_ others are pkcs7_	12:38.09
	I suspect we'll be fine with pkcs7_ only for the helper library	12:38.38
paulgardiner	File name and functions?	12:38.51
tor8	ah, I can't read properly. the type "pdf_pkcs7_signer" is the one with the prefix still. never mind my ramblings.	12:39.57
	in pdf_print_designated_name you use the sizeof(part)/sizeof(*part) idiom	12:40.28
	we have a macro for that: nelem()	12:40.32
paulgardiner	Oh yes. pdf_pkcs7_signer is defined in the main body of mupdf.	12:40.37
tor8	paulgardiner: yeah. I just realized that :)	12:40.47
paulgardiner	So that should keep the pdf_ prefix, but drop it for the functions of the helper?	12:41.29
	In pkcs7-check.c, I kept the pdf_ prefix on pdf_check_signature because that used to be a function of the mupdf API. Several apps call it.	12:46.00
sebras	tor8: Robin_Watts: a few commits on sebras/master up for review.	13:14.36
paulgardiner	tor8: commit pushed. I think I've addressed those points.	13:23.54
sebras	I have now found a peculiar source of recursion in mupdf.	14:19.21
	from pdf_init_document() I'm looking up object 1, according to the xref this object is placed in a object stream inside object 2	14:19.59
	object to has a well-defined location, but there is a /F entry that supposedly specifies what filters are needed to decompress the object stream inside of which I expect to find object 1	14:20.56
	now, this /F entry is indrect and points to object 1 0 R...	14:21.13
	cue the recursion and running out of stack! :)	14:21.29
tor8	paulgardiner: maybe we should have a null pdf_check_signature function that just throws "No digital signing support in this build" if built without openssl	14:22.31
	and we won't need the #ifdef HAVE_LIBCRYPTO in pdfsign.c (and supposedly elsewhere too)	14:22.44
	oh, and pdf_print_designated_name is probably better named pdf_format_designated_name	14:23.52
Robin_Watts	tor8: It would be nicest if the signature setting stuff sat behind an interface in the same way as the cmm stuff does.	14:24.58
	So people can fz_set_signature_engine(ctx, &engine);	14:25.22
tor8	Robin_Watts: depends on how forward facing the signature checking functions are	14:25.44
	the color management is pervasive and happens way down in the user-invisible guts	14:25.58
	if the signing functions are all called from user code, I don't see much benefit in stashing it in a global (our context)	14:26.25
	now I can see the value of having a pdf_check_signature(ctx, &engine, ...)	14:26.50
Robin_Watts	tor8: Depends if we end up with different implementations based on different libs.	14:27.25
tor8	which would mean the helper/pkcs7-check.c would be generic and use callbacks to the engine struct	14:27.30
	and we'd provide a null and an openssl-based engine	14:27.46
Robin_Watts	yes.	14:27.53
tor8	if we build with the helper library	14:27.58
Robin_Watts	If the helper lib is completely separable (i.e. the core doesn't depend on it) then I'd agree.	14:28.24
tor8	right now the pdf_check_signature function calls pkcs7_openssl_* directly	14:28.29
	we could rearrange that to go via callbacks	14:28.35
Robin_Watts	If we need to have #ifdef HAVE_LIBCRYPTO anywhere in the core, then that argues for putting it behind an interface now. IMAO.	14:28.54
tor8	but none of that should prevent the current code from going in; this stuff looks to be trivial to adapt to that shape	14:28.59
	Robin_Watts: this is why I suggested buliding a null-pdf_check_signatures so that we don't need the #ifdef in the caller code	14:29.37
Robin_Watts	That would be enough to make me happy.	14:29.50
tor8	and eventually if we add more implementations we can rearrange the shape of things to be callback-based on an engine struct	14:30.18
	maybe mupdf-java can use JNI calls to do the checking using whatever java provides	14:30.52
Robin_Watts	tor8: That seems like a laudable aim.	14:31.14
tor8	and iOS can use whatever securityframework bollocks apple is faffing about with	14:31.14
paulgardiner	Hmmm. It's shame to have a null pdf_check_signature as part of the mupdf library now that it's completely free of openssl. pdf_check_signature isn't actually needed at all any more because there are functions to get both digest and the stream of bytes. You just need those and something that can do pkcs7 verification. I kept pdf_check_signature just to help app writers with the transition.	14:37.54
	Added a pdf_check_signature helper, I should say. It's not part of the library any longer	14:38.53
	Maybe the helper itself could have two branches (with and without HAVE_LIBCRYPTO). So the apps would not need #ifdefs, but that helper would.	14:40.15
	tor8: come to think of it, perhaps that is what you were suggesting.	14:41.00
	Just read a bit of the above I missed. I considered making pdf_check_signature generic, but it's an unnecessary complication for the verification case, because all that is involved is getting the digest and stream and passing them to the engine. Adding a callback mechanism for that seems OTT	14:47.29
	I'd prefer pdf_check_signature to be example code really.	14:48.16
sebras	tor8: I need some guidance in how to resolve the recursion.	15:03.38
	tor8: I would have used pdf_mark_obj() as normal, but the thing is that the object I'm looking for doesn't exist yet.	15:04.07
	and when I'm opening the stream of the object stream I no longer know the object I'm looking for (the higher level functions know).	15:04.40
tor8	paulgardiner: yes, that's what I was suggesting (having the #ifdef in the helper)	15:12.33
paulgardiner	Yeah. That's much better. Just doing it now	15:12.56
tor8	paulgardiner: fair enough in terms of wanting to keep the pdf_check_signature helper as example code	15:13.00
	and as you say, it is a really simple function with just a lot of string formatting of error codes :)	15:13.40
	sebras: oh.... yeah.	15:49.49
	pdf_mark/unmark_obj is how we usually avoid cycles when loading stuff	15:50.03
sebras	tor8: never mind, I found a solution.	15:50.04
	I pdf_mark/unmark the object stream	15:50.14
	O	15:50.19
	I'm preparing a patch now.	15:50.23
tor8	sorry, your message scrolled off the screen so I missed it first time around	15:50.28
sebras	tor8: no worries.	15:50.45
	tor8: the commits are on sebras/master now.	15:54.09
tor8	sebras: "Bug 698908" could use a comment motivating the += 3 line	16:00.58
	"Bug 698830" has a couple of typos in the commit message, but looks good otherwise	16:02.56
	does " Skip objects inside object streams whose object ids are out of range." have a test file?	16:03.02
sebras	+= 3 I forgot (again)	16:11.35
	there is not test file, no. I created it from the one from 698830	16:12.04
tor8	sebras: okay, just curious (since it ought to be impossible)	16:12.37
	anyway, if you fix the comment and commit message all on sebras/master LGTM if it clusters well	16:12.55
sebras	tor8: why would it be impossible?	16:13.00
tor8	sebras: oh, there's a whitespace error too in that commit "found<space><space>=0"	16:17.18
	sebras: it would be impossible because I forgot the syntax for object streams :)	16:18.17
	they're actual arrays of object numbers, not base+count	16:18.27
sebras	yeah, that's true.	16:18.36
	base + count would have made more sense I guess.	16:18.45
tor8	that would be more like non-object streams	16:18.54
	but hey, it's a hacky addon by adobe. what's to be expected? :)	16:19.09
sebras	tor8: better?	16:21.28
	updated.	16:21.33
	it clustered well.	16:23.53
paulgardiner	tor8: I've added that null pdf_check_signature functions as suggested - a separate commit. I've also changed "print" to "format", rolled into the last of the existing commits.	16:37.26
sebras	tor8: what was the reason behind commit 7350d67f358d4f04643f43003861290ab162eaec ?	19:17.19
	I have a case where a document calls pdf_load_stream() with ref == an entry in a dictionary inside a normal object.	19:22.11
	now pdf_is_stream() will walk from the entry to the parent object and claim that this is a stream (it is).	19:22.38
	but pdf_get_indirect_document() and pdf_to_num() will return NULL and 0 respectively.	19:23.02
	eventually we try to use the document pointer which is null. :-/	19:23.39
	I'm sensing that the is an inconsistency between pdf_is_stream() and pdf_get_indirect_document/pdf_to_num() in how it takes the parent objects into account.	19:25.19
tor8	sebras: somewhere we had a resolved indirect object	21:40.05
sebras	tor8: you mean unresolved, right?	21:41.49
	tor8: I ended up with a minor patch to not dereference the document pointer if it is NULL, but that seems fairly fragile.	21:42.23
	tor8: I would like to avoid even calling stuff with doc == NULL.	21:42.42
	I wasn't aware that pdf_get_indirect_document() would return NULL if your object was something inside a dictionary/array/whatever.	21:43.21
tor8	no, I mean resolved.	22:13.11
	if you have the actual dict object, you have 'lost' the object number and can't find the corresponding stream	22:13.33
	(but then we added the 'pdf_obj_parent_num' field)	22:13.51
sebras	tor8: yes, I agree with that. but I don't understand why it was important to be able to get back to the object number and find the corresponding stream from a dict object entry.	22:41.02
	i.e. I understand how the meachnism works, but not why we want it.	22:41.16
tor8	sebras: some part of the code somewhere had a dereferenced dictionary and wanted to read the stream for it	23:44.00
	can't remember where or how that happened	23:44.05
	it might not be a problem anymore; I went and purged most of the explicit pdf_resolve_obj calls that we had lying around (we do autoresolving now, but we didn't use to)	23:44.38
	Forward 1 day (to 2018/02/02)>>>

Log of #mupdf at irc.freenode.net.