| <<<Back 1 day (to 2017/09/06) | 20170907 |
chrisl | Someone should probably comment on this: https://github.com/ArtifexSoftware/thirdparty-glfw/issues/1 | 09:12.27 |
tor8 | Robin_Watts: one commit on tor/master to allow for the fz_context struct to be opaque in the try/catch macros | 12:34.31 |
| (and it also reduces the code size) | 12:34.36 |
Robin_Watts | tor8: That's 2 function calls, rather than 1. | 13:31.44 |
| How about... | 13:32.00 |
| jmpbuf_t *buf = fz_push_try(ctx); if (buf) { if (fz_setjmp(buf) == 0) do \ | 13:33.06 |
| i.e. we make fz_push_try return NULL or ctx->error->top->buffer rather than 0 or 1 ? | 13:33.37 |
| That gets us opaqueness, and still keeps the number of function calls down. | 13:33.58 |
| And we can use void * if required cos of the nasty types. | 13:34.27 |
tor8 | Robin_Watts: yeah, that would probably work too | 13:35.01 |
| might get warnings about shadowed variables if we nest try-blocks but I think we try to avoid that | 13:35.21 |
Robin_Watts | tor8: We can possibly use buf_##_LINE_ or something as the variable name if we want to. | 13:36.02 |
| OR... | 13:37.13 |
| if (fz_setjmp(fz_push_try(ctx)) == 0) do \ | 13:37.52 |
| no, that doesn't help. | 13:38.18 |
tor8 | checking for overflow after the setjmp doesn't help in terms of number of function calls | 13:38.35 |
| the only thing that would help is throwing a real before fz_try (so we bail to the surrounding fz_try if fz_try itself would fail, rather than to the catch of the try that failed) | 13:39.09 |
Robin_Watts | #define fz_try(ctx) fz_try_var(ctx, buf_##_LINE_) or something. | 13:39.23 |
tor8 | but that would mean going over all our code to make sure it works with that | 13:39.24 |
Robin_Watts | Having fz_push_try return the buffer pointer seems like a fairly small, safe change. WhatCouldPossiblyGoWrong(TM) | 13:40.33 |
tor8 | Robin_Watts: not sure what you mean there | 13:44.59 |
| Robin_Watts: do we support 'break' inside the fz_always block? | 13:46.54 |
| or rather, do we want to | 13:47.03 |
Robin_Watts | no. | 13:47.05 |
| do we want to? dunno. | 13:47.13 |
| We do support it I think. | 13:47.31 |
tor8 | bah, we still need to in order to make fz_always optional | 13:47.32 |
Robin_Watts | We have to... yeah. | 13:47.36 |
tor8 | because catch starts with the 'while (0)' part to allow for fz_try to have a 'do' | 13:47.51 |
| #define fz_try(ctx) { void *fz_env_##__LINE__ = fz_do_try(ctx); if (fz_env_##__LINE__) if (!fz_setjmp(fz_env_##__LINE__)) do | 13:48.00 |
| #define fz_always(ctx) while (0); } if (fz_do_always(ctx)) { do | 13:48.00 |
| #define fz_catch(ctx) while (0); } if (fz_do_catch(ctx)) | 13:48.00 |
Robin_Watts | yeah. | 13:54.09 |
| Having fz_try return a BUF pointer does not avoid us needing to see inside ctx in the catch clause. | 13:59.31 |
| What is the motivation behind this? | 13:59.44 |
tor8 | Robin_Watts: uhm, fz_do_catch hides the peeking... | 14:00.16 |
| to make fz_context insides NOT part of the public API so we can hide that in the internal headers | 14:00.38 |
Robin_Watts | Your version that calls more functions does indeed solve the problem. | 14:00.49 |
| I dislike calling more functions. | 14:00.56 |
| function call overheads aren't huge, but they are measurable. | 14:01.18 |
tor8 | and also to reduce code size. a function call is smaller than the dozen or so pointer chases that the old code needed | 14:01.38 |
| setjmp is overhead enough that nobody ought to use try in an inner loop | 14:02.06 |
| I did experiment with having a local 'int code' that the ctx would store and save us having to call functions | 14:02.43 |
| but then we can't get the nesting of the braces right so that catch can see the local variable | 14:03.00 |
Robin_Watts | yeah. | 14:05.05 |
tor8 | Robin_Watts: updated commit on tor/master with one fewer function call | 14:07.41 |
Robin_Watts | I don't see it. | 14:11.12 |
| http://git.ghostscript.com/?p=user/robin/mupdf.git;a=commitdiff;h=5f515997ad00167e55a6d4c48759ebb36e377856 | 14:11.39 |
| My commit uses just 1 function call more than now, but gets you what you want I think. | 14:14.03 |
tor8 | Robin_Watts: right. something more along the lines of what's on tor/master now then? | 14:24.43 |
| basically your commit with fz_pop_try rather than fz_do_catch | 14:24.57 |
| oh, no, that's not going to work... | 14:32.44 |
| ERR cannot be NULL and then used in the always if-clause | 14:33.47 |
Robin_Watts | oh, so the always clause should check for NULL too ? | 14:38.18 |
tor8 | or something like this: | 14:38.57 |
| #define fz_try(ctx) { fz_error_stack_slot *ERR = fz_push_try(ctx); fz_setjmp(ERR->buffer); if (ERR->code == 0) { do | 14:38.59 |
| #define fz_always(ctx) while (0); } if (ERR->code < 3) { ERR->code++; do | 14:38.59 |
| #define fz_catch(ctx) while (0); } } if (fz_pop_try(ctx)) | 14:38.59 |
| always call setjmp, and test the ERR->code explicitly rather than the return value of setjmp | 14:39.22 |
Robin_Watts | fz_setjmp just calls setjmp. It doesn't assign to ERR->code | 14:40.05 |
tor8 | I'm not convinced the contortions and extra statements in every always-block is worth saving one function call for | 14:40.11 |
Robin_Watts | and I don't see how that solves the ERR == NULL case. | 14:40.23 |
tor8 | push_try assigns ERR->code = 0 (or 2 if there was an overflow) | 14:40.24 |
| and always returns the error slot | 14:40.31 |
Robin_Watts | but when we longjmp, what resets ERR->code ? | 14:41.15 |
tor8 | fz_throw sets ERR->code | 14:41.23 |
Robin_Watts | where? | 14:41.50 |
tor8 | the throw() function in source/fitz/error.c | 14:42.18 |
Robin_Watts | throw, right. | 14:42.22 |
| So you're suggesting never having fz_push_try return NULL. | 14:42.48 |
tor8 | yup. | 14:42.55 |
Robin_Watts | not sure I see how that can be achieved. | 14:43.05 |
tor8 | it just returns ctx->error->top | 14:43.08 |
Robin_Watts | it's the overflow case where we return NULL. | 14:43.26 |
tor8 | previously, in the overflow case we return the last item on the ctx->error stack array | 14:43.50 |
| but we don't call 'throw' to longjmp | 14:43.57 |
Robin_Watts | previously? As in the current code? | 14:44.04 |
tor8 | as in the current code, yes. | 14:44.10 |
| push_try returns 1 if we are to 'fake' an exception for the overflow case | 14:44.25 |
Robin_Watts | We don't. We return 1 from fz_try_push, and we never setjmp. | 14:44.33 |
tor8 | but error->top and error->top->code are set as usual | 14:44.35 |
| yeah, the setjmp is irrelevant here | 14:44.43 |
| since we'll never longjmp to it | 14:44.47 |
| in the overflow case | 14:44.54 |
| just reducing the amount of code for each try-statement | 14:45.02 |
Robin_Watts | but that buffer IS in use by our parent nesting. | 14:45.12 |
tor8 | Robin_Watts: look at tor/master now | 14:47.48 |
| NOTHING has changed effectively in how we use the stack slots | 14:48.38 |
| sure, we call setjmp needlessly for the new stack slot. a jmp_buf that was UNINITIALISED in the current code. | 14:49.25 |
| so throwing in fz_always in our current code would be a bad idea if we were to be in the always because of a stack overflow | 14:50.10 |
Robin_Watts | In the old code, in the case of overflow, we were returning 0 from fz_push_try, and consequently we never called fz_setjmp. | 14:50.29 |
tor8 | Yes. | 14:50.36 |
Robin_Watts | So now we are calling fz_setjmp in the overflow case. I need to convince myself that is not a problem. | 14:50.46 |
tor8 | So the ctx->error->top->buffer would be uninitialized. | 14:50.51 |
| Now we initialize it. | 14:51.00 |
Robin_Watts | ok. if we throw in the always, that's safer. | 14:53.36 |
| I cannot immediately see a problem with that. | 14:54.12 |
| and it satisfies my "don't call more functions than we need to" paranoia. | 14:54.25 |
| so, lgtm, assuming it cluster tests OK. | 14:55.21 |
tor8 | I'd still prefer hiding the fz_error_stack_slot struct too... but this is a first step at least :) | 14:56.40 |
Robin_Watts | int *ERRCODE; void *BUF = fz_try_push(ctx, &ERRCODE); fz_setjmp(BUF); if (*ERRCODE == 0) { do | 14:59.08 |
| That hides the fz_error_stack_slot. | 14:59.24 |
| instead of ERR->code, use *ERRCODE | 14:59.36 |
tor8 | it's more the "ERR->code < 3) ERR->code++" details that bother me about exposing the codes and structs | 15:00.33 |
| but what I'm clusterpsuhing now is good enough for me, it lets us hide fz_context eventually | 15:01.00 |
Robin_Watts | The whole implementation bothers me, but it's the best I could come up with! | 15:01.10 |
tor8 | That's why I wanted to hide it behind functions! :) | 15:02.28 |
Robin_Watts | but... speed! :) | 15:02.56 |
tor8 | irrelevant :) | 15:03.09 |
Robin_Watts | You say that now, but when our customers complain... | 15:03.23 |
tor8 | then let's measure and compare! | 15:03.36 |
Robin_Watts | tor8: On a different subject... | 15:12.46 |
| in the pdf interpreter, we call pdf_begin_group/end_group around shades, images, fills, and strokes. | 15:13.11 |
| we don't appear to do the same around text. | 15:13.17 |
| oh, we do, I see it now. | 15:13.47 |
| So, next question... | 15:14.14 |
| In pdf, if overprint is enabled, then blendmode is updated to be 'CompatibleOverprint' for shades/images/fill/strokes/text. | 15:14.55 |
| Is that knowledge that should be pdf specific, or are we happy for it to be fitz specific? | 15:15.21 |
tor8 | I could go either way, whichever ends up cleaner in the code | 15:15.49 |
Robin_Watts | i.e. do I spot it in the pdf interpreter, and alter the blendmode it uses for the group? Or do I spot in in fz_begin_group and change it there? | 15:16.00 |
| I think doing it in fz_begin_group will be nicer. | 15:16.10 |
tor8 | so I think putting it in with the other blend modes in the fz_blendmode thing would be fine | 15:16.13 |
Robin_Watts | (actually in fz_draw_begin_group) | 15:16.18 |
| putting it in the fz_blendmode thing means we need another bit in the enum. That may or may not be problematic - haven't looked at how the list device packs stuff yet. | 15:17.01 |
tor8 | ah, you mean how it's automatically triggered | 15:17.07 |
Robin_Watts | tor8: yeah, I'm tempted to pickle this in as standard behaviour of the draw device. | 15:17.38 |
tor8 | I would imagine it needs to be propagated to the draw device through the display list if we make it one of the public blend modes | 15:17.45 |
| I'd be okay with pickling it into the draw device and hiding it from outside view | 15:18.01 |
Robin_Watts | If we ever have to cope with another format with groups with blend modes that doesn't behave the same, we can revisit the idea then. | 15:18.19 |
| but I think that's unlikely. | 15:18.32 |
tor8 | extremely unlikely | 15:18.58 |
| so, measuring the two approaches (the code that is on tor/master now) and one that calls 4 functions for each try/catch hiding everything from sight | 15:21.32 |
| wall clock time, there's no appreciable difference (timings from run to run vary more than between the versions) | 15:22.03 |
| with cachegrind rendering pdfref17.pdf completely, local-stack-slot runs 27,645,422,624 instructions, and many-funs runs 27,551,268,880 | 15:22.54 |
| that's odd, how can it be so many fewer? | 15:23.03 |
Robin_Watts | no idea. Never played with cachegrind. | 15:34.46 |
| tor8: Urgh. begin_group doesn't get color_params, and I need color_params to decide whether overprint is enabled or not. | 15:37.36 |
| So either I need to pass color_params, or I need to explicitly pass CompatibleOverprint | 15:38.02 |
| balls. I only have 6 bits in the list device for flags. | 15:48.01 |
| blendmode is currently 4 bits + 1 for isolated + 1 for knockout | 15:48.21 |
| so expanding blendmode to 5 bits is no good. | 15:48.32 |
| sorted. | 15:52.21 |
| Forward 1 day (to 2017/09/08)>>> | |