MuPDF handles all its errors using an exception system. This is superficially similar to C++ exceptions, but (as MuPDF is written in C) it is implemented using macros that wrap the setjmp/longjmp standard C functions.
It is probably best not to peek behind the curtain, and just to think of these constructs as being extensions to the language. Indeed, we have worked very hard to ensure that the complexities involved are minimised.
Unless otherwise specified, all MuPDF API functions can throw exceptions, and should therefore be called within a fz_try/fz_always/fz_catch construct.
Specific functions that never throw exceptions include all those named fz_keep_..., fz_drop_... and fz_free. This, coupled with the fact that all such ‘destructor’ functions will silently accept a NULL argument, makes the fz_always block an excellent place to clean up resources used throughout processing.
The general anatomy of such a construct is as follows:
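A minimal sketch of that shape is given here. So that it compiles without MuPDF installed, it defines simplified stand-ins for fz_context, fz_try, fz_always, fz_catch and fz_throw; these stand-ins, the run_construct function and its trace string are assumptions made purely for illustration, not MuPDF's real definitions. Real code would include mupdf/fitz.h instead and keep only the shape of the function body.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

/* ASSUMPTION: simplified stand-ins so this sketch builds without MuPDF.
 * Real code would #include <mupdf/fitz.h> and use the library's versions. */
typedef struct { jmp_buf env; int thrown; } fz_context;
enum { FZ_ERROR_GENERIC = 1 };
#define fz_try(ctx)    (ctx)->thrown = 0; if (setjmp((ctx)->env) == 0) { do {
#define fz_always(ctx) } while (0); } { do {
#define fz_catch(ctx)  } while (0); } if ((ctx)->thrown)
#define fz_throw(ctx, code, ...) \
    ((void)(code), (ctx)->thrown = 1, longjmp((ctx)->env, 1))

/* Hypothetical helper: appends one letter per step so the order is visible. */
static void run_construct(fz_context *ctx, int fail, char *trace)
{
    assert(ctx != NULL && trace != NULL);

    fz_try(ctx)
    {
        /* Anything that might throw an exception goes here. */
        strcat(trace, "T");
        if (fail)
            fz_throw(ctx, FZ_ERROR_GENERIC, "simulated failure");
        strcat(trace, "t");
    }
    fz_always(ctx)
    {
        /* Always executed, whether or not an exception was thrown. */
        strcat(trace, "A");
    }
    fz_catch(ctx)
    {
        /* Executed only if an exception was thrown. */
        strcat(trace, "C");
    }
}
```

Called with fail set to 0 and an empty trace buffer, the trace becomes "TtA"; with fail set to 1 it becomes "TAC", showing that the rest of the fz_try block is skipped while fz_always still runs.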
The fz_always block is completely optional. A construct consisting of just an fz_try block followed immediately by an fz_catch block is perfectly valid.
In an ideal world, that would be all there is to it. Unfortunately, there are two wrinkles.
The first one, relatively simple, is that you must not return from within a fz_try block. Doing so will corrupt the exception stack and cause problems and crashes. Instead, you can safely break out of the fz_try block: execution will pass into the fz_always block if there is one, or continue after the fz_catch block if not.
Similarly, you can break out of a fz_always block, and execution will correctly pass into or after the fz_catch block as appropriate, but this is less useful in practice.
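The early-exit rule can be sketched as follows. As before, the fz_context, fz_try, fz_always and fz_catch definitions at the top are simplified stand-ins (an assumption, so that the sketch compiles without MuPDF), and the process function with its trace string is hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

/* ASSUMPTION: simplified stand-ins so this sketch builds without MuPDF.
 * Real code would #include <mupdf/fitz.h> and use the library's versions. */
typedef struct { jmp_buf env; int thrown; } fz_context;
#define fz_try(ctx)    (ctx)->thrown = 0; if (setjmp((ctx)->env) == 0) { do {
#define fz_always(ctx) } while (0); } { do {
#define fz_catch(ctx)  } while (0); } if ((ctx)->thrown)

/* Hypothetical helper: leaves the fz_try block early when there is no work. */
static void process(fz_context *ctx, int nothing_to_do, char *trace)
{
    assert(ctx != NULL && trace != NULL);

    fz_try(ctx)
    {
        strcat(trace, "T");
        if (nothing_to_do)
            break; /* Safe early exit: execution continues at fz_always. */
        /* A 'return' here instead would corrupt the exception stack. */
        strcat(trace, "w");
    }
    fz_always(ctx)
    {
        strcat(trace, "A");
    }
    fz_catch(ctx)
    {
        strcat(trace, "C");
    }
    strcat(trace, "E"); /* Reached after the construct in both cases. */
}
```

With nothing_to_do set to 1 the trace is "TAE"; with it set to 0 the trace is "TwAE". Note that a break nested inside a loop or switch within the fz_try block leaves only that loop or switch, as usual in C, so it has to sit at the top level of the block to have this effect.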
The second one is more convoluted. If you do not wish to understand the long and complex reasons behind it, skip the following subsection and just read the corrected example that follows. As long as you follow the rules given in the summary at the end, you will be fine.
As stated before, fz_try/fz_catch are implemented using setjmp/longjmp, and these can ‘lose’ changes to variables.
For example:
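What follows is a sketch of such code, reconstructed from the walkthrough in the paragraphs after it (walls w, roof r, house h). The stand-in definitions at the top, the struct layouts, the fail_walls and fail_roof switches and the name build_house are all assumptions, added so that the sketch compiles without MuPDF; real code would include mupdf/fitz.h instead. It is deliberately written without any protection for w and r.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* ASSUMPTION: simplified stand-ins so this sketch builds without MuPDF.
 * Real code would #include <mupdf/fitz.h> and use the library's versions. */
typedef struct { jmp_buf env; int thrown; } fz_context;
enum { FZ_ERROR_GENERIC = 1 };
#define fz_try(ctx)    (ctx)->thrown = 0; if (setjmp((ctx)->env) == 0) { do {
#define fz_always(ctx) } while (0); } { do {
#define fz_catch(ctx)  } while (0); } if ((ctx)->thrown)
#define fz_throw(ctx, code, ...) \
    ((void)(code), (ctx)->thrown = 1, longjmp((ctx)->env, 1))

/* Hypothetical reference-counted parts and failure switches. */
typedef struct { int refs; } walls;
typedef struct { int refs; } roof;
typedef struct { walls *w; roof *r; } house;
static int fail_walls; /* set to 1 to make make_walls throw */
static int fail_roof;  /* set to 1 to make make_roof throw */

static walls *make_walls(fz_context *ctx)
{
    walls *w = fail_walls ? NULL : malloc(sizeof *w);
    if (w == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make walls");
    w->refs = 1;
    return w;
}

static roof *make_roof(fz_context *ctx)
{
    roof *r = fail_roof ? NULL : malloc(sizeof *r);
    if (r == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make roof");
    r->refs = 1;
    return r;
}

/* Destructors accept NULL, which keeps the cleanup code simple. */
static void drop_walls(fz_context *ctx, walls *w)
{
    (void)ctx;
    if (w != NULL && --w->refs == 0)
        free(w);
}

static void drop_roof(fz_context *ctx, roof *r)
{
    (void)ctx;
    if (r != NULL && --r->refs == 0)
        free(r);
}

/* The house takes its own references to the walls and the roof. */
static house *combine(fz_context *ctx, walls *w, roof *r)
{
    house *h = malloc(sizeof *h);
    if (h == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make house");
    h->w = w; w->refs++;
    h->r = r; r->refs++;
    return h;
}

static house *build_house(fz_context *ctx)
{
    walls *w = NULL;
    roof *r = NULL;
    house *h = NULL;

    assert(ctx != NULL);

    fz_try(ctx)
    {
        w = make_walls(ctx);
        r = make_roof(ctx);
        h = combine(ctx, w, r);
    }
    fz_always(ctx)
    {
        drop_walls(ctx, w); /* drop our local references */
        drop_roof(ctx, r);
    }
    fz_catch(ctx)
    {
        return NULL; /* indicate failure to the caller */
    }
    return h;
}
```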
In the above code (as well as throughout MuPDF), we follow the convention that destructors always accept NULL. This makes cleanup code much simpler.
Reading through this code, it is fairly obvious what will happen if everything works correctly. First we’ll make some walls, w, and a roof, r. Then we combine the walls and the roof, to get our house, h. As part of this process, the house would take references to the walls and roof as required. Next we tidy up our local references to the walls and the roof, and we return the completed house to our caller.
It’s more interesting to consider what will happen if we have failures.
First let’s consider what happens if make_walls fails. This will fz_throw an exception, and control will jump immediately to the fz_always. This will drop w and r (both of which are still NULL). The fz_catch can then handle the error, either by returning NULL to indicate failure, or perhaps by rethrowing the error (fz_rethrow) to an enclosing fz_try/fz_catch construct. No problems there.
So what happens when the failure occurs in make_roof? Let’s run through the code again.
This time, make_walls succeeds, and w is set to this new value. Then make_roof fails, fz_throwing an exception, and control will jump immediately to the fz_always. This will then try to drop w (now a valid value) and r (which is still NULL). The fz_catch can then handle the error, either by returning NULL, to indicate failure, or perhaps by fz_rethrowing the error to an enclosing fz_try/fz_catch construct. All sounds quite plausible.
Unfortunately, if you try it, on some systems you will find that you have a memory leak (or worse). When drop_walls is called, sometimes you will find that w has ‘lost’ its value.
This is due to an obscure part of the C specification, which states that any changes made to non-volatile local variables between a setjmp and a longjmp can be lost. (In fact, the C specification goes further than this, and says that the values of such variables become indeterminate.)
In fz_try/fz_catch terms, this means that any local variables set within the fz_try block can be ‘lost’ when either fz_always or fz_catch is reached.
Fortunately, there is a fix for this: fz_var. By calling fz_var(w); before the fz_try, we can ‘protect’ the variable w from such unwanted behaviour.
It’s not really necessary to know how this works, but for those interested, here is a quick explanation. The ‘loss’ of the value occurs because the compiler can postpone writing the value back into the storage location for the variable (or can choose to hold it only in a register). The call to fz_var passes the address of the variable out of scope, which forces the compiler not to hold it in a register. Further, the compiler has no way of knowing whether any function it calls might access that location, so it must make sure that the variable’s value is written back before every function call, such as longjmp. Hence the variable is protected, and is guaranteed not to lose its value, whether an exception is thrown or not.
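For comparison, the C standard provides its own guaranteed protection: a local variable declared volatile keeps any change made between setjmp and longjmp. The sketch below is plain standard C rather than MuPDF code, and the names env, fail_now and value_after_longjmp are purely illustrative.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Simulates a failing call by jumping back to the setjmp in the caller. */
static void fail_now(void)
{
    longjmp(env, 1);
}

static int value_after_longjmp(void)
{
    /* volatile: the C standard guarantees this update survives the longjmp.
     * Without the qualifier, the value read after the jump is indeterminate. */
    volatile int progress = 0;

    if (setjmp(env) == 0)
    {
        progress = 42; /* changed after the setjmp ... */
        fail_now();    /* ... and before the longjmp */
    }

    assert(progress == 42); /* holds because progress is volatile */
    return progress;
}
```

Calling value_after_longjmp() returns 42. The fz_var approach aims for the same effect without requiring every access to the variable to be treated as volatile.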
Calls to fz_var are very low cost (but are not NOPs), so erring on the side of caution and calling fz_var on more than you need to will probably not hurt.
A corrected version of the above example is therefore:
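A sketch of the corrected function is given here, using the names from the walkthrough (walls w, roof r, house h). The definitions at the top are simplified stand-ins so that it compiles without MuPDF, and the stand-in for fz_var merely publishes the variable's address to keep it in memory. Those stand-ins, the struct layouts, the fail_roof switch, the live_parts counter and the name build_house are all assumptions made for illustration; real code would include mupdf/fitz.h instead.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* ASSUMPTION: simplified stand-ins so this sketch builds without MuPDF.
 * Real code would #include <mupdf/fitz.h> and use the library's versions. */
typedef struct { jmp_buf env; int thrown; } fz_context;
enum { FZ_ERROR_GENERIC = 1 };
#define fz_try(ctx)    (ctx)->thrown = 0; if (setjmp((ctx)->env) == 0) { do {
#define fz_always(ctx) } while (0); } { do {
#define fz_catch(ctx)  } while (0); } if ((ctx)->thrown)
#define fz_throw(ctx, code, ...) \
    ((void)(code), (ctx)->thrown = 1, longjmp((ctx)->env, 1))
/* Stand-in for fz_var: publishing the address keeps the variable in memory. */
void *volatile fz_var_sink;
#define fz_var(var) (fz_var_sink = (void *)&(var))

/* Hypothetical reference-counted parts and test switches. */
typedef struct { int refs; } walls;
typedef struct { int refs; } roof;
typedef struct { walls *w; roof *r; } house;
static int fail_roof;  /* set to 1 to make make_roof throw */
static int live_parts; /* walls and roofs currently allocated */

static walls *make_walls(fz_context *ctx)
{
    walls *w = malloc(sizeof *w);
    if (w == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make walls");
    w->refs = 1;
    live_parts++;
    return w;
}

static roof *make_roof(fz_context *ctx)
{
    roof *r = fail_roof ? NULL : malloc(sizeof *r);
    if (r == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make roof");
    r->refs = 1;
    live_parts++;
    return r;
}

/* Destructors accept NULL, following the MuPDF convention. */
static void drop_walls(fz_context *ctx, walls *w)
{
    (void)ctx;
    if (w != NULL && --w->refs == 0) { free(w); live_parts--; }
}

static void drop_roof(fz_context *ctx, roof *r)
{
    (void)ctx;
    if (r != NULL && --r->refs == 0) { free(r); live_parts--; }
}

/* The house takes its own references to the walls and the roof. */
static house *combine(fz_context *ctx, walls *w, roof *r)
{
    house *h = malloc(sizeof *h);
    if (h == NULL)
        fz_throw(ctx, FZ_ERROR_GENERIC, "cannot make house");
    h->w = w; w->refs++;
    h->r = r; r->refs++;
    return h;
}

static house *build_house(fz_context *ctx)
{
    walls *w = NULL;
    roof *r = NULL;
    house *h = NULL;

    assert(ctx != NULL);
    fz_var(w); /* protect w and r: both are set inside the fz_try */
    fz_var(r);

    fz_try(ctx)
    {
        w = make_walls(ctx);
        r = make_roof(ctx);
        h = combine(ctx, w, r);
    }
    fz_always(ctx)
    {
        drop_walls(ctx, w);
        drop_roof(ctx, r);
    }
    fz_catch(ctx)
    {
        return NULL;
    }
    return h;
}
```

With fail_roof set to 1, build_house returns NULL and live_parts goes back to 0: the walls created before the failure are released in the fz_always block rather than leaked.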
Note the calls to fz_var. These warn the compiler that it should take care not to lose updates to w or r if an exception is thrown in the fz_try. See Rule 5 in section 6.4 Summary below.