6.1 Overview

MuPDF handles all its errors using an exception system. This is superficially similar to C++ exceptions, but (as MuPDF is written in C) it is implemented using macros that wrap the setjmp/longjmp standard C functions.

It is probably best not to peek behind the curtain, and just to think of these constructs as being extensions to the language. Indeed, we have worked very hard to ensure that the complexities involved are minimised.
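For readers who do want a glimpse, the general technique can be sketched in plain C. This is a minimal illustration under our own assumptions, not MuPDF's actual implementation: the names MY_TRY, MY_CATCH, my_throw, try_stack and try_depth are all hypothetical, and the real macros take a context argument and handle many more cases.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical names throughout; this is NOT how MuPDF spells things. */
#define MAX_NESTING 16

static jmp_buf try_stack[MAX_NESTING]; /* one saved jump target per open try */
static int try_depth = 0;              /* number of try blocks currently open */
static int caught_code = 0;            /* code carried by the last exception */

/* Throw: pop the innermost frame and jump back to its setjmp. */
void my_throw(int code)
{
    assert(try_depth > 0); /* throwing outside any try is a programming error */
    caught_code = code;
    try_depth--;
    longjmp(try_stack[try_depth], 1);
}

/* MY_TRY pushes a frame and runs the body inside do/while(0), so that
 * break leaves the body and still reaches the pop in MY_CATCH.
 * The macro expands to several statements, so use it only at block level. */
#define MY_TRY \
    assert(try_depth < MAX_NESTING); \
    try_depth++; \
    if (setjmp(try_stack[try_depth - 1]) == 0) { do

/* MY_CATCH pops the frame on normal exit; the else branch is the handler. */
#define MY_CATCH \
    while (0); try_depth--; } else

/* A function that reports failure by throwing. */
void risky(int fail)
{
    if (fail)
        my_throw(42);
}

/* Returns 0 on success, or the code of the exception that was caught. */
int run(int fail)
{
    MY_TRY
    {
        risky(fail);
    }
    MY_CATCH
    {
        return caught_code; /* the frame is already popped here */
    }
    return 0;
}
```

Calling run(0) takes the normal path and run(1) takes the handler path; in both cases try_depth is back to zero afterwards. The sketch keeps its state in static variables purely for brevity.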

Unless otherwise specified, all MuPDF API functions can throw exceptions, and should therefore be called within a fz_try/fz_always/fz_catch construct.

Specific functions that never throw exceptions include all those named fz_keep_..., fz_drop_... and fz_free. This, coupled with the fact that all such ‘destructor’ functions will silently accept a NULL argument, makes the fz_always block an excellent place to clean up resources used throughout processing.

The general anatomy of such a construct is as follows:

fz_try(ctx) 
{ 
   /* Do stuff in here that might throw an exception. 
   * NEVER return from here. break can be used to 
   * continue execution (either in the always block or 
   * after the catch block). */ 
} 
fz_always(ctx) 
{ 
   /* Anything in here will always be executed, regardless 
   * of whether the fz_try clause exited normally, or an 
   * exception was thrown. Try to avoid calling functions 
   * that can themselves throw exceptions here, or the rest 
   * of the fz_always block will be skipped - this is rarely 
   * what is wanted! NEVER return from here. break can be 
   * used to continue execution in, or after the catch block 
   * as appropriate. */ 
} 
fz_catch(ctx) 
{ 
   /* This block will execute if (and only if) anything in 
   * the fz_try block calls fz_throw. We should clean up 
   * anything we need to. If we are in a nested fz_try/ 
   * fz_catch block, we can call fz_rethrow to propagate 
   * the error to the enclosing catch. Unless the exception 
   * is rethrown (or a fresh exception thrown), execution 
   * continues after this block. */ 
}

The fz_always block is completely optional. The following is perfectly valid:

fz_try(ctx) 
{ 
   /* Do stuff here */ 
} 
fz_catch(ctx) 
{ 
   /* Clean up from errors here */ 
}

In an ideal world, that would be all there is to it. Unfortunately, there are two wrinkles.

The first one, which is relatively simple, is that you must not return from within a fz_try block. Doing so will corrupt the exception stack and cause problems and crashes. Instead, you can safely break out of the fz_try block: execution will pass into the fz_always block if there is one, or continue after the fz_catch block if not.

Similarly, you can break out of a fz_always block, and execution will correctly pass into or after the fz_catch block as appropriate, but this is less useful in practice.
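To see why return is harmful while break is safe, it can help to model just the bookkeeping. The sketch below is a deliberately simplified model under our own assumptions: it has no setjmp at all, and MY_TRY, MY_END_TRY and try_depth are hypothetical names that only count how many try blocks are open.

```c
#include <assert.h>

/* Hypothetical model: count open try blocks, nothing else. */
static int try_depth = 0;

/* Entering a try pushes a frame; the body runs inside do/while(0). */
#define MY_TRY      try_depth++; do
/* Leaving the body, normally or via break, pops the frame. */
#define MY_END_TRY  while (0); assert(try_depth > 0); try_depth--;

/* WRONG: return jumps past the pop, so the frame stays on the stack. */
int lookup_bad(int have_cached, int cached)
{
    MY_TRY
    {
        if (have_cached)
            return cached;
        /* ... work that might throw would go here ... */
    }
    MY_END_TRY
    return -1;
}

/* RIGHT: record the result, then break out of the try body. */
int lookup_good(int have_cached, int cached)
{
    int result = -1;

    MY_TRY
    {
        if (have_cached)
        {
            result = cached;
            break; /* leaves the do/while(0); the pop still runs */
        }
        /* ... work that might throw would go here ... */
    }
    MY_END_TRY
    return result;
}
```

After lookup_good(1, 7) the counter is back at zero, whereas after lookup_bad(1, 7) it is stuck at one: a stale frame that a later throw would jump to. Note that in this model, as in C generally, a break inside a loop or switch nested in the try body only leaves that inner statement.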

The second one is more convoluted. If you do not wish to understand the long and complex reasons behind this, skip the following subsection, and just read the corrected example that follows. As long as you follow the rules given in the summary at the end, you will be fine.

6.1.1 Why is fz_var necessary?

As stated before, fz_try/fz_catch are implemented using setjmp/longjmp, and these can ‘lose’ changes to variables.

For example:

house_t *build_house(fz_context *ctx) 
{ 
   walls_t *w = NULL; 
   roof_t  *r = NULL; 
   house_t *h = NULL; 
 
   fz_try(ctx) 
   { 
      w = make_walls(); 
      r = make_roof(); 
      h = combine(w, r); /* Note, NOT: return combine(w,r); */ 
   } 
   fz_always(ctx) 
   { 
      drop_walls(w); 
      drop_roof(r); 
   } 
   fz_catch(ctx) 
   { 
      /* Handle the error somehow. If we are nested within another 
      * layer of fz_try/fz_catch, we can simply fz_rethrow. If 
      * not, handle it in a way appropriate for this application, 
      * perhaps by simply returning NULL. */ 
      return NULL; 
   } 
   return h; 
}

In the above code (as well as throughout MuPDF), we follow the convention that destructors always accept NULL. This makes cleanup code much simpler.
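As an illustration of this convention, a reference-counted destructor might look like the following. This is a sketch under our own assumptions: walls_t and drop_walls borrow their names from the example above, but the struct layout, new_walls, keep_walls and the live_walls counter are hypothetical, and new_walls reports failure by returning NULL rather than by throwing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct
{
    int refs;
} walls_t;

int live_walls = 0; /* objects currently allocated, for leak checking */

/* Create an object holding one reference; returns NULL if out of memory. */
walls_t *new_walls(void)
{
    walls_t *w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->refs = 1;
    live_walls++;
    return w;
}

/* Take an extra reference. NULL is accepted and passed through. */
walls_t *keep_walls(walls_t *w)
{
    if (w != NULL)
        w->refs++;
    return w;
}

/* Release a reference, freeing the object when the last one goes.
 * NULL is silently accepted, so cleanup code needs no special cases. */
void drop_walls(walls_t *w)
{
    if (w == NULL)
        return;
    assert(w->refs > 0);
    if (--w->refs == 0)
    {
        free(w);
        live_walls--;
    }
}
```

Because drop_walls(NULL) is a harmless no-op, a cleanup block can drop every pointer unconditionally, whether or not the corresponding allocation was ever reached.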

Reading through this code, it is fairly obvious what will happen if everything works correctly. First we’ll make some walls, w, and a roof, r. Then we combine the walls and the roof, to get our house, h. As part of this process, the house would take references to the walls and roof as required. Next we tidy up our local references to the walls and the roof, and we return the completed house to our caller.

It’s more interesting to consider what will happen if we have failures.

First let’s consider what happens if make_walls fails. This will fz_throw an exception, and control will jump immediately to the fz_always. This will drop w and r (both of which are still NULL). The fz_catch can then handle the error, either by returning NULL, to indicate failure, or perhaps by fz_rethrowing the error to an enclosing fz_try/fz_catch construct. No problems there.

So what happens when the failure occurs in make_roof? Let’s run through the code again.

This time, make_walls succeeds, and w is set to this new value. Then make_roof fails, fz_throwing an exception, and control will jump immediately to the fz_always. This will then try to drop w (now a valid value) and r (which is still NULL). The fz_catch can then handle the error, either by returning NULL, to indicate failure, or perhaps by fz_rethrowing the error to an enclosing fz_try/fz_catch construct. All sounds quite plausible.

Unfortunately, if you try it, on some systems you will find that you have a memory leak (or worse). When drop_walls is called, sometimes you will find that w has ‘lost’ its value.

This is due to an obscure part of the C specification, which states that changes made to non-volatile local variables between a setjmp and a longjmp can be lost. (In fact, the C specification goes further than this, and says that the values of such variables become ‘indeterminate’.)

In fz_try/fz_catch terms, this means that any local variables set within the fz_try block can be ‘lost’ when either fz_always or fz_catch is reached.

Fortunately, there is a fix for this: fz_var. By calling fz_var(w); before the fz_try we can ‘protect’ variable w from such unwanted behaviour.

It’s not really necessary to know how this works, but for those interested, here is a quick explanation. The ‘loss’ of the value occurs because the compiler can postpone writing the value back into the storage location for the variable (or can choose to just hold it in a register). The call to fz_var passes the address of the variable out of scope; this forces the compiler not to hold it in a register. Further, the compiler has no way of knowing whether any functions it calls might access that location, so it needs to make sure that the variable value is written back on every function call - such as longjmp. Hence the variable is protected, and is guaranteed not to lose its value, whether an exception is thrown or not.

Calls to fz_var are very low cost (but are not NOPs), so erring on the side of caution and calling fz_var on more than you need to will probably not hurt.
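The underlying issue can be reproduced in plain C, without MuPDF. The sketch below does not use fz_var; instead it uses the volatile qualifier, which is the remedy the C specification itself guarantees for a local variable modified between setjmp and longjmp. The names env, fail_now and last_completed_step are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env; /* jump target saved by setjmp */

/* Stands in for a call that throws: jump back to the setjmp. */
void fail_now(void)
{
    longjmp(env, 1);
}

/* Returns the number of the last step finished before the failure. */
int last_completed_step(void)
{
    /* volatile forces every write to reach memory, so the value is
     * well defined after longjmp. Without the qualifier, the value
     * read after the jump would be indeterminate. */
    volatile int step = 0;

    if (setjmp(env) == 0)
    {
        step = 1;
        step = 2;
        fail_now();  /* does not return */
        step = 3;    /* never reached */
    }

    /* We arrive here via longjmp, with the last write preserved. */
    assert(step == 2);
    return step;
}
```

With the volatile qualifier removed, an optimising compiler is free to keep step in a register, and the function may then return a stale value on some systems; this is the same ‘loss’ described above for w.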

6.1.2 Example: How to protect local variables with fz_var

A corrected version of the above example is therefore:

house_t *build_house(fz_context *ctx) 
{ 
   walls_t *w = NULL; 
   roof_t  *r = NULL; 
   house_t *h = NULL; 
 
   fz_var(w); 
   fz_var(r); 
 
   fz_try(ctx) 
   { 
      w = make_walls(); 
      r = make_roof(); 
      h = combine(w, r); /* Note, NOT: return combine(w,r); */ 
   } 
   fz_always(ctx) 
   { 
      drop_walls(w); 
      drop_roof(r); 
   } 
   fz_catch(ctx) 
   { 
      /* Handle the error somehow. If we are nested within another 
      * layer of fz_try/fz_catch, we can simply fz_rethrow. If 
      * not, handle it in a way appropriate for this application, 
      * perhaps by simply returning NULL. */ 
      return NULL; 
   } 
   return h; 
}

Note the calls to fz_var. These warn the compiler that it should take care not to lose updates to w or r if an exception is thrown in the fz_try. See Rule 5 in section 6.4 Summary below.