PDF files are made up of a series of objects. These objects can be of many different types, including dictionaries, streams, numbers, booleans, names and strings. For full details, see ‘The PDF Reference Manual’.
MuPDF represents all of these as a pdf_obj pointer. Such pointers are reference counted in the usual way: pdf_keep_obj takes a new reference and pdf_drop_obj releases one.
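Given such a pointer, the actual type of the object can be tested using the pdf_is_ family of functions, such as pdf_is_null, pdf_is_bool, pdf_is_int, pdf_is_real, pdf_is_name, pdf_is_string, pdf_is_array, pdf_is_dict and pdf_is_stream.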
These all return non-zero if the object is of the tested type, and zero otherwise.
To extract the data from a PDF object, you can use one of the pdf_to_ functions, such as pdf_to_bool, pdf_to_int, pdf_to_real, pdf_to_name, pdf_to_str_buf and pdf_to_str_len.
It is, in fact, safe to call any of these functions on any pdf_obj pointer. If the object is not of the expected type, a ‘safe’ default will be returned.
Array objects consist of lists of other objects, each of which can potentially be of a different type. Accordingly, pdf_array_len reports how many entries an array holds.
Armed with this knowledge, we can then fetch any object we want from within the array by passing its index i to pdf_array_get.
Ideally, i should be between 0 and length-1, though the function will simply return NULL if an out-of-range element is requested.
Note that the pdf_obj reference returned by this function is merely borrowed. That is to say, if you wish to keep the object pointer around for more than the immediate lifespan of the call, you should manually call pdf_keep_obj to keep it, and later pdf_drop_obj to dispose of it.
An object can be inserted into an array at a given index using pdf_array_insert.
Any objects after this point are shuffled up the array. Alternatively, pdf_array_put places an object at a given index, overwriting any object that is there already.
If the array needs to be extended it will be, and any intervening objects will be created as ‘null’. Finally, objects can be appended to the end of an array using pdf_array_push.
In all these cases, the array takes a new reference to the object passed in; that is, after the call, both the array and the caller hold references to the object. In cases where the object to be inserted is a ‘borrowed’ reference, this is ideal.
In other cases, where the ownership of the object reference should be passed down into the array, we have alternative formulations of those functions: pdf_array_insert_drop, pdf_array_put_drop and pdf_array_push_drop.
These functions are so named because they are equivalent to first inserting/putting/pushing the object, and then dropping it, with the nice side effect that any errors encountered during the push still result in the object being correctly dropped, often saving the caller from having to wrap the call in a fz_try/fz_catch clause.