20.2 Implementing a Document Handler
20.2.1 Recognize and Open
To implement a new document handler, a new fz_document_handler structure is
required. There are 3 components to such a structure, all function pointers:
typedef struct fz_document_handler_s
{
    fz_document_recognize_fn *recognize;
    fz_document_open_fn *open;
    fz_document_open_with_stream_fn *open_with_stream;
} fz_document_handler;
The first is a function to recognize a document from a magic string, typically a
mimetype or a filename:
/*
    fz_document_recognize_fn: Recognize a document type from
    a magic string.

    magic: string to recognise - typically a filename or mime
    type.

    Returns a number between 0 (not recognized) and 100
    (fully recognized) based on how certain the recognizer
    is that this is of the required type.
*/
typedef int (fz_document_recognize_fn)(fz_context *ctx, const char *magic);
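As a rough illustration, a recognizer for an imaginary foo format might look something like the sketch below. The ".foo" extension and the "application/x-foo" mime type are invented for this example, and fz_context is only forward-declared so that the fragment compiles without the MuPDF headers.

```c
#include <assert.h>
#include <string.h>

/* Stand-in forward declaration; real code would include the MuPDF headers. */
typedef struct fz_context_s fz_context;

/* Hypothetical recognizer for an imaginary "foo" format. */
int foo_recognize(fz_context *ctx, const char *magic)
{
    const char *ext;

    (void)ctx; /* unused in this sketch */
    assert(magic != NULL);

    /* An exact mime type match is treated as fully recognized. */
    if (strcmp(magic, "application/x-foo") == 0)
        return 100;

    /* Otherwise fall back to the filename extension. */
    ext = strrchr(magic, '.');
    if (ext != NULL && strcmp(ext, ".foo") == 0)
        return 100;

    return 0; /* not recognized */
}
```

A recognizer that can only make a weak guess would return a value somewhere between the two extremes.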
The second is a function to open a document from a filename:
/*
    fz_document_open_fn: Function type to open a document from a
    file.

    filename: file to open

    Pointer to opened document. Throws exception in case of error.
*/
typedef fz_document *(fz_document_open_fn)(fz_context *ctx, const char *filename);
This function can permissibly be NULL, as it can be synthesized automatically from
the third entry, a function to open a document from a stream:
/*
    fz_document_open_with_stream_fn: Function type to open a
    document from a stream.

    stream: fz_stream to read document data from. Must be
    seekable for formats that require it.

    Pointer to opened document. Throws exception in case of error.
*/
typedef fz_document *(fz_document_open_with_stream_fn)(fz_context *ctx, fz_stream *stream);
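To show how the three entries fit together, the sketch below fills in two handlers (leaving open as NULL) and picks whichever one reports the highest recognition score. This is an illustration only: the foo and bar formats, the stub opener, and pick_handler are all hypothetical, the types are stand-ins for the MuPDF declarations, and the selection loop is not claimed to be the library's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_stream_s fz_stream;
typedef struct fz_document_s fz_document;

typedef int (fz_document_recognize_fn)(fz_context *ctx, const char *magic);
typedef fz_document *(fz_document_open_fn)(fz_context *ctx, const char *filename);
typedef fz_document *(fz_document_open_with_stream_fn)(fz_context *ctx, fz_stream *stream);

typedef struct fz_document_handler_s
{
    fz_document_recognize_fn *recognize;
    fz_document_open_fn *open;
    fz_document_open_with_stream_fn *open_with_stream;
} fz_document_handler;

static int ends_with(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Hypothetical recognizer: certain about ".foo", silent otherwise. */
int foo_recognize(fz_context *ctx, const char *magic)
{
    (void)ctx;
    return ends_with(magic, ".foo") ? 100 : 0;
}

/* Hypothetical recognizer: certain about ".bar", a weak guess otherwise. */
int bar_recognize(fz_context *ctx, const char *magic)
{
    (void)ctx;
    return ends_with(magic, ".bar") ? 100 : 10;
}

/* Stub opener; a real handler would parse the stream here. */
fz_document *stub_open_with_stream(fz_context *ctx, fz_stream *stream)
{
    (void)ctx;
    (void)stream;
    return NULL;
}

/* The open entry is left NULL in both handlers. */
const fz_document_handler foo_handler = { foo_recognize, NULL, stub_open_with_stream };
const fz_document_handler bar_handler = { bar_recognize, NULL, stub_open_with_stream };

/* Return the handler with the highest non-zero score, or NULL if none match. */
const fz_document_handler *pick_handler(fz_context *ctx,
    const fz_document_handler **list, int count, const char *magic)
{
    const fz_document_handler *best = NULL;
    int best_score = 0;
    int i;

    assert(list != NULL && magic != NULL);
    for (i = 0; i < count; i++)
    {
        int score = list[i]->recognize(ctx, magic);
        if (score > best_score)
        {
            best_score = score;
            best = list[i];
        }
    }
    return best;
}
```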
To create a fz_document use the fz_new_document macro. For a document of type
foo, typically a foo_document structure would be defined as below:
typedef struct
{
    fz_document super;
    <foo specific fields>
} foo_document;
This would then be created using a call to fz_new_document, such as:
foo_document *foo = fz_new_document(ctx, foo_document);
This returns an empty document structure with super populated with default values,
and the foo specific fields initialized to 0. The document handler then needs to fill in
the document level functions.
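The reason for placing super first is that a pointer to a foo_document can then be treated as a pointer to a fz_document and vice versa. The sketch below imitates this pattern with stand-in types. The helper new_document_of_size, the new_document macro, and the choice of an initial reference count of 1 are assumptions made for illustration; they are not the MuPDF definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-in for fz_document; the real structure has many more fields. */
typedef struct fz_document_s
{
    int refs;
} fz_document;

/* Hypothetical derived document: the base structure must be the first member. */
typedef struct
{
    fz_document super;
    int page_count; /* example foo specific field */
    char *title;    /* example foo specific field */
} foo_document;

/* Stand-in allocator: zeroed memory with a single reference (an assumption). */
void *new_document_of_size(size_t size)
{
    fz_document *doc;

    assert(size >= sizeof(fz_document));
    doc = calloc(1, size);
    if (doc != NULL)
        doc->refs = 1;
    return doc;
}

/* Stand-in for the fz_new_document macro (without the context argument). */
#define new_document(type) ((type *)new_document_of_size(sizeof(type)))
```

Because the whole allocation is zeroed, the foo specific fields start out as 0 (or NULL) without any further work from the handler.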
20.2.2 Document Level Functions
The fz_document structure contains a list of functions used to implement the
document level calls:
typedef struct fz_document_s
{
    int refs;
    fz_document_drop_fn *drop_document;
    fz_document_needs_password_fn *needs_password;
    fz_document_authenticate_password_fn *authenticate_password;
    fz_document_has_permission_fn *has_permission;
    fz_document_load_outline_fn *load_outline;
    fz_document_layout_fn *layout;
    fz_document_make_bookmark_fn *make_bookmark;
    fz_document_lookup_bookmark_fn *lookup_bookmark;
    fz_document_resolve_link_fn *resolve_link;
    fz_document_count_pages_fn *count_pages;
    fz_document_load_page_fn *load_page;
    fz_document_lookup_metadata_fn *lookup_metadata;
    int did_layout;
    int is_reflowable;
} fz_document;
Implementations must fill in the drop_document field, with a pointer to a function
called to free any resources held by the document when the reference count drops to
0. In the unlikely event that your implementation has no resources, this field can be
left NULL.
/*
    fz_document_drop_fn: Called when the reference count for
    the fz_document drops to 0. The implementation should
    release any resources held by the document. The actual
    document pointer will be freed by the caller.
*/
typedef void (fz_document_drop_fn)(fz_context *ctx, fz_document *doc);
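The division of labour described in that comment can be sketched as follows: the handler's drop function releases what the document owns, and the caller frees the structure itself. The types are cut-down stand-ins, and drop_document_ref is a hypothetical model of the caller's side, not the library's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_document_s fz_document;
typedef void (fz_document_drop_fn)(fz_context *ctx, fz_document *doc);

struct fz_document_s
{
    int refs;
    fz_document_drop_fn *drop_document;
};

typedef struct
{
    fz_document super;
    char *title; /* hypothetical resource owned by the document */
} foo_document;

/* Release what the document owns, but not the document structure itself. */
void foo_drop_document(fz_context *ctx, fz_document *doc)
{
    foo_document *foo = (foo_document *)doc;

    (void)ctx;
    free(foo->title);
    foo->title = NULL;
}

/* Model of the caller: returns 1 if this call destroyed the document. */
int drop_document_ref(fz_context *ctx, fz_document *doc)
{
    assert(doc != NULL && doc->refs > 0);
    if (--doc->refs > 0)
        return 0;
    if (doc->drop_document != NULL) /* NULL is allowed: nothing to release */
        doc->drop_document(ctx, doc);
    free(doc);
    return 1;
}
```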
If your document handler is capable of handling password protected documents, then
you must fill in the needs_password field with a pointer to a function called to
enquire whether a given document needs a password:
/*
    fz_document_needs_password_fn: Type for a function to be
    called to enquire whether the document needs a password
    or not. See fz_needs_password for more information.
*/
typedef int (fz_document_needs_password_fn)(fz_context *ctx, fz_document *doc);
Such handlers must also fill in the authenticate_password field with a pointer to a
function called to attempt to authenticate a password:
/*
    fz_document_authenticate_password_fn: Type for a function to be
    called to attempt to authenticate a password. See
    fz_authenticate_password for more information.
*/
typedef int (fz_document_authenticate_password_fn)(fz_context *ctx, fz_document *doc,
    const char *password);
Certain document types encode permissions within them to say what users are
allowed to do with them (printing, extracting, etc.). If your document handler’s format
has this concept, then you must fill in the has_permission field with a pointer to a
function called to query such permissions:
/*
    fz_document_has_permission_fn: Type for a function to be
    called to see if a document grants a certain permission. See
    fz_document_has_permission for more information.
*/
typedef int (fz_document_has_permission_fn)(fz_context *ctx, fz_document *doc,
    fz_permission permission);
Certain document types can optionally include outline (table of contents) information
within them. If your document handler’s format has this concept, then you must fill
in the load_outline field with a pointer to a function called to attempt to load such
information if it is there:
/*
    fz_document_load_outline_fn: Type for a function to be called to
    load the outlines for a document. See fz_document_load_outline
    for more information.
*/
typedef fz_outline *(fz_document_load_outline_fn)(fz_context *ctx, fz_document *doc);
If your document format requires a layout pass before it can be viewed, then you
must fill in the layout field with a pointer to a function called to perform such a
layout:
/*
    fz_document_layout_fn: Type for a function to be called to lay
    out a document. See fz_layout_document for more information.
*/
typedef void (fz_document_layout_fn)(fz_context *ctx, fz_document *doc,
    float w, float h, float em);
If your document requires a layout pass, you should provide functions to both make
and resolve bookmarks to enable reader positions to be kept over layout changes.
Accordingly the make_bookmark and lookup_bookmark fields should be filled
out:
/*
    fz_document_make_bookmark_fn: Type for a function to make
    a bookmark. See fz_make_bookmark for more information.
*/
typedef fz_bookmark (fz_document_make_bookmark_fn)(fz_context *ctx, fz_document *doc,
    int page);

/*
    fz_document_lookup_bookmark_fn: Type for a function to lookup
    a bookmark. See fz_lookup_bookmark for more information.
*/
typedef int (fz_document_lookup_bookmark_fn)(fz_context *ctx, fz_document *doc,
    fz_bookmark mark);
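The idea behind bookmarks is that they identify a position in the content rather than a page number, so that they survive a change of layout. The toy model below is one way this could work, under several assumptions: fz_bookmark is taken to be an integer-sized opaque value, the document is modelled as a run of characters with one character per em square, and -1 is used for a bookmark that cannot be found. None of this is taken from a real handler.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_document_s { int refs; } fz_document;
typedef intptr_t fz_bookmark; /* assumption: an opaque integer-sized value */

typedef struct
{
    fz_document super;
    int total_chars;    /* length of the text, independent of layout */
    int chars_per_page; /* set by the most recent layout pass */
} foo_document;

/* Toy layout: one character per em square, at least one per page. */
void foo_layout(fz_context *ctx, fz_document *doc, float w, float h, float em)
{
    foo_document *foo = (foo_document *)doc;
    int cols, rows;

    (void)ctx;
    assert(em > 0);
    cols = (int)(w / em);
    rows = (int)(h / em);
    foo->chars_per_page = (cols > 0 && rows > 0) ? cols * rows : 1;
}

/* A bookmark is the offset of the first character on the page. */
fz_bookmark foo_make_bookmark(fz_context *ctx, fz_document *doc, int page)
{
    foo_document *foo = (foo_document *)doc;

    (void)ctx;
    return (fz_bookmark)page * foo->chars_per_page;
}

/* Map the offset back to a page under the current layout. */
int foo_lookup_bookmark(fz_context *ctx, fz_document *doc, fz_bookmark mark)
{
    foo_document *foo = (foo_document *)doc;

    (void)ctx;
    if (mark < 0 || mark >= foo->total_chars)
        return -1; /* assumption: -1 signals "not found" */
    return (int)(mark / foo->chars_per_page);
}
```

A viewer would make a bookmark for the current page before changing the layout, and look it up afterwards to find the page that now holds the same content.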
Some document formats can encode internal links that point to another page in the
document. If your document supports this concept, then you must fill in the
resolve_link field with a pointer to a function called to resolve a textual link to a
page number, and location on that page:
/*
    fz_document_resolve_link_fn: Type for a function to be called to
    resolve an internal link to a page number. See fz_resolve_link
    for more information.
*/
typedef int (fz_document_resolve_link_fn)(fz_context *ctx, fz_document *doc,
    const char *uri, float *xp, float *yp);
All document formats must fill in the count_pages field with a pointer to a function
called to return the number of pages in a document:
/*
    fz_document_count_pages_fn: Type for a function to be called to
    count the number of pages in a document. See fz_count_pages for
    more information.
*/
typedef int (fz_document_count_pages_fn)(fz_context *ctx, fz_document *doc);
Different document formats encode different types of metadata. We therefore have an
extensible function to allow such data to be queried. If your document handler wishes
to support this, then the lookup_metadata field must be filled in with a pointer to a
function to perform such lookups:
/*
    fz_document_lookup_metadata_fn: Type for a function to query
    a document's metadata. See fz_lookup_metadata for more
    information.
*/
typedef int (fz_document_lookup_metadata_fn)(fz_context *ctx, fz_document *doc,
    const char *key, char *buf, int size);
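A metadata lookup function essentially maps a key string to a value copied into the caller's buffer. The sketch below assumes a convention of returning the number of bytes needed to hold the value including its terminator, or -1 for an unknown key; check fz_lookup_metadata for the actual contract. The key names, the "Foo 1.0" format string, and the title field are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_document_s { int refs; } fz_document;

typedef struct
{
    fz_document super;
    const char *title; /* hypothetical foo specific field */
} foo_document;

int foo_lookup_metadata(fz_context *ctx, fz_document *doc,
    const char *key, char *buf, int size)
{
    foo_document *foo = (foo_document *)doc;
    const char *value = NULL;

    (void)ctx;
    assert(key != NULL);

    /* Key names are assumptions made for this sketch. */
    if (strcmp(key, "format") == 0)
        value = "Foo 1.0";
    else if (strcmp(key, "info:Title") == 0)
        value = foo->title;

    if (value == NULL)
        return -1; /* unknown key, or no value stored */

    /* Copy as much as fits; snprintf always terminates the buffer. */
    if (buf != NULL && size > 0)
        snprintf(buf, (size_t)size, "%s", value);

    /* Report the space needed, so callers can detect truncation. */
    return (int)strlen(value) + 1;
}
```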
All document formats must fill in the load_page field with a pointer to a function
called to return a reference to a fz_page structure:
/*
    fz_document_load_page_fn: Type for a function to load a given
    page from a document. See fz_load_page for more information.
*/
typedef fz_page *(fz_document_load_page_fn)(fz_context *ctx, fz_document *doc,
    int number);
To create a fz_page use the fz_new_page macro. For a document of type foo,
typically a foo_page structure would be defined as below:
typedef struct
{
    fz_page super;
    <foo specific fields>
} foo_page;
This would then be created using a call to fz_new_page, such as:
foo_page *foo = fz_new_page(ctx, foo_page);
This returns an empty page structure with super populated with default values,
and the foo specific fields initialized to 0. The document handler implementation then
needs to fill in the page level functions.
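The two mandatory document level functions, count_pages and load_page, might be sketched as follows for the imaginary foo format. The types are cut-down stand-ins, calloc takes the place of fz_new_page, page numbers are assumed to be zero-based, and returning NULL for a bad page number stands in for throwing an exception through the context.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_document_s { int refs; } fz_document;
typedef struct fz_page_s { int refs; } fz_page;

typedef struct
{
    fz_document super;
    int page_count; /* hypothetical foo specific field */
} foo_document;

typedef struct
{
    fz_page super;
    foo_document *doc; /* hypothetical: owning document */
    int number;        /* hypothetical: zero-based page index */
} foo_page;

int foo_count_pages(fz_context *ctx, fz_document *doc)
{
    (void)ctx;
    assert(doc != NULL);
    return ((foo_document *)doc)->page_count;
}

fz_page *foo_load_page(fz_context *ctx, fz_document *doc, int number)
{
    foo_document *foo = (foo_document *)doc;
    foo_page *page;

    (void)ctx;
    if (number < 0 || number >= foo->page_count)
        return NULL; /* real code would throw an exception instead */

    /* Stand-in for: page = fz_new_page(ctx, foo_page); */
    page = calloc(1, sizeof *page);
    if (page == NULL)
        return NULL;
    page->super.refs = 1;

    page->doc = foo;
    page->number = number;
    return &page->super;
}
```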
20.2.3 Page Level Functions
The fz_page structure contains a list of functions used to implement the page level
calls:
typedef struct fz_page_s
{
    int refs;
    fz_page_drop_page_fn *drop_page;
    fz_page_bound_page_fn *bound_page;
    fz_page_run_page_contents_fn *run_page_contents;
    fz_page_load_links_fn *load_links;
    fz_page_first_annot_fn *first_annot;
    fz_page_page_presentation_fn *page_presentation;
    fz_page_control_separation_fn *control_separation;
    fz_page_separation_disabled_fn *separation_disabled;
    fz_page_count_separations_fn *count_separations;
    fz_page_get_separation_fn *get_separation;
} fz_page;
The fz_page (and hence derived foo_page) structures are reference counted, with
the count held in the refs field. All the reference counting is handled by the core
library; all that is required of the implementation is that it supply a drop_page
function that will be called when the reference count reaches zero. This is of type:
/*
    fz_page_drop_page_fn: Type for a function to release all the
    resources held by a page. Called automatically when the
    reference count for that page reaches zero.
*/
typedef void (fz_page_drop_page_fn)(fz_context *ctx, fz_page *page);
Implementations must fill in the bound_page field with the address of a function to
return the page's bounding box, of type:
/*
    fz_page_bound_page_fn: Type for a function to return the
    bounding box of a page. See fz_bound_page for more
    information.
*/
typedef fz_rect *(fz_page_bound_page_fn)(fz_context *ctx, fz_page *page, fz_rect *);
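Note that the rectangle is passed in by the caller and a pointer to it is returned. A minimal sketch of such a function, assuming a foo_page that simply stores its width and height, could look like this. The types are stand-ins (the fz_rect layout of four floats is an assumption), and the width and height fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_page_s { int refs; } fz_page;
typedef struct { float x0, y0, x1, y1; } fz_rect; /* assumed layout */

typedef struct
{
    fz_page super;
    float width;  /* hypothetical foo specific field */
    float height; /* hypothetical foo specific field */
} foo_page;

/* Fill in the caller's rectangle and hand the same pointer back. */
fz_rect *foo_bound_page(fz_context *ctx, fz_page *page, fz_rect *bounds)
{
    foo_page *foo = (foo_page *)page;

    (void)ctx;
    assert(bounds != NULL);
    bounds->x0 = 0;
    bounds->y0 = 0;
    bounds->x1 = foo->width;
    bounds->y1 = foo->height;
    return bounds;
}
```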
Implementations must fill in the run_page_contents field with the address of a
function to interpret the contents of a page, of type:
/*
    fz_page_run_page_contents_fn: Type for a function to run the
    contents of a page. See fz_run_page_contents for more
    information.
*/
typedef void (fz_page_run_page_contents_fn)(fz_context *ctx, fz_page *page,
    fz_device *dev, const fz_matrix *transform, fz_cookie *cookie);
If a document format supports internal or external hyperlinks, then its implementation
must fill in the load_links field with the address of a function to load the links from
a page, of type:
/*
    fz_page_load_links_fn: Type for a function to load the links
    from a page. See fz_load_links for more information.
*/
typedef fz_link *(fz_page_load_links_fn)(fz_context *ctx, fz_page *page);
If a document format supports annotations, then its implementation must fill in the
first_annot field with the address of a function to load the annotations from a
page, of type:
/*
    fz_page_first_annot_fn: Type for a function to load the
    annotations from a page. See fz_first_annot for more
    information.
*/
typedef fz_annot *(fz_page_first_annot_fn)(fz_context *ctx, fz_page *page);
Some document formats can encode information that specifies how pages
should be presented to the user as a slideshow: how long each page should be
displayed, which transition to use when moving to the next page, and so on.
Document handlers for such formats should fill in the page_presentation field
with the address of a function to obtain this information, of type:
/*
    fz_page_page_presentation_fn: Type for a function to
    obtain the details of how this page should be presented when
    in presentation mode. See fz_page_presentation for more
    information.
*/
typedef fz_transition *(fz_page_page_presentation_fn)(fz_context *ctx, fz_page *page,
    fz_transition *transition, float *duration);
Some document formats can encapsulate multiple color separations. In order to allow
proofing of such formats, MuPDF allows such separations to be enumerated and
enabled/disabled. In document handlers for such document formats, the
control_separation, separation_disabled, count_separations and
get_separation fields should be filled in with functions of the following types
respectively:
/*
    fz_page_control_separation_fn: Type for a function to enable/
    disable separations on a page. See fz_control_separation for
    more information.
*/
typedef void (fz_page_control_separation_fn)(fz_context *ctx, fz_page *page,
    int separation, int disable);

/*
    fz_page_separation_disabled_fn: Type for a function to detect
    whether a given separation is enabled or disabled on a page.
    See fz_separation_disabled for more information.
*/
typedef int (fz_page_separation_disabled_fn)(fz_context *ctx, fz_page *page,
    int separation);

/*
    fz_page_count_separations_fn: Type for a function to count
    the number of separations on a page. See fz_count_separations
    for more information.
*/
typedef int (fz_page_count_separations_fn)(fz_context *ctx, fz_page *page);

/*
    fz_page_get_separation_fn: Type for a function to retrieve
    details of a separation on a page. See fz_get_separation
    for more information.
*/
typedef const char *(fz_page_get_separation_fn)(fz_context *ctx, fz_page *page,
    int separation, uint32_t *rgb, uint32_t *cmyk);
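One simple way to back the first three of these functions is a per-page bitmask recording which separations are disabled. The sketch below assumes at most 32 separations per page and uses stand-in types; the sep_count and disabled fields are hypothetical, and get_separation is omitted because the encoding of its rgb and cmyk outputs is not described here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in declarations so the sketch compiles without the MuPDF headers. */
typedef struct fz_context_s fz_context;
typedef struct fz_page_s { int refs; } fz_page;

typedef struct
{
    fz_page super;
    int sep_count;     /* hypothetical: number of separations, at most 32 */
    uint32_t disabled; /* hypothetical: bit n set means separation n is off */
} foo_page;

int foo_count_separations(fz_context *ctx, fz_page *page)
{
    (void)ctx;
    return ((foo_page *)page)->sep_count;
}

void foo_control_separation(fz_context *ctx, fz_page *page, int separation, int disable)
{
    foo_page *foo = (foo_page *)page;

    (void)ctx;
    assert(foo->sep_count <= 32);
    if (separation < 0 || separation >= foo->sep_count)
        return; /* out of range: ignored in this sketch */
    if (disable)
        foo->disabled |= (uint32_t)1 << separation;
    else
        foo->disabled &= ~((uint32_t)1 << separation);
}

int foo_separation_disabled(fz_context *ctx, fz_page *page, int separation)
{
    foo_page *foo = (foo_page *)page;

    (void)ctx;
    if (separation < 0 || separation >= foo->sep_count)
        return 0; /* out of range: reported as enabled in this sketch */
    return (int)((foo->disabled >> separation) & 1);
}
```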