9.5 Inbuilt Devices

MuPDF comes with a selection of devices built in, though this should not be taken as a definitive list. It is expected that other devices will be written to extend MuPDF - indeed some embeddings of MuPDF already include their own devices.

9.5.1 BBox Device

The BBox device is a simple device that calculates the bounding box of all the marking operations on a page.

/*
   fz_new_bbox_device: Create a device to compute the bounding
   box of all marks on a page.

   The returned bounding box will be the union of all bounding
   boxes of all objects on a page.
*/
fz_device *fz_new_bbox_device(fz_context *ctx, fz_rect *rectp);

The fz_rect passed to fz_new_bbox_device must remain valid for the lifetime of the device, because the device writes the bounding box of the contents into it when the device is closed.
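The "union of all bounding boxes" idea can be illustrated with a minimal, self-contained sketch. It does not use MuPDF: the rect type and the functions rect_is_empty, rect_union and accumulate_bbox are hypothetical names invented for this illustration, and the sketch updates the caller's rectangle as it goes rather than on close.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a rectangle type; this is not MuPDF's fz_rect. */
typedef struct { float x0, y0, x1, y1; } rect;

/* A rectangle with no area is treated as empty. */
bool rect_is_empty(rect r)
{
	return r.x0 >= r.x1 || r.y0 >= r.y1;
}

/* Return the smallest rectangle that contains both a and b. */
rect rect_union(rect a, rect b)
{
	rect r;
	if (rect_is_empty(a)) return b;
	if (rect_is_empty(b)) return a;
	r.x0 = a.x0 < b.x0 ? a.x0 : b.x0;
	r.y0 = a.y0 < b.y0 ? a.y0 : b.y0;
	r.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
	r.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
	return r;
}

/* Accumulate the bounds of n marks into *total. The storage behind
   total belongs to the caller and must outlive the accumulation,
   which mirrors the lifetime rule for the rectp argument. */
void accumulate_bbox(rect *total, const rect *marks, int n)
{
	int i;
	assert(total != NULL);
	total->x0 = total->y0 = total->x1 = total->y1 = 0; /* start empty */
	for (i = 0; i < n; i++)
		*total = rect_union(*total, marks[i]);
}
```

Because the result is written through a pointer, a caller that lets the rectangle go out of scope before the work has finished would leave the accumulator writing into dead storage.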

9.5.2 Draw Device

The Draw device is the core renderer for MuPDF. Every draw device instance is constructed with a destination Pixmap (see section 10.3 Pixmaps for more details), and each graphical object passed to the device is rendered into that pixmap.

/*
   fz_new_draw_device: Create a device to draw on a pixmap.

   dest: Target pixmap for the draw device. See fz_new_pixmap*
   for how to obtain a pixmap. The pixmap is not cleared by the
   draw device, see fz_clear_pixmap* for how to clear it prior to
   calling fz_new_draw_device. Free the device by calling
   fz_drop_device.
*/
fz_device *fz_new_draw_device(fz_context *ctx, fz_pixmap *dest);

Most of the time we render complete pixmaps, but a mechanism exists to allow us to render a given bbox within a pixmap:

/*
   fz_new_draw_device_with_bbox: Create a device to draw on a pixmap.

   dest: Target pixmap for the draw device. See fz_new_pixmap*
   for how to obtain a pixmap. The pixmap is not cleared by the
   draw device, see fz_clear_pixmap* for how to clear it prior to
   calling fz_new_draw_device. Free the device by calling
   fz_drop_device.

   clip: Bounding box to restrict any marking operations of the
   draw device.
*/
fz_device *fz_new_draw_device_with_bbox(fz_context *ctx, fz_pixmap *dest, const fz_irect *clip);

This can be useful for updating particular areas of a page (for instance when an annotation has been edited or moved) without redrawing the whole thing.
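As a rough illustration of why a clip helps, the following self-contained sketch intersects a "dirty" rectangle with the bounds of a pixmap and counts the pixels a restricted redraw would touch. The irect type and both functions are hypothetical stand-ins written for this example; they are not MuPDF calls.

```c
#include <assert.h>

/* Hypothetical integer rectangle; it only stands in for the idea of an fz_irect. */
typedef struct { int x0, y0, x1, y1; } irect;

/* Intersect a with b. An empty result is normalised to all zeros. */
irect irect_intersect(irect a, irect b)
{
	irect r;
	r.x0 = a.x0 > b.x0 ? a.x0 : b.x0;
	r.y0 = a.y0 > b.y0 ? a.y0 : b.y0;
	r.x1 = a.x1 < b.x1 ? a.x1 : b.x1;
	r.y1 = a.y1 < b.y1 ? a.y1 : b.y1;
	if (r.x0 >= r.x1 || r.y0 >= r.y1)
		r.x0 = r.y0 = r.x1 = r.y1 = 0;
	return r;
}

/* Number of pixels that a redraw limited to 'dirty' would touch
   in a pixmap covering 'pixmap_bounds'. */
long pixels_to_redraw(irect pixmap_bounds, irect dirty)
{
	irect r = irect_intersect(pixmap_bounds, dirty);
	assert(r.x1 >= r.x0 && r.y1 >= r.y0);
	return (long)(r.x1 - r.x0) * (long)(r.y1 - r.y0);
}
```

For a 600 x 800 pixmap and a small edited annotation, the restricted area is a tiny fraction of the 480000 pixels a full redraw would cover.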

During the course of rendering, the draw device may create temporary internal pixmaps to cope with transparency and grouping. These are invisible to the caller and can safely be treated as an implementation detail, but they should be taken into account when estimating the memory use for a given rendering operation. The number and size of internal pixmaps required depends on the complexity and makeup of the graphical objects being displayed.

To limit memory use, a typical strategy is to render pages in bands: rather than creating a single pixmap the size of the page and rendering that, create pixmaps for ‘slices’ across the page, and render them one at a time. The memory savings come not just from the smaller basic pixmap, but also from the correspondingly smaller internal pixmaps used during rendering.

The cost for this is that the page contents do need to be run through repeatedly. This can be achieved by reinterpreting directly from the file, but that can be expensive. The next device provides a route to help with this.
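The arithmetic behind banding can be sketched as follows. This is a self-contained illustration rather than MuPDF code: band_count, band_extent and buffer_bytes are hypothetical helper names, and the memory figure covers only a single buffer, not the internal pixmaps mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bands needed to cover page_height rows of pixels. */
int band_count(int page_height, int band_height)
{
	assert(page_height > 0 && band_height > 0);
	return (page_height + band_height - 1) / band_height;
}

/* Compute the first row and the row count of band i.
   The final band may be shorter than band_height. */
void band_extent(int page_height, int band_height, int i, int *first_row, int *rows)
{
	int remaining;
	assert(i >= 0 && i < band_count(page_height, band_height));
	*first_row = i * band_height;
	remaining = page_height - *first_row;
	*rows = remaining < band_height ? remaining : band_height;
}

/* Bytes needed for one buffer of width x rows pixels at n bytes per pixel. */
size_t buffer_bytes(int width, int rows, int n)
{
	return (size_t)width * (size_t)rows * (size_t)n;
}
```

With an 800 x 1000 page at 4 bytes per pixel, a full-page buffer needs 3200000 bytes, while a 300-row band needs 960000 bytes and is reused for each of the 4 bands.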

Advanced Rendering - Overprint and Spots

Most formats define pages in terms of some fairly simple ‘well known’ colorspaces, like RGB and CMYK. Some formats (notably PDF) are much more powerful, and allow pages to be constructed with a range of non-standard ‘spot’ inks.

When combined with advanced features such as overprinting, care needs to be taken to ensure that the rendering is exactly as expected.

For example, if a PDF page relies on overprinting, it only makes strict sense to render it to a CMYK (or a CMYK + Spots) pixmap. With (say) an RGB pixmap, CMYK colors would be mapped down to RGB as they are plotted, losing the information required to correctly overprint later graphical objects.

Nonetheless, while we might want to get a ‘true’ rendition of the page, we might require it ultimately to appear as an RGB pixmap. As such what we really want is to get a ‘simulation’ of how the overprint would work.

One way to work would be to call the draw device and request a CMYK + Spots rendering, and then to require the caller to convert this to their desired target colorspace manually. This is not in keeping with the general desire in MuPDF to encapsulate functionality in a friendly way.

Therefore, the draw device examines the ‘separations’ field of the pixmap that it is called with to decide how to render.

If there is no separations value supplied (i.e. it is NULL), then the draw device assumes that no form of overprint (or overprint simulation) is required.

If there is a separations value, and there is at least one separation that is not entirely disabled, then the draw device will draw internally to a CMYK + Spots pixmap (where the spots are the non-disabled separations from the separations value). This rendering can safely proceed with overprint processing enabled.

At the end of the render, the draw device will convert down from the CMYK + Spots pixmap to the colorspace of the initial pixmap. Any spot colorants present in the initial pixmap will be populated from the rendered one; any spots that aren’t will be converted down to process colors.

Thus by creating the initial pixmap passed into the draw device using a separations object with the colorants correctly set to be composite/spots/disabled as required, overprint or overprint simulation can be controlled as required.
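The decision described above can be modelled in a few lines. This is only one reading of the text, written as a self-contained sketch: the sep_state enum and the function names are hypothetical, and the assumption that only separations marked as spots survive into the final pixmap (with composite ones converted down to process colors) is an interpretation rather than something taken from the MuPDF source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for one separation, following the
   composite/spots/disabled wording used in the text. */
typedef enum { SEP_COMPOSITE, SEP_SPOT, SEP_DISABLED } sep_state;

/* Count the separations that are not disabled. NULL means "no separations". */
int active_separations(const sep_state *seps, int n)
{
	int i, count = 0;
	assert(n >= 0);
	if (seps == NULL)
		return 0;
	for (i = 0; i < n; i++)
		if (seps[i] != SEP_DISABLED)
			count++;
	return count;
}

/* Non-zero when the render would go via an internal CMYK + Spots pixmap. */
int uses_overprint(const sep_state *seps, int n)
{
	return active_separations(seps, n) > 0;
}

/* Components in the assumed internal pixmap: 4 process inks plus one
   per non-disabled separation. Returns 0 when overprint is not in use. */
int internal_components(const sep_state *seps, int n)
{
	return uses_overprint(seps, n) ? 4 + active_separations(seps, n) : 0;
}

/* Spots assumed to be kept as separate colorants in the final pixmap. */
int output_spots(const sep_state *seps, int n)
{
	int i, count = 0;
	if (seps == NULL)
		return 0;
	for (i = 0; i < n; i++)
		if (seps[i] == SEP_SPOT)
			count++;
	return count;
}
```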

9.5.3 Display List Device

The Display list device simply records all the calls made to it in a list. This list can then be played back later, potentially multiple times and with different transforms, to other devices.

/*
   fz_new_list_device: Create a rendering device for a display list.

   When the device is rendering a page it will populate the
   display list with drawing commands (text, images, etc.). The
   display list can later be reused to render a page many times
   without having to re-interpret the page from the document file
   for each rendering. Once the device is no longer needed, free
   it with fz_drop_device.

   list: A display list that the list device takes ownership of.
*/
fz_device *fz_new_list_device(fz_context *ctx, fz_display_list *list);
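The record-then-replay pattern can be shown with a toy model. Nothing here is MuPDF code: display_list, fill_cmd, fill_fn and the list_* functions are hypothetical names, the list has a fixed capacity, and only a single kind of call (a filled rectangle) is recorded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CMDS 16

/* One recorded call: a filled rectangle (hypothetical, for illustration). */
typedef struct { float x, y, w, h; } fill_cmd;

typedef struct {
	fill_cmd cmds[MAX_CMDS];
	int len;
} display_list;

/* Callback playing the role of the device that receives the playback. */
typedef void (*fill_fn)(void *state, float x, float y, float w, float h);

void list_init(display_list *list)
{
	assert(list != NULL);
	list->len = 0;
}

/* Recording: append the call to the list. Returns 1 on success, 0 if full. */
int list_record_fill(display_list *list, float x, float y, float w, float h)
{
	fill_cmd c;
	if (list->len >= MAX_CMDS)
		return 0;
	c.x = x; c.y = y; c.w = w; c.h = h;
	list->cmds[list->len++] = c;
	return 1;
}

/* Playback: replay every recorded call, scaled and then offset by (tx, ty).
   The list itself is not modified, so it can be replayed many times. */
void list_play(const display_list *list, float scale, float tx, float ty,
	fill_fn fn, void *state)
{
	int i;
	for (i = 0; i < list->len; i++)
	{
		const fill_cmd *c = &list->cmds[i];
		fn(state, c->x * scale + tx, c->y * scale + ty, c->w * scale, c->h * scale);
	}
}

/* Example target: sums the area of everything it is asked to fill. */
void sum_area(void *state, float x, float y, float w, float h)
{
	(void)x;
	(void)y;
	*(float *)state += w * h;
}
```

Replaying the same list at scale 1 and then at scale 2 into sum_area shows the recorded content being reused under a different transform without being recorded again.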

9.5.4 PDF Output Device

The PDF Output device is still a work in progress, as its handling of fonts is incomplete. Nonetheless for certain classes of files it can be useful.

End users will probably prefer to use the document writer interface (see chapter 15 The Document Writer interface) which wraps this class up, rather than call it directly. Nonetheless this can be useful in specific circumstances when generating particular sections of a PDF file (such as appearance streams for annotations).

The PDF Output device takes the sequence of graphical operations it is called with, and forms it back into a sequence of PDF operations, together with a set of required resources. These can then be formed into a completely new PDF page (or a PDF annotation) which can then be inserted into a document.

/*
   pdf_page_write: Create a device that will record the
   graphical operations given to it into a sequence of
   pdf operations, together with a set of resources. This
   sequence/set pair can then be used as the basis for
   adding a page to the document (see pdf_add_page).

   doc: The document for which these are intended.

   mediabox: The bbox for the created page.

   presources: Pointer to a place to put the created
   resources dictionary.

   pcontents: Pointer to a place to put the created
   contents buffer.
*/
fz_device *pdf_page_write(fz_context *ctx, pdf_document *doc, const fz_rect *mediabox, pdf_obj **presources, fz_buffer **pcontents);

9.5.5 Structured Text Device

The Structured Text device is used to extract the text from a given graphical stream, together with the position it inhabits on the output page. It can also optionally include details of images and their positions within its output.

/*
   fz_new_stext_device: Create a device to extract the text on a page.

   Gather and sort the text on a page into spans of uniform style,
   arranged into lines and blocks by reading order. The reading order
   is determined by various heuristics, so may not be accurate.

   sheet: The text sheet to which styles should be added. This can
   either be a newly created (empty) text sheet, or one containing
   styles from a previous text device. The same sheet cannot be used
   in multiple threads simultaneously.

   page: The text page to which content should be added. This will
   usually be a newly created (empty) text page, but it can be one
   containing data already (for example when merging multiple pages, or
   watermarking).
*/
fz_device *fz_new_stext_device(fz_context *ctx, fz_stext_sheet *sheet, fz_stext_page *page);

This can be used as the basis for searching (including highlighting the text as matches are found), for exporting text files (or text and image based files such as HTML), or even to do more complex page analysis (such as spotting what regions of the page are text, what are graphics etc).

An (initially empty) fz_stext_sheet should be created using fz_new_stext_sheet, and an empty fz_stext_page created using fz_new_stext_page. These are used in the call to fz_new_stext_device. After the contents have been run to that device, the sheet will be populated with the common styles used by the page, and the page will be populated with details of the text extracted and its position.
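To show why having positions alongside the text is useful for searching and highlighting, here is a small self-contained sketch. The text_char type and find_text function are hypothetical and far simpler than the real structured text page (no spans, lines, blocks or styles); they only model "characters with boxes".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical extracted character: its code plus its box on the page. */
typedef struct { char c; float x0, y0, x1, y1; } text_char;

/* Find the first occurrence of needle in chars[0..n). On success, store
   the box covering the match in out (x0, y0, x1, y1) and return the
   index of its first character. Return -1 if there is no match. */
int find_text(const text_char *chars, int n, const char *needle, float out[4])
{
	int len = (int)strlen(needle);
	int i, j;
	assert(chars != NULL && out != NULL);
	if (len == 0)
		return -1;
	for (i = 0; i + len <= n; i++)
	{
		for (j = 0; j < len && chars[i + j].c == needle[j]; j++)
			;
		if (j < len)
			continue; /* mismatch at this start position */
		out[0] = chars[i].x0; out[1] = chars[i].y0;
		out[2] = chars[i].x1; out[3] = chars[i].y1;
		for (j = 1; j < len; j++)
		{
			const text_char *ch = &chars[i + j];
			if (ch->x0 < out[0]) out[0] = ch->x0;
			if (ch->y0 < out[1]) out[1] = ch->y0;
			if (ch->x1 > out[2]) out[2] = ch->x1;
			if (ch->y1 > out[3]) out[3] = ch->y1;
		}
		return i;
	}
	return -1;
}
```

The returned box is what a viewer would draw as the highlight for the match.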

9.5.6 SVG Output Device

End users will probably prefer to use the document writer interface (see chapter 15 The Document Writer interface) which wraps this class up, rather than call it directly.

/*
   fz_new_svg_device: Create a device that outputs (single page)
   SVG files to the given output stream.

   output: The output stream to send the constructed SVG page
   to.

   page_width, page_height: The page dimensions to use (in
   points).
*/
fz_device *fz_new_svg_device(fz_context *ctx, fz_output *out, float page_width, float page_height);

The device currently generates SVG 1.1 compliant files. SVG Fonts are NOT used due to poor client support. Instead glyphs are sent as reusable symbols. Shadings are sent as rasterised images. JPEGs will be passed through unchanged, and all other images will be converted to PNG.

9.5.7 Test Device

The Test device, as its name suggests, tests a given set of page contents for which features are used. Currently this is restricted to testing for whether the graphical objects used are greyscale or colour. Testing for additional features may be added in future.

/*
   fz_new_test_device: Create a device to test for features.

   Currently only tests for the presence of non-grayscale colors.

   is_color: Possible values returned:
      0: Definitely greyscale
      1: Probably color (all colors were grey, but there
      were images or shadings in a non grey colorspace).
      2: Definitely color

   threshold: The difference from grayscale that will be tolerated.
   Typical values to use are either 0 (be exact) or 0.02 (allow an
   imperceptible amount of slop).

   options: A set of bitfield options, from the FZ_TEST_OPT set.

   passthrough: A device to pass all calls through to, or NULL.
   If set, then the test device can both test and pass through to
   an underlying device (like, say, the display list device). This
   means that a display list can be created and at the end we’ll
   know if it’s color or not.

   In the absence of a passthrough device, the device will throw
   an exception to stop page interpretation when color is found.
*/
fz_device *fz_new_test_device(fz_context *ctx, int *is_color, float threshold, int options, fz_device *passthrough);

The expected purpose of the colour detecting functionality is to allow applications (e.g. printers) to easily detect if a given page requires the use of colour inks, or whether a greyscale rendering will suffice.

This device can either be used by itself, or in the form of a pass-through device.

Standalone use

In the simplest form, the device can be created standalone, by passing passthrough as NULL.

As each subsequent device call is made, the device will test the graphic object passed to it to see if it is within the given threshold of being a neutral colour. If it is, then the device continues. If not, then it sets the int pointed to by is_color to be non-zero.

For graphical objects such as paths or text, this is an easy evaluation that takes almost no time. For images or shadings, however, it is slightly trickier. An image may be defined in a colour space capable of non-neutral colours (perhaps RGB or CMYK) and yet the image itself may only use neutral colours within that space. Properly establishing whether colours are required takes much more CPU-intensive processing.

Accordingly, the device will, by default, just look at the colour space. The value of is_color returned at the end may be examined to establish the confidence level of the test. 0 means “definitely greyscale”, 1 means “probably colour” (i.e. “an image or shading was seen that potentially contains non neutral colours”), and 2 means “definitely colour”.

If the caller wishes to spend the CPU cycles to get a definite answer, options can be set to FZ_TEST_OPT_IMAGES | FZ_TEST_OPT_SHADINGS and images and shadings will be exhaustively checked.

As an optimisation, given how much faster it is to check objects other than images and shadings, it can be worth running the device once without the options set, and then only running it again with them set if required.

If the device is run with passthrough as NULL, then as soon as it encounters a “definite” non-neutral colour, it will throw an FZ_ABORT error. This can save a considerable amount of time, as it avoids the interpreter needing to run through an entire page when observation of one of the very first graphical operations is enough to know that colour is being used.
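The three-level result and the early exit can be modelled in a short, self-contained sketch. The names below are hypothetical, and the neutrality metric (largest difference between any two RGB channels compared against the threshold) is an assumption made for illustration; the text does not say which metric MuPDF uses.

```c
#include <assert.h>
#include <stddef.h>

/* Confidence levels as described for is_color. */
enum { GREY_DEFINITE = 0, COLOUR_PROBABLE = 1, COLOUR_DEFINITE = 2 };

typedef struct { float r, g, b; } rgb;

/* Assumed metric: a colour is neutral when the largest difference
   between any two of its channels is within the threshold. */
int rgb_is_neutral(rgb c, float threshold)
{
	float hi = c.r, lo = c.r;
	if (c.g > hi) hi = c.g;
	if (c.b > hi) hi = c.b;
	if (c.g < lo) lo = c.g;
	if (c.b < lo) lo = c.b;
	return hi - lo <= threshold;
}

/* Scan the fills in order and stop at the first definite colour,
   modelling the early exit. *examined receives the number of fills
   looked at. saw_colour_image models an image or shading in a
   non-grey colour space that was not exhaustively checked. */
int classify_page(const rgb *fills, int n, int saw_colour_image,
	float threshold, int *examined)
{
	int i;
	assert(examined != NULL);
	for (i = 0; i < n; i++)
	{
		if (!rgb_is_neutral(fills[i], threshold))
		{
			*examined = i + 1;
			return COLOUR_DEFINITE;
		}
	}
	*examined = n;
	return saw_colour_image ? COLOUR_PROBABLE : GREY_DEFINITE;
}
```

A caller following the two-pass strategy would only pay for the exhaustive image and shading check when the first, cheap pass comes back as COLOUR_PROBABLE.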

Passthrough use

As discussed above, the envisaged use case for this device is to detect whether page contents require colour or not to allow printers to decide whether to rasterise for colour inks or a faster/cheaper greyscale pass.

Such printers will normally be operating in banded mode, which requires (or at least greatly benefits from) the use of a display list. By using the device in passthrough mode, the testing can be performed at the same time as the list is built.

Simply create the display list device as you would normally, and pass it into fz_new_test_device as passthrough. Then run the page contents through the returned test device. The test device will pass each call through to the underlying list device and so the display list will be built as normal.

When run in this mode, the device can no longer use the ‘early-exit’ optimisation of throwing an FZ_ABORT error.
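The passthrough arrangement is a simple wrapper pattern, sketched here in self-contained form. The device struct, recorder_fill and tester_fill are hypothetical and reduce a device to a single call; they are not MuPDF's fz_device interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal device: one call plus private state. */
typedef struct device device;
struct device {
	void (*fill)(device *dev, float r, float g, float b);
	void *state;
};

/* Stand-in for the list device: it just counts the calls it receives. */
typedef struct { int calls; } recorder_state;

void recorder_fill(device *dev, float r, float g, float b)
{
	recorder_state *s = dev->state;
	(void)r;
	(void)g;
	(void)b;
	s->calls++;
}

/* Stand-in for the test device wrapping another device. */
typedef struct { int is_colour; device *passthrough; } tester_state;

void tester_fill(device *dev, float r, float g, float b)
{
	tester_state *s = dev->state;
	assert(s != NULL);
	if (r != g || g != b)
		s->is_colour = 2; /* definite colour, exact comparison */
	/* Forward every call, even after colour has been found, so that
	   the underlying device still sees the complete page. This is
	   why no early exit is possible in this mode. */
	if (s->passthrough != NULL)
		s->passthrough->fill(s->passthrough, r, g, b);
}
```

After the page has been run, the wrapped device holds everything it would have received on its own, and the wrapper holds the colour verdict.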

9.5.8 Trace Device

The Trace device is a simple debugging device that allows an XML-like representation of the device calls made to be output.
