MuPDF comes with a selection of devices built in, though this should not be taken as a definitive list. It is expected that other devices will be written to extend MuPDF - indeed some embeddings of MuPDF already include their own devices.
The BBox device is a simple device that calculates the bounding box (bbox) of all the marking operations on a page.
The fz_rect passed to fz_new_bbox_device must remain valid for the whole life of the device, because the device writes the bounding box of the contents into it when the device is closed.
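The accumulation performed by such a device can be pictured as a running union of rectangles. The following is a minimal sketch in plain C, not MuPDF code: `rect`, `EMPTY_RECT`, `rect_is_empty` and `rect_union` are hypothetical stand-ins invented for illustration, and the "empty" convention used is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a rectangle type; this is NOT fz_rect. */
typedef struct { float x0, y0, x1, y1; } rect;

/* Assumption of this sketch: a rectangle with x0 > x1 means "nothing seen yet". */
static const rect EMPTY_RECT = { 1, 1, -1, -1 };

static bool rect_is_empty(rect r)
{
    return r.x0 > r.x1 || r.y0 > r.y1;
}

/* Grow *acc so that it also covers r. The caller owns *acc, so it must
 * outlive every call that updates it. */
static void rect_union(rect *acc, rect r)
{
    if (rect_is_empty(r))
        return;
    if (rect_is_empty(*acc)) {
        *acc = r;
        return;
    }
    if (r.x0 < acc->x0) acc->x0 = r.x0;
    if (r.y0 < acc->y0) acc->y0 = r.y0;
    if (r.x1 > acc->x1) acc->x1 = r.x1;
    if (r.y1 > acc->y1) acc->y1 = r.y1;
}
```

Each marking operation would contribute its own bounds through `rect_union`. The result is delivered through a pointer owned by the caller, which is why the caller's rectangle has to stay alive.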
The Draw device is the core renderer for MuPDF. Every draw device instance is constructed with a destination Pixmap (see section 10.3 Pixmaps for more details), and each graphical object passed to the device is rendered into that pixmap.
Most of the time we render complete pixmaps, but a mechanism exists to allow us to render only a given bbox within a pixmap.
This can be useful for updating particular areas of a page (for instance when an annotation has been edited or moved) without redrawing the whole thing.
During the course of rendering, the draw device may create new temporary internal pixmaps to cope with transparency and grouping. This is invisible to the caller, and can safely be considered an implementation detail, but should be considered when estimating the memory use for a given rendering operation. The exact number and size of internal pixmaps required depends on the exact complexity and makeup of the graphical objects being displayed.
To limit memory use, a typical strategy is to render pages in bands; rather than creating a single pixmap the size of the page and rendering that, create pixmaps for ‘slices’ across the page, and render them one at a time. The memory savings are not just seen in the cost of the basic pixmap, but also serve to limit the sizes of the internal pixmaps used during rendering.
The cost for this is that the page contents do need to be run through repeatedly. This can be achieved by reinterpreting directly from the file, but that can be expensive. The next device provides a route to help with this.
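The arithmetic behind the banding strategy described above can be sketched as follows. This is plain C for illustration only; `band`, `band_count`, `band_at` and `band_bytes` are hypothetical helpers, not MuPDF functions, and the memory estimate assumes 8 bits per component with no row padding.

```c
#include <stddef.h>

/* Hypothetical description of one horizontal band: rows y0 (inclusive) to y1 (exclusive). */
typedef struct { int y0, y1; } band;

/* Number of bands needed to cover page_height rows. */
static int band_count(int page_height, int band_height)
{
    if (page_height <= 0 || band_height <= 0)
        return 0;
    return (page_height + band_height - 1) / band_height;
}

/* The i-th band; the last band is clipped to the bottom of the page. */
static band band_at(int page_height, int band_height, int i)
{
    band b;
    b.y0 = i * band_height;
    b.y1 = b.y0 + band_height;
    if (b.y1 > page_height)
        b.y1 = page_height;
    return b;
}

/* Bytes for one band's base pixmap with n components per pixel
 * (assumption: 8 bits per component, no padding). */
static size_t band_bytes(int width, int band_height, int n)
{
    return (size_t)width * (size_t)band_height * (size_t)n;
}
```

A caller would loop from band 0 to `band_count(...) - 1`, run the page contents once per band, and reuse a single band-sized pixmap. That repeated run is the cost noted above.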
Most formats define pages in terms of some fairly simple ‘well known’ colorspaces, like RGB and CMYK. Some formats (notably PDF) are much more powerful, and allow pages to be constructed with a range of non-standard ‘spot’ inks.
When combined with advanced features such as overprinting, care needs to be taken to ensure that the rendering is exactly as expected.
For example, if a PDF page is constructed to use overprinting, it only strictly makes sense to render it to a CMYK (or a CMYK + Spots) pixmap. With (say) an RGB pixmap, CMYK colors would be mapped down to RGB as they are plotted, losing the information required to correctly overprint later graphical objects.
Nonetheless, while we might want to get a ‘true’ rendition of the page, we might require it ultimately to appear as an RGB pixmap. As such what we really want is to get a ‘simulation’ of how the overprint would work.
One way to work would be to call the draw device and request a CMYK + Spots rendering, and then to require the caller to convert this to their desired target colorspace manually. This is not in keeping with the general desire in MuPDF to encapsulate functionality in a friendly way.
Therefore, the draw device examines the ‘separations’ field of the pixmap that it is called with to decide how to render.
If there is no separations value supplied (i.e. it is NULL), then the draw device assumes that no form of overprint (or overprint simulation) is required.
If there is a separations value, and there is at least one separation that is not entirely disabled, then the draw device will draw internally to a CMYK + Spots pixmap (where the spots are the non-disabled separations from the separations value). This rendering can safely proceed with overprint processing enabled.
At the end of the render, the draw device will convert down from the CMYK + Spots pixmap to the colorspace of the initial pixmap. Any spot colorants present in the initial pixmap will be populated from the rendered one; any spots that aren’t will be converted down to process colors.
Thus by creating the initial pixmap passed into the draw device using a separations object with the colorants correctly set to be composite/spots/disabled as required, overprint or overprint simulation can be controlled as required.
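The decision described above can be summarised in a few lines. This is a sketch under stated assumptions, not the draw device's actual code: `separations`, `sep_behaviour`, `active_separations` and `choose_render_mode` are hypothetical names. Treating a separations object in which every entry is disabled the same way as NULL is an assumption, since the text only covers the NULL case and the "at least one enabled" case.

```c
#include <stddef.h>

/* Hypothetical model of how each separation is to be handled. */
typedef enum { SEP_COMPOSITE, SEP_SPOT, SEP_DISABLED } sep_behaviour;

#define MAX_SEPS 8

/* Hypothetical stand-in for a separations object; NOT fz_separations. */
typedef struct {
    int count;
    sep_behaviour behaviour[MAX_SEPS];
} separations;

typedef enum { RENDER_DIRECT, RENDER_VIA_CMYK_SPOTS } render_mode;

/* Count the separations that are not entirely disabled. These become
 * the spots of the internal CMYK + Spots pixmap. */
static int active_separations(const separations *seps)
{
    int i, n = 0;
    if (seps == NULL)
        return 0;
    for (i = 0; i < seps->count && i < MAX_SEPS; i++)
        if (seps->behaviour[i] != SEP_DISABLED)
            n++;
    return n;
}

/* NULL separations: no overprint handling. At least one active
 * separation: render internally to CMYK + Spots, then convert down.
 * Assumption: an all-disabled object behaves like NULL. */
static render_mode choose_render_mode(const separations *seps)
{
    return active_separations(seps) > 0 ? RENDER_VIA_CMYK_SPOTS : RENDER_DIRECT;
}
```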
The Display list device simply records all the calls made to it in a list. This list can then be played back later, potentially multiple times and with different transforms, to other devices.
For more details of the uses of Display Lists, see chapter 11 Display Lists.
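The record-then-replay idea can be illustrated with a toy list that stores a single kind of operation. This is a conceptual sketch only: `op_list`, `fill_rect_op`, `list_record`, `list_playback` and `sum_area` are hypothetical, and a uniform scale factor stands in for a general transform.

```c
#include <stddef.h>

/* Hypothetical recorded operation: fill an axis-aligned rectangle. */
typedef struct { float x, y, w, h; } fill_rect_op;

/* Callback that plays the role of a target device. */
typedef void (*fill_rect_fn)(void *user, fill_rect_op op);

#define MAX_OPS 16

typedef struct {
    fill_rect_op ops[MAX_OPS];
    size_t len;
} op_list;

/* Record one call. Returns 1 on success, 0 if the list is full. */
static int list_record(op_list *list, fill_rect_op op)
{
    if (list->len == MAX_OPS)
        return 0;
    list->ops[list->len++] = op;
    return 1;
}

/* Replay every recorded call to fn, applying a uniform scale first.
 * The list is not modified, so it can be replayed any number of times. */
static void list_playback(const op_list *list, float scale, fill_rect_fn fn, void *user)
{
    size_t i;
    for (i = 0; i < list->len; i++) {
        fill_rect_op op = list->ops[i];
        op.x *= scale;
        op.y *= scale;
        op.w *= scale;
        op.h *= scale;
        fn(user, op);
    }
}

/* Example target: accumulate the total filled area into a float. */
static void sum_area(void *user, fill_rect_op op)
{
    *(float *)user += op.w * op.h;
}
```

The same list can be played back once per band, or at several zoom levels, without going back to the file.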
The PDF Output device is still a work in progress, as its handling of fonts is incomplete. Nonetheless for certain classes of files it can be useful.
End users will probably prefer to use the document writer interface (see chapter 15 The Document Writer interface) which wraps this class up, rather than call it directly. Nonetheless this can be useful in specific circumstances when generating particular sections of a PDF file (such as appearance streams for annotations).
The PDF Output device takes the sequence of graphical operations it is called with, and forms it back into a sequence of PDF operations, together with a set of required resources. These can then be formed into a completely new PDF page (or a PDF annotation) which can then be inserted into a document.
The Structured Text device is used to extract the text from a given graphical stream, together with the position it inhabits on the output page. It can also optionally include details of images and their positions within its output.
This can be used as the basis for searching (including highlighting the text as matches are found), for exporting text files (or text and image based files such as HTML), or even to do more complex page analysis (such as spotting what regions of the page are text, what are graphics etc).
An (initially empty) fz_stext_sheet should be created using fz_new_stext_sheet, and an empty fz_stext_page created using fz_new_stext_page. These are used in the call to fz_new_stext_device. After the contents have been run to that device, the sheet will be populated with the common styles used by the page, and the page will be populated with details of the text extracted and its position.
The SVG output device is used to generate SVG pages from arbitrary input.
End users will probably prefer to use the document writer interface (see chapter 15 The Document Writer interface) which wraps this class up, rather than call it directly.
The device currently generates SVG 1.1 compliant files. SVG Fonts are NOT used due to poor client support. Instead glyphs are sent as reusable symbols. Shadings are sent as rasterised images. JPEGs will be passed through unchanged, and all other images will be converted to PNG.
The Test device, as its name suggests, tests a given set of page contents for which features are used. Currently this is restricted to testing for whether the graphical objects used are greyscale or colour. Testing for additional features may be added in future.
The expected purpose of the colour detecting functionality is to allow applications (e.g. printers) to easily detect if a given page requires the use of colour inks, or whether a greyscale rendering will suffice.
This device can either be used by itself, or in the form of a pass-through device.
In the simplest form, the device can be created standalone, by passing passthrough as NULL.
As each subsequent device call is made, the device will test the graphic object passed to it to see if it is within the given threshold of being a neutral colour. If it is, then the device continues. If not, then it sets the int pointed to by is_color to be non zero.
For graphical objects such as paths or text, this is an easy evaluation that takes almost no time. For images or shadings, however, it is slightly trickier. An image may be defined in a colour space capable of non-neutral colours (perhaps RGB or CMYK) and yet the image itself may only use neutral colours within that space. Properly establishing whether colour is required takes much more CPU-intensive processing.
Accordingly, the device will, by default, just look at the colour space. The value of is_color returned at the end may be examined to establish the confidence level of the test. 0 means “definitely greyscale”, 1 means “probably colour” (i.e. “an image or shading was seen that potentially contains non neutral colours”), and 2 means “definitely colour”.
If the caller wishes to spend the CPU cycles to get a definite answer, options can be set to FZ_TEST_OPT_IMAGES | FZ_TEST_OPT_SHADINGS, and images and shadings will then be exhaustively checked.
As an optimisation, given how much faster it is to check objects other than images and shadings, it can be worth running the device once without the options set, and then only running it again with them set if required.
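The three-valued result and the two-pass strategy can be modelled as below. This is a sketch and not the test device's implementation: every function name is hypothetical, and measuring neutrality as the spread between the largest and smallest RGB component is an assumption made for illustration.

```c
/* Assumption of this sketch: an RGB colour counts as neutral when its
 * largest and smallest components differ by no more than threshold. */
static int is_neutral_rgb(float r, float g, float b, float threshold)
{
    float max = r, min = r;
    if (g > max) max = g;
    if (b > max) max = b;
    if (g < min) min = g;
    if (b < min) min = b;
    return (max - min) <= threshold;
}

/* A flat colour (path or text) gives a definite answer: 2 = definitely colour. */
static void note_flat_colour(int *is_color, float r, float g, float b, float threshold)
{
    if (!is_neutral_rgb(r, g, b, threshold))
        *is_color = 2;
}

/* Cheap pass: an image or shading is judged by its colour space only,
 * so it can raise the result to 1 ("probably colour") but no higher. */
static void note_image_by_colourspace(int *is_color, int space_can_hold_colour)
{
    if (space_can_hold_colour && *is_color < 1)
        *is_color = 1;
}

/* Only a "probably" result justifies paying for an exhaustive second pass. */
static int needs_exhaustive_pass(int is_color)
{
    return is_color == 1;
}
```

With this model, a result of 0 or 2 after the cheap pass is final, and only a result of 1 calls for a second run with exhaustive checking of images and shadings.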
If the device is run with passthrough as NULL, then as soon as it encounters a “definite” non-neutral colour, it will throw an FZ_ERROR_ABORT error. This can save a considerable amount of time, as it avoids the interpreter needing to run through an entire page when observation of one of the very first graphical operations is enough to know that colour is being used.
As discussed above, the envisaged use case for this device is to detect whether page contents require colour or not to allow printers to decide whether to rasterise for colour inks or a faster/cheaper greyscale pass.
Such printers will normally be operating in banded mode, which requires (or at least greatly benefits from) the use of a display list. By using the device in passthrough mode, the testing can be performed at the same time as the list is built.
Simply create the display list device as you would normally, and pass it into fz_new_test_device as passthrough. Then run the page contents through the returned test device. The test device will pass each call through to the underlying list device, and so the display list will be built as normal.
When run in this mode, the device can no longer use the ‘early-exit’ optimisation of throwing an FZ_ERROR_ABORT error.
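The difference between the standalone and passthrough modes can be modelled as a thin wrapper that forwards each call. This is a hypothetical sketch: `sink`, `tester` and `tester_fill` are invented names, a simple call counter stands in for the list device, and an early-exit return value stands in for the thrown error.

```c
#include <stddef.h>

/* Stand-in for the wrapped device (for example a list device):
 * it only counts the calls forwarded to it. */
typedef struct { int calls; } sink;

/* Hypothetical model of the test device state. */
typedef struct {
    sink *passthrough; /* NULL means standalone mode */
    int is_color;      /* 0 = greyscale so far, 2 = definitely colour */
} tester;

/* Handle one fill call. Returns 1 if interpretation may stop early,
 * 0 if it must continue. */
static int tester_fill(tester *t, int colour_is_neutral)
{
    if (!colour_is_neutral)
        t->is_color = 2;

    if (t->passthrough != NULL) {
        /* Passthrough mode: every call must reach the wrapped device,
         * otherwise the list would be incomplete, so never stop early. */
        t->passthrough->calls++;
        return 0;
    }

    /* Standalone mode: once colour is certain there is nothing more to learn. */
    return t->is_color == 2;
}
```

In passthrough mode the result is still recorded in `is_color`; only the early exit is given up, because the wrapped device needs to see the whole page.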
The Trace device is a simple debugging device that allows an XML-like representation of the device calls made to be output.
This is a useful tool to visualise the contents of display lists.