17.1 Overview

Fonts are represented in MuPDF by the abstract fz_font type. This reference-counted structure encapsulates the basic information about a font, specifically:

Glyph list
Each font consists of a list of glyphs.
Glyph data
How to draw each glyph. In traditional fonts this information is known as the ‘Outline data’ (or ‘Outlines’), but some font types (such as Type 3 fonts from PDF) can also encapsulate other data, such as images and colors.
Unicode map
Most (but not all) fonts contain information that enables glyphs to be mapped to/from the Unicode code points they represent. Without such information, it can be impossible to meaningfully extract text information from a document (such as for cut and paste).
Font BBox
All fonts include information for a bounding box that covers all the glyphs within the font. Sadly, this is frequently inaccurate or incorrect, so it should be treated with distrust.
Glyph advances
All fonts contain simple glyph advance information: how far to move the text cursor after having drawn a given glyph. This information ensures that successive characters are properly spaced with respect to each other.
Kerning data
Most fonts contain simple kerning data; this allows the glyph advance between any two glyphs to be adjusted based upon the particular pair of glyphs involved. The classic example of kerning is that the spacing between A and the left-hand edge of its following letter is typically different between AV and AN.
Shape data
Some fonts allow for the automatic ‘shaping’ of glyph sequences. The trivial example of this in western fonts is that the letters ‘f’ and ‘i’ can be combined into a single ligature glyph ‘fi’. For many non-Latin scripts (especially Indic and Southeast Asian scripts), this procedure happens to a far greater extent. This can be as simple as the incorporation of diacritical marks, or as complex as the complete rearrangement or replacement of glyph sequences to give different appearances on the final rendered page. This process is known as ‘font shaping’. The data required to perform it is font-specific, and is optionally encapsulated within the fonts themselves.
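
To make the roles of glyph advances and kerning data concrete, the following is a minimal sketch of how a run of text might be measured. It is not MuPDF code: toy_font, toy_kern_pair, toy_demo_font and the functions below are hypothetical names invented for this illustration, the metric values are made up, and units are assumed to be thousandths of an em.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types for illustration only; this is not fz_font. */
typedef struct {
    char left;      /* first glyph of the pair */
    char right;     /* second glyph of the pair */
    int adjust;     /* adjustment to the advance, in 1/1000 em */
} toy_kern_pair;

typedef struct {
    int advance[128];            /* glyph advance per ASCII code, in 1/1000 em */
    const toy_kern_pair *kerns;  /* kerning pairs, may be NULL if kern_count is 0 */
    size_t kern_count;
} toy_font;

/* Look up the kerning adjustment for a pair of glyphs (0 if none is listed). */
int toy_kern(const toy_font *font, char left, char right)
{
    size_t i;
    for (i = 0; i < font->kern_count; i++)
        if (font->kerns[i].left == left && font->kerns[i].right == right)
            return font->kerns[i].adjust;
    return 0;
}

/* Total cursor movement for a string: the sum of the glyph advances,
 * plus the kerning adjustment between each glyph and its successor. */
int toy_text_width(const toy_font *font, const char *text)
{
    int pen = 0;
    size_t i;
    assert(font != NULL && text != NULL);
    for (i = 0; text[i] != '\0'; i++) {
        unsigned char c = (unsigned char)text[i];
        if (c >= 128)
            continue; /* outside the toy glyph list: contributes nothing */
        pen += font->advance[c];
        if (text[i + 1] != '\0')
            pen += toy_kern(font, text[i], text[i + 1]);
    }
    return pen;
}

/* A tiny demo font with invented metrics: 'A' and 'V' kern together. */
toy_font toy_demo_font(void)
{
    static const toy_kern_pair kerns[] = { { 'A', 'V', -80 } };
    toy_font font = { { 0 }, kerns, 1 };
    font.advance['A'] = 700;
    font.advance['V'] = 700;
    font.advance['N'] = 720;
    return font;
}
```

With these invented metrics, measuring "AV" applies the kerning adjustment between the two glyphs, while "AN" uses the plain advances only. A real shaper does considerably more than this, including glyph substitution and reordering.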

The fz_font does not include information about the size at which a font is used on the page, nor the color used to render it. It is therefore typical to see fz_fonts passed around the system together with a size (and/or transformation matrix), a colorspace, and a color definition.
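
As a rough illustration of this separation, the sketch below keeps the font itself free of size and color, and pairs it with those values only at the point of use. All of the names here (demo_font, demo_text_style, demo_scaled_advance) are hypothetical and are not part of the MuPDF API; the sketch also assumes a monospaced font whose advance is expressed in em units, so that multiplying by the size gives the advance on the page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a font: it knows nothing about size or color. */
typedef struct {
    float em_advance;   /* advance of every glyph, in em units (monospaced) */
} demo_font;

/* The per-use information that travels alongside the font. */
typedef struct {
    const demo_font *font;  /* shared, size-independent font */
    float size;             /* size in points; a full matrix could be used instead */
    int n_components;       /* stands in for a colorspace: 1 = gray, 3 = RGB */
    float color[4];         /* color components in that colorspace */
} demo_text_style;

/* Advance on the page: the size-independent advance scaled by the size. */
float demo_scaled_advance(const demo_text_style *style)
{
    assert(style != NULL && style->font != NULL);
    return style->font->em_advance * style->size;
}
```

Because the size lives outside the font, the same demo_font can be shared by two styles, for example one at 10 points in gray and one at 24 points in red, without duplicating any glyph data.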

MuPDF uses FreeType to handle most of its font rendering. Type 3 PDF fonts are the exception: MuPDF renders these itself. Font shaping is done using the HarfBuzz library.