[gs-devel] Using OpenType (CFF-flavour) fonts

Graham Douglas graham.douglas at readytext.co.uk
Mon Dec 13 18:59:27 UTC 2010


Hi Ken

Just got home from work and read your reply. Many, many
thanks for taking time to reply in so much detail, it is
genuinely much appreciated. I will read through to digest
it all.

Very best wishes

Graham


------------------------------------------------
 >
 > Hi Graham,
 >
 > apologies if some of the following is too simplistic, I'm just trying to
 > cover everything and I'm not certain where to pitch it. Please just skip
 > over anything which is too simple.
 >
 >
 > At 15:20 12/12/2010 +0000, Graham Douglas wrote:
 >
 >> I would like to use some OpenType (CFF-flavour) fonts at my disposal
 >> but need to understand a little more about the process, especially
 >> accessing some of the glyphs which do not have designated
 >> Unicode code points --- such as small caps and oldstyle numbers, I
 >> guess because they are just design variations as opposed to being
 >> "real characters" of a language.
 >
 > It's not clear to me what you mean by 'use' the fonts. The easiest way to
 > use them is to have the application embed the TrueType fonts as type 42
 > fonts in the PostScript program. But then I'm not sure how you are using
 > GS, possibly not by sending PostScript?
 >
 > If you want to use them as replacements for some other font requested
 > in a PostScript program then there is the possibility that this will
 > not work as expected, as this is a Ghostscript extension, not present
 > in the PostScript specification, so there are no rules on how it
 > should work.
 >
 > Unicode is not really relevant to PostScript, or in many ways to
 > TrueType fonts. TrueType glyphs are accessed by Glyph ID (GID). The font
 > *may* or may not contain a Unicode CMAP table which maps Unicode code
 > points to glyph IDs, but since PostScript doesn't use Unicode this isn't
 > terribly relevant.
 >
 > PostScript uses a 'character code' which for simple fonts is a single
 > byte index into an Encoding array. The array contains a glyph name for
 > each entry. The glyph programs are contained in a PostScript dictionary,
 > and referenced by name. So the character code is looked up in the
 > Encoding to get a name, then the name is looked up in the CharStrings
 > dictionary to get a program, which is then executed.
 >
 > As you can see there is no Unicode in the PostScript at all. Note that
 > Encoding arrays have 256 entries, so they are indexed by a single byte.
 >
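The two-step lookup described above (character code to glyph name via Encoding, then glyph name to glyph program via CharStrings) can be inspected from PostScript itself. This is only a minimal sketch: /Helvetica is an assumed font name, and any simple (non-composite) font that findfont can locate should behave similarly.

```postscript
% Sketch only: /Helvetica is an assumed font name.
/Helvetica findfont            % stack: fontdict
dup /Encoding get              % stack: fontdict encodingarray
65 get                         % stack: fontdict glyphname (entry for code 65)
dup ==                         % print the glyph name found for code 65
exch /CharStrings get          % stack: glyphname charstringsdict
exch known                     % stack: bool, true if a glyph program exists
==                             % print the result
```

With a standard Latin Encoding the name printed for code 65 would normally be /A, but that depends entirely on the font's Encoding array, not on Unicode or ASCII.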
 > PostScript also includes type 0 fonts (Original Composite Fonts) and
 > CIDFonts, both of which are potentially referenced using multiple
 > bytes for the character code. It is possible to apply a Unicode CMap
 > (that's not the TrueType CMAP table, but a PostScript construct which
 > is quite different) to a CIDFont which will allow you to access the
 > glyphs using Unicode values.
 >
 > However, that's rather coincidental, just as it's possible to
 > construct a regular Encoding which uses ASCII values to map to the
 > named glyphs which represent each of the ASCII code points. The
 > PostScript interpreter neither knows nor cares.
 >
 >
 > In general TrueType fonts in PostScript are handled by conversion to
 > type 42 fonts, or CIDFonts with type 42 outlines. For OpenType fonts
 > with CFF outlines it's also possible to convert into CFF fonts, or
 > CIDFonts with CFF outlines. This is normally done by the creating
 > application and the font embedded into the PostScript stream.
 >
 > Now in addition to all that, Ghostscript can use TrueType fonts
 > directly (as mentioned this is a non-standard extension, PostScript
 > does not handle TrueType fonts). This can be done in two ways, either
 > as a regular font, or as a CIDFont, depending on whether you load the
 > font in Fontmap.GS or cidfmap.
 >
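For the "regular font" route, a Fontmap.GS entry pairs a font name with a file path. The fragment below is a hypothetical entry, not taken from the original message: the name /TimesNewRoman and the Windows path are assumptions reused from the cidfmap example later in this thread.

```postscript
% Hypothetical Fontmap.GS entry: font name, file path, terminating ";"
/TimesNewRoman (c:/windows/fonts/times.ttf) ;

% In a PostScript job the font would then be requested by that name:
% /TimesNewRoman findfont 24 scalefont setfont
% 72 700 moveto (Hello) show
```

Loaded this way the font is limited to the single-byte Encoding that GS builds for it, so at most 256 glyphs are reachable through show at any one time.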
 > When loaded as a regular font GS will attempt to use the CMAP and POST
 > subtables in the TrueType font to build a reasonable Encoding for the
 > font. If I remember correctly we prefer the 3,1 CMAP subtable for this.
 > I'm not certain what happens if that subtable is not present. The
 > Encoding is used to map the character code to a name, then on to a GID
 > which is used to access the glyph.
 >
 > When loaded as a CIDFont things are of course somewhat more complicated.
 > You don't normally use a CIDFont 'as-is'. The only operation which works
 > with a naked CIDFont is glyphshow, which in this case takes a CID rather
 > than a name. Normally the CIDFont is composed with a CMap in order to
 > produce a CID-keyed instance of the font. This maps a character code to
 > a CID and things proceed more or less as for a regular font. The
 > difference is that the character code can consist of more than one byte
 > (indeed can be of variable length).
 >
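The two ways of using a CIDFont mentioned above can be sketched as follows. This assumes a CIDFont resource named /TimesNewRoman is available (for instance through a cidfmap entry like the one given later in this message). The key /MyTimes-H and the CID value 36 are made up for illustration.

```postscript
% Assumes a CIDFont resource named /TimesNewRoman exists; the name,
% the key /MyTimes-H and the CID 36 are illustrative only.

% 1. A naked CIDFont: only glyphshow works, and it takes an integer CID.
/TimesNewRoman /CIDFont findresource 24 scalefont setfont
72 700 moveto
36 glyphshow

% 2. Compose the CIDFont with a CMap to get a CID-keyed font that the
%    ordinary show operator accepts.
/MyTimes-H /Identity-H [ /TimesNewRoman ] composefont
24 scalefont setfont
72 660 moveto
<0024> show                    % one two-byte code, 0x0024 = CID 36
showpage
```

Which glyph CID 36 selects depends on the font and on how the CID is mapped to a GID, so the value should be treated as a placeholder.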
 > But we can't access a TrueType font using a CID, so we need to map the
 > character code to a CID, then map the CID to a GID so that we can access
 > the glyphs in the font. This is complicated, and obviously somewhat
 > prone to error, since we are trying to replace a PostScript font with a
 > non-PostScript font. There's quite a bit of internal jiggery-pokery
 > involved with the various TrueType tables and the CIDSystemInfo
 > dictionary to try and build the two-step mapping we need for this.
 >
 > However, the simplest way to use it is to declare the CIDFont as
 > having an Identity CIDSystemInfo and then compose that CIDFont with an
 > Identity CMap. Then you can use the GID as the character code. This
 > effectively works out as a one-to-one mapping so that the character
 > code ends up as the GID used to access the glyph from the font. Since
 > you can get the GID, this may be the easiest way to work.
 >
 > For example, in cidfmap:
 >
 > /TimesNewRoman << /CSI [(Identity) 0] /Path (c:/windows/fonts/times.ttf)
 > /SubfontID 0 /FileType /TrueType >> ;
 >
 > Note that we've used the 'Identity' Ordering in the CIDSystemInfo (CSI).
 >
 > Then to use the font:
 > /TimesNewRoman-Identity-H findfont
 >
 > Note that this uses the CIDFont 'TimesNewRoman' and the CMap
 > 'Identity-H'. If GS can't find a named font then one of the things it
 > will do is try and identify a CIDFont-CMap combination. If it thinks
 > there is a possible combination then it will see if there are
 > appropriately named CIDFont and CMap resources available. If so, then
 > it will automatically compose the two together to produce the CID-keyed
 > instance for you.
 >
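Putting the cidfmap entry and the findfont call above together, showing glyphs by GID might look like the sketch below. It assumes the /TimesNewRoman cidfmap entry is installed. The GID values are placeholders, since the real GIDs for small caps or oldstyle figures have to be read from the particular font.

```postscript
% Assumes the cidfmap entry for /TimesNewRoman shown above is installed.
/TimesNewRoman-Identity-H findfont 24 scalefont setfont
72 700 moveto
% Each glyph is addressed by a two-byte, big-endian code equal to its GID.
% GIDs 36, 37 and 38 (hex 0024, 0025, 0026) are placeholders only.
<002400250026> show
showpage
```

Hex strings are used because most GIDs do not correspond to printable single-byte characters.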
 >
 >> Where I'm kinda lost is making the "bridge" between all that
 >> font info and the PostScript machinery/code I need to write in order to
 >> access all the glyphs in the font, especially the small caps and
 >> oldstyle numbers. I know you can use glyphshow to draw any glyph in the
 >> CharStrings dictionary, encoded or not, but that seems a complicated
 >> way to go about things, maybe --- perhaps not?
 >
 > I'm not absolutely certain (I'd have to go and read the code), but I
 > don't think this will work with a TrueType font loaded as a PostScript
 > font. The way that the font is loaded I'm not certain that glyphs which
 > are not present in the Encoding have any way to map the name to a GID.
 > Since you can't access TrueType glyphs by name, that would mean you
 > can't use un-encoded glyphs.
 >
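Since it is uncertain whether un-encoded glyph names survive loading, one way to find out for a given font is to test the CharStrings dictionary before calling glyphshow. This is a rough sketch under assumptions: /TimesNewRoman is taken to be a font loaded through Fontmap.GS, and /a.sc is a hypothetical small-cap glyph name that may not exist in any particular font.

```postscript
% Assumes /TimesNewRoman resolves to a regular (non-CID) font;
% /a.sc is a hypothetical glyph name used only for illustration.
/TimesNewRoman findfont 24 scalefont setfont
72 700 moveto
currentfont /CharStrings get /a.sc known
{ /a.sc glyphshow }
{ (glyph name not present in CharStrings) = }
ifelse
showpage
```

If the name is absent, the Identity-H route with explicit GIDs is the fallback suggested elsewhere in this message.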
 >
 >> So, in summary, what else do I need to read or
 >> understand to fully utilise OpenType CFF-flavour
 >> fonts with GS in order to access all the various
 >> glyphs in the font.
 >
 > Ideally you should read the font yourself, convert into a type 42, or
 > CIDFont with type 42 outlines, and embed the font in the PostScript
 > stream. This is probably the only 100% reliable method.
 >
 >
 >> --- Is it a matter of reencoding the font?
 >> --- Or, is the CIDFont machinery something that
 >> I need to understand?
 >
 > It may be that loading the font with an Identity-H CMap (and
 > declaring it with Identity Ordering in CIDSystemInfo in cidfmap) would
 > be the easiest way for you to proceed, though this does require that
 > you know the GID for the glyphs you want to use.
 >
 >
 >> Is CIDFont machinery relevant to this situation
 >> --- I have the Adobe specs but not yet fully
 >> read or absorbed them.
 >
 > I doubt that you really want to use the full power of the CIDFont and
 > CMap mechanism. That would be more appropriate if you were going to
 > embed the font as a CIDFont, especially if you wanted to access more
 > than 256 glyphs. That mechanism is really present for languages which
 > need more than 256 glyphs at a time (e.g. Chinese, Japanese, Korean,
 > Vietnamese), though recent Adobe products make extensive use of it
 > even for Latin fonts.
 >
 >
 >
 >                      Ken
 >
 >
 >