25.5 Language
Some formats include a declaration of which language is being used for a given piece
of text. This can be used to influence aspects of the text layout, including the exact
choice of glyphs used in a given font. While we make relatively little use of this at
present, we try to preserve the information as part of our philosophy of not losing
any information unnecessarily.
Accordingly, we identify languages by their ISO 639 codes, held in an enumerated type. For example:
typedef enum fz_text_language_e
{
	FZ_LANG_UNSET = 0,
	FZ_LANG_ur = FZ_LANG_TAG2('u','r'),
	FZ_LANG_urd = FZ_LANG_TAG3('u','r','d'),
	FZ_LANG_ko = FZ_LANG_TAG2('k','o'),
	FZ_LANG_ja = FZ_LANG_TAG2('j','a'),
	FZ_LANG_zh = FZ_LANG_TAG2('z','h'),
	FZ_LANG_zh_Hans = FZ_LANG_TAG3('z','h','s'),
	FZ_LANG_zh_Hant = FZ_LANG_TAG3('z','h','t'),
} fz_text_language;
To save space, we pack these codes into 15 bits, and we provide functions to
convert between the packed form and the more usual string representation:
/*
	Convert ISO 639 (639-{1,2,3,5}) language specification
	strings losslessly to a 15 bit fz_text_language code.

	No validation is carried out. Obviously invalid (out
	of spec) codes will be mapped to FZ_LANG_UNSET, but
	well-formed (but undefined) codes will be blithely
	accepted.
*/
fz_text_language fz_text_language_from_string(const char *str);

/*
	Recover ISO 639 (639-{1,2,3,5}) language specification
	strings losslessly from a 15 bit fz_text_language code.

	No validation is carried out. See note above.
*/
char *fz_string_from_text_language(char str[8], fz_text_language lang);
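To make the 15 bit claim concrete, here is a minimal sketch of one way such a packing could work. It is an assumption for illustration, not necessarily how the library does it: each lowercase letter is treated as a base-27 digit (1 to 26, with 0 meaning "no letter"), so three letters need at most 27*27*27 = 19683 values, which fits in 15 bits. The names sketch_lang_pack, sketch_lang_unpack and SKETCH_LANG_UNSET are hypothetical and are not part of the API described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for FZ_LANG_UNSET in this sketch. */
enum { SKETCH_LANG_UNSET = 0 };

/* Map 'a'..'z' to 1..26. Any other character yields 0 (invalid). */
static unsigned sketch_letter_digit(char c)
{
	return (c >= 'a' && c <= 'z') ? (unsigned)(c - 'a' + 1) : 0;
}

/*
	Pack a 2 or 3 letter lowercase code into a base-27 value.
	The first letter is the least significant digit.
	Anything that is not 2 or 3 lowercase letters maps to
	SKETCH_LANG_UNSET. No check is made that the code is a
	language that actually exists.
*/
static unsigned sketch_lang_pack(const char *s)
{
	unsigned code = 0, scale = 1;
	size_t i, n;

	if (s == NULL)
		return SKETCH_LANG_UNSET;
	n = strlen(s);
	if (n < 2 || n > 3)
		return SKETCH_LANG_UNSET;
	for (i = 0; i < n; i++)
	{
		unsigned d = sketch_letter_digit(s[i]);
		if (d == 0)
			return SKETCH_LANG_UNSET;
		code += d * scale;
		scale *= 27;
	}
	return code;
}

/*
	Unpack a value produced by sketch_lang_pack into out, which
	must hold at least 4 bytes. SKETCH_LANG_UNSET gives "".
*/
static char *sketch_lang_unpack(char out[4], unsigned code)
{
	int i = 0;

	while (code != 0 && i < 3)
	{
		unsigned d = code % 27;
		if (d == 0)
			break; /* a zero digit means no further letters */
		out[i++] = (char)('a' + d - 1);
		code /= 27;
	}
	out[i] = '\0';
	return out;
}
```

Under this scheme, packing "ur" gives 21 + 18*27 = 507, and unpacking 507 recovers "ur". As with the functions described above, a well-formed but undefined code such as "qq" is accepted without complaint, while a malformed string such as "x" or "U1" maps to the unset value.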