12.4 unicode-math — Using Unicode math fonts
The original motivation for developing TeX engines that understand Unicode and accept UTF-8 input was the desire to use text fonts with more than 256 addressable glyphs. While the engines also supported such larger fonts for use in formulas, this was of little relevance, simply because TeX requires special attributes for fonts used in math, and no fonts, other than a few 8-bit fonts tailored for use with TeX, offered them. Furthermore, Unicode itself did not explicitly encode math symbols, except for very common ones also found in typical text fonts or carried over from 8-bit input encodings, i.e., keyboard layouts used across the world. Consequently, there was little incentive to devise a set of extended font encodings for mathematics; after all, there was no standard to follow, and any change would involve a huge effort. Thus, for a long time, all engines followed the traditional 8-bit font setup for typesetting as originally devised by Donald Knuth and only marginally changed since the eighties.
This started to change when mathematical symbols were finally embraced by the Unicode Consortium and code points were defined for all standard symbols in use with TeX and many more. Once that had happened, the door of opportunity opened, and the first Unicode-encoded math fonts appeared not long after. Even though the TeX world was the driving force behind this adoption into Unicode, the focus of the first such fonts was by no means TeX (but commercial systems such as Word); in fact, the new fonts initially lacked crucial font parameters needed to make them usable with TeX. However, that changed too over time, and now there are a dozen or more free and commercial Unicode math fonts that you can use for typesetting with TeX.
If you have read Chapter 9, then you know that it is not enough to have fonts for math and TeX engines that can access them. You also need hundreds of definitions that make commands (such as or ) fetch the right glyph from the correct slot in the appropriate font and apply the TeX magic to make it become a binary, relational, or whatever math symbol. Obviously, in Unicode fonts the glyphs are stored in completely different places than in the 8-bit fonts to which the LaTeX math commands are tailored, so nothing would work if you simply loaded such a font.
To put it differently: in the past, making a new symbol font available for use with TeX meant (re)encoding the font so that it used the same slots as Computer Modern Symbol (cmsy); all LaTeX math commands then automatically selected the right glyphs. This is what the packages discussed in the previous section provide.
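The slot-based definitions described above can be sketched as follows. The three declarations mirror the way the LaTeX kernel sets up its standard symbols for the traditional 8-bit math fonts (symbol fonts letters and symbols); the command names \myalpha, \mytimes, and \myleq are hypothetical and are used only so that the kernel's own definitions stay untouched. The sketch assumes the classic Computer Modern setup, i.e., that no Unicode math package is loaded.

```latex
\documentclass{article}
% Hypothetical command names. Each declaration states the math class
% (ordinary, binary, relation), the symbol font, and the slot to use.
\DeclareMathSymbol{\myalpha}{\mathord}{letters}{"0B} % Greek alpha
\DeclareMathSymbol{\mytimes}{\mathbin}{symbols}{"02} % multiplication sign
\DeclareMathSymbol{\myleq}{\mathrel}{symbols}{"14}   % less-than-or-equal
\begin{document}
% Both sides of the comma should look identical, because the kernel
% definitions of \alpha, \times, and \leq point to the same slots.
$ \myalpha \mytimes \myalpha \myleq \myalpha, \quad
  \alpha \times \alpha \leq \alpha $
\end{document}
```

A font that stores its glyphs in other slots would need every one of these declarations changed, which is the problem the rest of this section addresses.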
Now, with Unicode-encoded fonts, it is the LaTeX commands that have to change to make everything work with the new font setup. This is the task that unicode-math by Will Robertson undertakes for you, and it is the subject of this section.
As a first simple example: all you need in order to typeset using Unicode Math fonts is to load the package and use a Unicode engine. Without further adjustments this typesets your document in Latin Modern Math OpenType fonts.
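A minimal sketch of such a document is given here. The formula is an illustration chosen for this sketch and is not necessarily the one used in the numbered example; it has to be compiled with a Unicode engine (XeLaTeX or LuaLaTeX).

```latex
% Compile with XeLaTeX or LuaLaTeX; pdfLaTeX does not work here.
\documentclass{article}
\usepackage{unicode-math}  % without further setup: Latin Modern Math
\begin{document}
\[ \alpha \neq \mathscr{A} + \mathfrak{A} + \mathbb{A} \]
\end{document}
```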
So what do you gain by loading the package (after all, Latin Modern is the default font for LaTeX when using a Unicode engine)? The answer is that without further adjustments (e.g., loading a different Unicode math font), there is not much gain that is immediately visible. You see that , , and are available without loading an extra package, but that is about all.
There are, however, already invisible but important differences between the output of the above example and that of the similar ones from the previous section. These become noticeable if you take the PDF of the examples and copy and paste parts of the formula into a different application (or into a new document). If you use Example 12-4-1, then the α, for example, copies as the Unicode character U+1D6FC (Mathematical Italic Small Alpha), ≠ copies as U+2260 (Not Equal To), and the three different “A” characters copy as U+1D49C (Mathematical Script Capital A), U+1D504 (Mathematical Fraktur Capital A), and U+1D538 (Mathematical Double-Struck Capital A), respectively. That is, they all carry their mathematical meaning with them as part of their Unicode character information.
If, on the other hand, you started from Example 12-3-37, then the α is pasted as a Greek Small Alpha (a text character) or even as an “a”, the ≠ as /= or worse, and the different capital A’s as simple identical ASCII A’s; i.e., the material becomes unusable for further processing. For the question of accessible PDFs, this makes quite a difference.
12.4.1 Math alphabets revisited
Math alphabets as used in LaTeX were introduced in Section 9.4.1 on page →I 677 and further discussed in Section 12.1 in this chapter. In a nutshell, each is related to a font that has, in its ASCII slot positions for A to Z (and sometimes a to z), glyphs that are suitable as special alphabetic letters in formulas, e.g., a calligraphic alphabet such as ABC.... These alphabets fall into two distinct groups. There are those where each glyph is always intended to be used on its own, i.e., , , , , and . The side-bearings of each glyph are specially adjusted to give it enough room when used as a single symbol in a formula, and that often makes them unsuitable and uneven looking if you try to form words with them; e.g., while you could use to typeset CALLIGRAPHY, it is clearly not what this alphabet was intended for.
The second group consists of those math alphabets that have a dual purpose. They can be used to typeset a single alphabetic character, e.g., ∀g ∈ G, but can alternatively be used to properly typeset words or word fragments within a formula, e.g., Vmin. In this group we have , , , , and , and they all point to text fonts, which means that they provide proper kerning and ligatures if their argument consists of more than one character. The difference from the corresponding … commands (which can also be used in formulas) is that they do not change fonts based on surrounding conditions. For example, is usually used to build textual operators, such as or , and it always produces the same roman letters in all formulas no matter what.
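The following sketch illustrates this difference. It takes \mathrm and \textrm as a representative pair, which is an assumption made for this illustration because the command names are not preserved in the paragraph above. The amsmath package is loaded only so that the text command scales properly in a subscript.

```latex
\documentclass{article}
\usepackage{amsmath} % makes \textrm use the right size in subscripts
\begin{document}
Upright context: $V_{\mathrm{min}}$ and $V_{\textrm{min}}$ look alike.

% Inside italic text the math alphabet is unaffected, whereas the
% text font command inherits the italic shape of its surroundings.
\textit{Italic context: $V_{\mathrm{min}}$ keeps an upright ``min'',
  but $V_{\textrm{min}}$ picks up the surrounding italic shape.}
\end{document}
```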
When the Unicode Consortium extended its support for mathematics, this dual rôle of some of the … alphabet commands in LaTeX turned out to be somewhat of a problem. In the Unicode block “Mathematical Alphanumeric Symbols” U+1D400 to U+1D7FF, Unicode places Latin and Greek letters as separate mathematical alphabets. The description in the Unicode standard for this block reads:
12.4.2 Adjusting the formula style
By default, formulas typeset with TeX use an italic alphabet for variables denoted by Latin characters. Lowercase Greek letters are also in italic, but uppercase Greek is set upright. While this is a widely accepted style, it is by no means the only one in use. ISO recommends the use of italics throughout (i.e., uppercase Greek in italics too), and some people prefer to use upright for everything; the Euler fonts are a prominent example of this style (see Figure 12.40 on page 289). Finally, there is the French tradition of setting everything upright except for lowercase Latin letters. The different styles can be set with the package key math-style as shown in Table 12.2; the default style is TeX.
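As a small sketch of how the key is used, the following document selects the ISO style; the formula is only an illustration. Replacing the value with TeX, french, or upright should produce the other conventions described above.

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
% ISO: Latin and Greek letters, upper- and lowercase, all in italic.
\usepackage[math-style=ISO]{unicode-math}
\begin{document}
\[ a + B = \gamma + \Gamma \]
\end{document}
```

With math-style=TeX only the Γ would be upright, with math-style=french only the a would remain italic, and with math-style=upright all four letters would be upright.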
Setting the style applies to the whole document. It is, however, still possible to set individual Greek letters explicitly upright or italic by prefixing their command names with up or it, respectively, e.g., , , etc. It is also possible to use the command or to explicitly set a single Latin or Greek letter upright or italic, e.g., , , and so forth.
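A short sketch of such per-letter overrides follows. The prefixed names \upalpha and \itGamma are formed as described above; \symup and \symit are the commands documented by unicode-math for this purpose and are assumed here to be the ones meant, because the command names are not preserved in the paragraph above.

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{unicode-math}  % default: math-style=TeX
\begin{document}
% Prefixed Greek names: upright alpha, italic Gamma.
\[ \alpha \neq \upalpha, \qquad \Gamma \neq \itGamma \]
% Explicit shape for a single Latin or Greek letter.
\[ x \neq \symup{x}, \qquad \Gamma \neq \symit{\Gamma} \]
\end{document}
```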
12.4.3 Setting up Unicode math fonts
Up until now we have loaded unicode-math, which gives us the features and commands described, but our documents are still typeset using Latin Modern, because that is LaTeX’s default in Unicode engines. If we want to use a different Unicode math font, we need to tell the package about it.
This declaration sets up the math font family to be used (which needs to be a specially prepared font). The command is modeled after the declarations provided by the fontspec package, e.g., , and you may want to review Section 9.6.1 on page →I 706 to familiarize yourself with the different ways to find and specify the family name and the various features that you can put into the feature-list.
There are a few keys that are specific to setting up math fonts, and those are discussed below. However, it is often enough to just give the right family name, as shown in the next example where we use Fira Sans and Fira Math for typesetting.
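A minimal sketch of such a setup is shown here; it is not necessarily identical to the numbered example. It assumes that the Fira Sans and Fira Math fonts are installed where the engine can find them by family name; if they are not, file names would have to be given instead.

```latex
% Compile with LuaLaTeX or XeLaTeX.
% Assumption: Fira Sans and Fira Math can be found by family name.
\documentclass{article}
\usepackage{unicode-math}  % also loads fontspec
\setmainfont{Fira Sans}    % text font (fontspec declaration)
\setmathfont{Fira Math}    % math font (unicode-math declaration)
\begin{document}
The roots are $x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
\end{document}
```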