Thursday, October 22, 2015

Hacking TrueType Fonts For Character Information

Those of you who have ever been curious about making your own font should know that doing so on the computer isn't easy.  Sure, there are several good programs out there that can help you take your design and digitize it, but a well-made font has been crafted with much care and attention to detail by a computer scientist just as much as by a designer.  Some considerations that need to be made on the technical side include, for instance, how to "hint" rendering at very large or small sizes, accounting for grayscale devices in such hinting, making characters by compositing glyphs to save on file size (e.g. fi = f + i), and dealing with different platforms and character encodings so the font can be portable across Windows, Mac, and others.

Now, think back to one of my long-time projects that relates to displaying text and images.  Yes, BriteBlox can certainly be capable of displaying messages set with TrueType fonts, and this has been supported in the development version for quite some time.  However, to make it scale well for any message at all, it is important to know what the width of each character is.  As such, the efforts described here were undertaken for the sake of improving BriteBlox.


The simplest way to render TTFs in Python is to use PIL (the Python Imaging Library).  With this, you can establish an Image object and then instruct PIL to render text with the desired typeface onto the image.  However, you need to know the width of each character in advance so you can create a correctly sized Image object before rendering text onto it.  Otherwise, you only discover afterward that the image is too short and the text is chopped, or that it is so large you run out of memory.  In the BriteBlox PC Tools, this feature was disabled in releases for such a long time because I would manually have to guess and check the correct size for the bounding box for my text.  Soon, that will no longer be required!

The High-Level Solution

[Important note] There may be, in fact, a better solution for those of you using Qt, an application framework.  Unfortunately, my implementation of the Qt 5 libraries in PyQt5 seg-faults (or tries to access a null pointer) when I try to run the appropriate commands, so I will have to write about that in the future once I upgrade Qt and hopefully get it working.

Along with PIL (or Pillow) in Python, you can use the fonttools and ttfquery libraries (which depend on numpy) in order to fetch the width of a particular character glyph.  (The glyph is the artistic rendering of the character; the character is more of just a concept in the realm of typography.)  To get the required width (and height) for the container image, begin by using this code:

from ttfquery import describe, glyphquery
myfont = describe.openFont("C:\\Windows\\Fonts\\arial.ttf")

glyphquery.width(myfont, 'W')

Now you have the width of a character from your TTF file.  If you actually run this, though, you may notice the values seem really odd -- in fact, very large.  This is because the values being retrieved (I'll tell you exactly where these come from later) are expressed in "font units" or "EM units", which relate to the "EM square".  Remember your em-dashes and en-dashes from English class?  Well, turns out they're incredibly important in typography too.  The EM units are derived from the "EM square", which is a square the size of the em-dash.  Back when fonts were cast into metal stamps and then pressed into paper, the em-dash was typically the widest character you could have.  In digital media, though, characters are allowed to be wider than the em-dash, so you have to look at each character specifically to find out how wide it is.  Nothing can be taken for granted.

EM units are simply little divisions of the EM square, so that the EM square is divided up into a grid.  There are many acceptable values for how many units exist along one side -- in fact, any value from 16 to 16384 is valid, though a power of two is recommended for fonts with TrueType outlines.  The typical "resolution" of the EM square, as defined by the "unitsPerEm" field in the TTF specification, is 2,048 units per side of the square.  However, again, this value cannot be taken for granted; I will explain ways to fetch it later.  Once you have the correct unitsPerEm value, put it into the following equations:

pixel_size = point_size * resolution / 72
pixel_coordinate = grid_coordinate * pixel_size / EM_size


Remember that fonts are generally measured in points rather than pixels, a tradition that dates back to at least the 1700s.  Nowadays, a point is defined as 1/72 inch, thus the ratio of point_size / 72 in the first equation.  Now, you need to get rid of the "inch" in the unit by multiplying by some unit that is 1/inch (remember dimensional analysis from chemistry or physics?).  The perfect unit for this happens to be pixels per inch, which is defined differently on different computing platforms.  For instance, Microsoft typically defines an inch as 96 pixels in Windows, thus as monitors are made with ever-higher resolution, the distance on the monitor representing a physical inch gets noticeably smaller.  Now, if you consider the right edge of your glyph to be the grid coordinate of interest, you can finish off the equation.  Let's see how this would work for the capital letter "W" at size 12 point:

>>> glyphquery.width(myfont, 'W')
1933
>>> 1933.0 * 12 * (96.0/72.0) / 2048
15.1015625
And now at 24 point:
>>> 1933.0 * 24 * (96.0/72.0) / 2048
30.203125
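
Here's a minimal sketch of the two equations wrapped up as Python functions, in case you'd rather not retype the arithmetic each time.  The function names (units_to_pixels, canvas_width) and the default values of 2048 units per em and 96 PPI are my own assumptions for illustration; swap in whatever your font and platform actually use.  Note that simply summing advance widths ignores kerning, so treat the result as an estimate.

```python
import math


def units_to_pixels(font_units, point_size, units_per_em=2048, ppi=96):
    """Convert a measurement in font units to pixels."""
    # pixel_size = point_size * resolution / 72
    pixel_size = point_size * ppi / 72.0
    # pixel_coordinate = grid_coordinate * pixel_size / EM_size
    return font_units * pixel_size / units_per_em


def canvas_width(advance_widths, point_size, units_per_em=2048, ppi=96):
    """Sum per-character advance widths and round up to whole pixels."""
    total_units = sum(advance_widths)
    pixels = units_to_pixels(total_units, point_size, units_per_em, ppi)
    return int(math.ceil(pixels))


# Arial's "W" has an advance width of 1933 font units (from the session above).
w_12pt = units_to_pixels(1933, 12)
w_24pt = units_to_pixels(1933, 24)

# Width in whole pixels needed for the two-character message "WW" at 12 point.
two_ws = canvas_width([1933, 1933], 12)
```

Rounding up rather than to the nearest pixel is deliberate here: a canvas that's one pixel too wide is harmless, while one that's too narrow chops the text.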

IMPORTANT NOTE: To avoid truncation from integer division, you must get Python to treat your numbers as floating-point values rather than integer values.  (This applies to Python 2, where "/" between two integers throws away the remainder; Python 3 performs true division.)  You can do this by simply adding ".0" to the end of an integer value, and the answer will automatically be "promoted" to floating point.  If I were to leave the first equation alone and simply write 1933 * 12 * (96/72) / 2048, I would get the answer 11, which is definitely wrong, as my empirical observation of the character "W" indicates that it needs at least 13 pixels of width at 12 point size, even with anti-aliasing turned off.
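
To see the pitfall side by side, here's a small sketch written for Python 3, where I use the "//" operator to stand in for what Python 2's "/" does when both operands are integers.  The variable names are just for illustration.

```python
# Integer (floor) division at every step, as Python 2's "/" does for integers.
# 96 // 72 collapses to 1, so the PPI correction is lost entirely.
truncated = 1933 * 12 * (96 // 72) // 2048

# True division keeps the fractional part all the way through.
exact = 1933 * 12 * (96 / 72) / 2048

print(truncated, exact)
```

The truncated result is 11, while the true-division result is a little over 15 pixels.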

Finding the EM Size Of Your Font

To get the correct value for unitsPerEm (a.k.a. EM_size in the equation), there are some nice tools you can go search for.  Among the suggestions I found was SIL ViewGlyph for Windows.  Simply open the font file, go to View -> Statistics, then look for "units per Em".

If you have a hex editor handy, open your font file in the hex editor.  Toward the beginning of the file, look for the four characters "head" in plain ASCII (0x68 0x65 0x61 0x64).  Skip four bytes after this (the checksum of the table), and the next four bytes give the table's starting offset within the file (e.g. my version of Arial indicates the "head" table offset is 0x00 0x00 0x01 0x8C, thus 0x18C).  Navigate to (this position in the file + 18 more bytes), and the next two bytes (representing a big-endian unsigned short integer, from 0 to 65535) are your unitsPerEm value.  Remember this value is typically 2048, or 0x800.
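
If you'd rather not count bytes by hand, the hex-editor walk above can be sketched with Python's struct module.  This is only a rough sketch: the function names find_table and read_units_per_em are mine, and it assumes a single-font .ttf file (not a .ttc collection) that has been read fully into memory.

```python
import struct


def find_table(font_bytes, tag):
    """Return (offset, length) of a table by walking the table directory."""
    # Bytes 4-5 of the file header hold the number of tables.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        # The directory starts after the 12-byte header; each record is 16 bytes:
        # 4-byte tag, 4-byte checksum, 4-byte offset, 4-byte length.
        record = 12 + i * 16
        rec_tag, _checksum, offset, length = struct.unpack_from(
            ">4sLLL", font_bytes, record)
        if rec_tag == tag:
            return offset, length
    raise KeyError(tag)


def read_units_per_em(font_bytes):
    """Read unitsPerEm, the unsigned short 18 bytes into the 'head' table."""
    head_offset, _length = find_table(font_bytes, b"head")
    return struct.unpack_from(">H", font_bytes, head_offset + 18)[0]
```

To use it, read the font in binary mode (for instance, open the file with mode "rb" and call read() on it) and pass the resulting bytes to read_units_per_em.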

Trust Me, This Is Correct

I spent long enough simply trying to find out in the spec where this magical "EM_size" parameter could be found.  After spending days poring over the Apple TrueType Reference Manual and Microsoft TrueType documentation (warning: .DOC file), it finally became apparent.  This was just an exercise in being comprehensive, though, as Arial obviously had a unitsPerEm value of 2048.

Because I originally didn't know that Microsoft used a standard of 96 PPI rather than 72 PPI, my initial calculations in the formulas above always seemed wrong (too small).  I set out to find another way to get this data, so I read the TTF spec as well as some supporting documentation (including this page and the source of the equations listed above), and tried to find the bounding boxes (bbox) for each glyph, as defined by the xMin, yMin, xMax, and yMax values for each glyph in the GLYF table.  This proved to be unsatisfactory because they don't really tell you how to parse the GLYF table.
  • The raw data seems to just launch right into the 1st glyph without any nice header info as to what glyph(s) belongs to what character, or how many bytes define each glyph in advance.
  • The data I gleaned for the first glyph (which I don't even know what it is) seemed out of whack, with a total height of slightly over the EM size and a total width of almost 3 times the EM size!
I was leery of those results, and decided to take another route.  The "OS/2" table (its header is literally thus in the font file data) contains properties such as sTypoAscender, sTypoDescender, and sTypoLineGap.  Even though OS/2 is used by Microsoft devices only, the values it contains should be platform-agnostic.  However, comparing my Arial font file to the documentation I had, something seemed fishy.  Maybe its OS/2 table is older and doesn't contain as much information, but because these three fields are so far down the table, I didn't want to take any chances with having counted incorrectly or misreading one of the data types.  I soon abandoned this idea too.

Yet another idea was to go to the CMAP table, which contains the mappings of characters to glyph indexes.  (I would have to sit and parse this table to figure out what the very first glyph is in GLYF, and there's no need for me to work backwards like that now.)  This table contains at least one sub-table (Arial has, in fact, three sub-tables here), so there is quite a lot of header data you need to go through before you get to the good stuff.  However, you still need to go through it carefully, otherwise you will be misled into meaningless data.  For Microsoft devices, you should look for the sub-table with the Platform ID of 3 and the Platform Encoding ID of 1.  After finding the byte offset to this table (which is relative to the start of CMAP, not just 0), I had to solve some equations in order to find what character (as defined by ASCII or compatible Unicode codes) mapped to which glyph.
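
Here's a minimal sketch of hunting through the CMAP header for the right sub-table.  It assumes you've already sliced out the bytes of the CMAP table itself (so offset 0 is the start of CMAP), and the function name find_cmap_subtable is my own.  The layout it relies on is a 2-byte version, a 2-byte sub-table count, then one 8-byte record per sub-table (platform ID, encoding ID, and a 4-byte offset).

```python
import struct


def find_cmap_subtable(cmap_bytes, platform_id=3, encoding_id=1):
    """Return the sub-table offset, relative to the start of the cmap table."""
    _version, num_subtables = struct.unpack_from(">HH", cmap_bytes, 0)
    for i in range(num_subtables):
        # Encoding records begin right after the 4-byte header.
        pid, eid, offset = struct.unpack_from(">HHL", cmap_bytes, 4 + i * 8)
        if (pid, eid) == (platform_id, encoding_id):
            return offset
    return None  # no sub-table for that platform/encoding pair
```

Remember to add the CMAP table's own file offset to the returned value if you want an absolute position in the font file.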

I'm not going to go into the math here since it's described in the documentation, but I found out that in Arial, most printable characters we normally care about (specifically, those with ASCII codes between 0x20 and 0x7E) all exist sequentially and contiguously with glyph IDs ranging from 3 to 0x61.  The letters I cared about testing, the extreme-width cases of "W" and "i", happen to have glyph indices of 0x3A and 0x4C respectively, according to the algorithm.
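
For the curious, the simplest case of that math can be sketched as follows.  I'm assuming a format 4 sub-table and a segment whose idRangeOffset is 0, in which case the glyph index is just the character code plus the segment's idDelta, modulo 65536.  The segment below is hypothetical: I reconstructed it from the mapping I observed (printable ASCII codes 0x20 through 0x7E landing on glyphs 3 through 0x61) rather than copying it out of the file, and segments that use the glyph ID array aren't handled at all.

```python
def glyph_index(code, segments):
    """Map a character code to a glyph index using (start, end, id_delta) segments."""
    for start, end, id_delta in segments:
        if start <= code <= end:
            # Format 4 arithmetic wraps around at 16 bits.
            return (code + id_delta) & 0xFFFF
    return 0  # glyph 0 is the "missing glyph"


# Hypothetical segment: codes 0x20..0x7E map to glyphs 3..0x61,
# so id_delta = 3 - 0x20 = -29.
ARIAL_ASCII_SEGMENT = [(0x20, 0x7E, 3 - 0x20)]

glyph_W = glyph_index(ord("W"), ARIAL_ASCII_SEGMENT)
glyph_i = glyph_index(ord("i"), ARIAL_ASCII_SEGMENT)
```

With that segment, "W" (0x57) comes out as glyph 0x3A and "i" (0x69) as glyph 0x4C, matching what I found by hand.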

With this information, it's time to scour the HMTX table for horizontal metrics.  The first thing in this table is an array of entries giving the advance width and left side bearing of each glyph.  Each of these two values takes two bytes, so every entry is four bytes long; thus from the beginning of the HMTX table, the offset to the glyph you care about is (glyph index * 4).  With the table at offset 0x268, the path to the letter W leads me down (0x3A * 4 = 0xE8) more bytes, to a total offset of 0x350.  Here, I quickly learn the advance width for the letter W is:

1933

That's exactly what the Python program said with ttfquery & fonttools!
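
The HMTX lookup can be sketched in a few lines too.  The function name read_h_metric is mine, and the sketch assumes the glyph index is small enough to fall inside the array of full (advance width, left side bearing) entries; the number of such entries is given by the numberOfHMetrics field of the HHEA table, and glyphs past that point only store a bearing.

```python
import struct


def read_h_metric(font_bytes, hmtx_offset, glyph_index):
    """Return (advance_width, left_side_bearing) for one glyph."""
    # Each entry is an unsigned 16-bit advance width followed by
    # a signed 16-bit left side bearing, both big-endian.
    entry = hmtx_offset + glyph_index * 4
    return struct.unpack_from(">Hh", font_bytes, entry)


# File offset of the "W" entry in my copy of Arial:
# table offset 0x268 plus glyph index 0x3A times 4 bytes per entry.
w_entry = 0x268 + 0x3A * 4
```

Passing the whole font file's bytes along with 0x268 and 0x3A should hand back the same advance width found in the hex editor.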

By this time, I had (only just, by sheer coincidence, auspicious timing, serendipity, or whatever you want to call such good fortune) discovered that Microsoft scales its PPI to 96 rather than the 72 I had originally expected.  After trying (and failing) to see if there was a particular DPI used with image objects generated by PIL, I simply stuck (96.0/72.0) into the equation and confirmed visually that the values seen here in the HMTX table are in fact the values you can use to calculate the width of a TrueType font on a Microsoft Windows system.

It remains to be seen how this'll perform on Macs.  I anticipate the PPI will need to be something different; perhaps it will in fact be 72 on that platform.  We'll see...

An Aside

In researching the equation of fi = f + i, I stumbled across the notion of ligatures.  "ﬁ" is in fact a ligature, which was designed so that parts of the "f" and the "i" that run together would look coherent.  This brought me back to a time when I was very young and concerned with Evenflo products -- I am not a parent at this time, thus I was indeed a child last time I dealt with them.  They had a very odd and poorly-designed "fi" ligature on their trademarked logo that led me to believe it was some kind of weird-looking "A".  It confused me, since it seemed odd anyone would name their product "EvenAo", as it's awkward to say, and I wondered what special significance that A had to be written so much differently and more fancifully than the other letters.  Just to jog your memory, here it is:

The Evenflo logo from when I was little

In my Google search, it seems apparent that they have adopted a new logo anyway, ditching an awkward ligature for something with nicer aesthetics overall and a modern vibe.  However, then another logo struck my fancy, especially with what turned up next to it:

Oh, how titillating.

Obviously having seen all these baby products, not to mention the mother with child, led me to believe the Tous Designer House logo was being quite suggestive.  As it turns out, the Tous logo is in fact a teddy bear.  Google, stop offering such awkward juxtapositions!
