Somewhat automated creation of a Font Awesome subset

This website is built using Jekyll, with Minimal Mistakes as the base theme. To render some icons, such as the Mastodon symbol, it is common on today’s web to use custom fonts, such as Font Awesome. They are supposed to provide a compact, efficient and well-supported method for displaying monochromatic vector graphics.

The straw that broke the camel’s back

In a (relatively) recent release of Minimal Mistakes, the author switched from a CSS + web fonts method to a JavaScript + SVG method for displaying Font Awesome glyphs. As I prefer to avoid JavaScript when it is not required, this annoyed me a bit at first glance. However, displaying these icons is not needed for a decent experience on my site, and I believed that using JS to fetch only the required glyphs was a good way to reduce the size of the resources loaded. After looking at the network details in my browser’s dev tools, I changed my mind:

  • the JS file probably contains all the glyphs, weighing 960 KB!
  • by default, the JS is served from Font Awesome’s CDN [1]

These two reasons were enough for me to look more closely at the problem and find a lightweight, JavaScript-free, self-hosted option.

Creating a subset of Font Awesome

To create a custom font similar to Font Awesome, two parts are required:

  • the font(s) themselves (.ttf, .woff or .woff2 files)
  • the piece of CSS that does the magic, basically by adding ::before content to any element with a fa-* class

Subsetting fonts

As my website uses only a few glyphs, it annoys me to send hundreds of kilobytes of fonts and CSS of which only a small part is actually useful. This is why I started looking for methods to extract a subset of a font.

In fact, many websites can do this, and some of them, such as Fontello, are free. However, I prefer to do it locally with scripts rather than depend on an online service that is hard to automate (GUI selection of glyphs…).

I found a blog post from @morsetree, which provides a good overview of the process. He uses pyftsubset from the Python fonttools project, which is available as a package on my distribution and is also easy to install with pip, for instance.

The fonttools project really lacks documentation, but it seems pretty powerful. It provides both a Python library and a set of command-line front ends for some high-level operations. Here is an example of pyftsubset usage, which extracts the glyphs associated with the Unicode code points U+F004 and U+F005 from Font Awesome. A font file subset.ttf is created with only these two symbols.

pyftsubset <path/to/fontawesome-webfont.ttf> --unicodes=U+f004,U+f005 \
           --output-file=subset.ttf

Magic gluing CSS

TODO: Add a link to the _scss/fontawesome_min.scss file!

The Font Awesome CSS file contains a generic part (the @font-face directives to use a custom font and a few style rules to display the symbols correctly) and a symbol-specific part (the Unicode content of each corresponding glyph). For each glyph, the CSS looks like the following:

.fa-mastodon:before {
  content: "\f4f6"; }

To use a subset, the whole generic part is required, but all unused symbols may be removed from the second part. As I am using Sass, I used the SCSS code from the Font Awesome distribution to create a partial file with all the generic content. It imports another partial file, containing the specific style for each desired symbol. Transposing it to pure CSS should be easy: just look for the generic part in the Font Awesome CSS files.

Automating the process

Based on this tool, it would be great to have an automated process that discovers the required Unicode glyphs and extracts them from the different Font Awesome files.

Find the symbols used in the Jekyll build

The basic idea for listing all the symbols used via Font Awesome is simply to grep the HTML, CSS and JS files in order to find possible uses of a Font Awesome glyph. It could be done directly on the Jekyll source files, but that would miss many edge cases, such as symbols used in an external Jekyll theme. In my use case, it was preferable to use the Jekyll build instead (which means building the site, updating the font resources, and then building it again). This is not ideal, but enough for me.

TODO: is it useful to look in the CSS files??? No use case in mind, except ugly tricks…

Additional precautions should be taken to avoid considering the generated CSS/SCSS files.

TODO

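# Search the built site for anything that looks like a Font Awesome class,
# skipping the Font Awesome directory and the generated subset CSS/SCSS.
find "${SITE_PATH}" \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
     -not -path "*/${FONTAWESOME_PATH}/*" \
     -not -path "*/${BASENAME}.css" \
     -not -path "*/${BASENAME}.scss" \
     -exec grep -E -o 'fa-[[:alnum:]-]+' '{}' ';' \
    | sort -u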
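
To see what this pattern picks up, here is a small sketch that runs the same grep on a made-up line of markup (the sample string is an assumption for illustration, not taken from the real site). Note that fa-fw and fa-2x are Font Awesome utility classes rather than icons, so they show up as candidates too.

```shell
# Made-up markup for illustration: two real icons plus two utility classes.
sample='<i class="fab fa-fw fa-2x fa-mastodon"></i> <i class="fab fa-github"></i>'

# Same pattern as in the find command; LC_ALL=C keeps the sort order stable.
candidates=$(printf '%s\n' "${sample}" \
    | grep -E -o 'fa-[[:alnum:]-]+' \
    | LC_ALL=C sort -u)

printf '%s\n' "${candidates}"
```

With this sample, the candidates are fa-2x, fa-fw, fa-github and fa-mastodon; "fab" itself is not matched because the pattern requires the "fa-" prefix.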

Get the corresponding unicode glyphs

TODO: example of false-positive fa-xxx

With the names of all (possible) symbols, the next step is to find the Unicode value associated with each of them in Font Awesome. Here we exploit the template used in the symbol-specific part of the Font Awesome CSS to get the value from the content: attribute. Of course, some of the fa-xxx names detected in the previous part are not really symbol names… Such false positives are simply ignored at this step, which removes them from our subset content.

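# Read icon class names (one per line) on stdin and print the matching
# code points as "U+xxxx," entries. Names without a rule in the full
# Font Awesome CSS (false positives) are silently skipped.
function extract_glyphs() {
    local curfa glyph
    while read -r curfa
    do
        # The content: value sits on the line right after the selector.
        glyph=$(grep -A 1 "\.${curfa}:before" "${fa_fullcss}" | grep -o '"\\.*"' | head -n 1 | sed 's/"\\\(.*\)"/\1/')
        if [[ -n "${glyph}" ]]; then echo -n "U+${glyph},"; fi
    done
}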
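
As a usage sketch, the snippet below runs this lookup against a tiny stand-in for the Font Awesome CSS. The two sample rules and the temporary file are assumptions made for illustration. The function is repeated, lightly adapted, so that the snippet runs on its own, and the trailing comma is stripped before the value would be handed to pyftsubset.

```shell
# Tiny stand-in for the full Font Awesome CSS (sample rules only).
fa_fullcss=$(mktemp)
printf '%s\n' '.fa-heart:before {' '  content: "\f004"; }' \
              '.fa-star:before {'  '  content: "\f005"; }' > "${fa_fullcss}"

# Same lookup as above: names on stdin, "U+xxxx," entries on stdout.
extract_glyphs() {
    local curfa glyph
    while read -r curfa
    do
        glyph=$(grep -A 1 "\.${curfa}:before" "${fa_fullcss}" \
            | grep -o '"\\.*"' | sed 's/"\\\(.*\)"/\1/')
        if [ -n "${glyph}" ]; then printf 'U+%s,' "${glyph}"; fi
    done
}

# fa-fw has no rule in the sample CSS, so it is dropped as a false positive.
glyphs=$(printf 'fa-fw\nfa-heart\nfa-star\n' | extract_glyphs)
unicodes="${glyphs%,}"     # strip the trailing comma
echo "${unicodes}"         # value that could be passed to --unicodes=

rm -f "${fa_fullcss}"
```

Here the result is U+f004,U+f005, the same list as in the pyftsubset example earlier in the post.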

Creating the shortened CSS file

TODO
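
One possible approach, sketched here under assumptions and not necessarily what the final script does, is to copy the symbol-specific rule of each detected name from the full CSS. extract_rules is a hypothetical helper, the sample CSS is made up, and the sketch assumes the two-line rule layout shown earlier (one selector per rule, content: on the next line).

```shell
# Hypothetical helper: print the symbol-specific rules for the class names
# read on stdin, copied from the full CSS file given as $1.
extract_rules() {
    local fullcss="$1" curfa
    while read -r curfa
    do
        # -F takes the selector literally; -A 1 also prints the content: line.
        grep -F -A 1 ".${curfa}:before {" "${fullcss}"
    done
    return 0   # names without a rule (false positives) are not an error
}

# Demo with a made-up two-rule CSS file.
css=$(mktemp)
printf '%s\n' '.fa-github:before {'   '  content: "\f09b"; }' \
              '.fa-mastodon:before {' '  content: "\f4f6"; }' > "${css}"
rules=$(printf 'fa-fw\nfa-mastodon\n' | extract_rules "${css}")
printf '%s\n' "${rules}"
rm -f "${css}"
```

Only the fa-mastodon rule is printed here. Redirecting this output to the partial file imported by the generic part would give the shortened stylesheet.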

Generating the fonts in TrueType and WOFF2
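
A hedged sketch of what this step could look like: without a flavor, pyftsubset turns a TrueType input into a TrueType subset, and its --flavor=woff2 option produces WOFF2 output (fonttools needs the Python brotli module for that). make_subsets and its arguments are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: build TrueType and WOFF2 subsets of one font file.
#   $1: source font   $2: comma-separated code points   $3: output basename
make_subsets() {
    local src="$1" unicodes="$2" out="$3"
    # Plain TrueType subset.
    pyftsubset "${src}" --unicodes="${unicodes}" --output-file="${out}.ttf"
    # WOFF2 subset of the same glyphs.
    pyftsubset "${src}" --unicodes="${unicodes}" --flavor=woff2 \
               --output-file="${out}.woff2"
}

# Example call (the path is a placeholder):
# make_subsets path/to/fontawesome-webfont.ttf "U+f004,U+f005" subset
```

Running pyftsubset twice on the original font, rather than converting the first subset, keeps the two outputs independent of each other.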

Final script and limitations

Fork Awesome

While writing this post, I discovered Fork Awesome, a community-maintained fork of Font Awesome which started after some changes in the direction of the project. It seems alive and more focused on simplicity than v5 of Font Awesome. As they still provide all the glyphs in a single font file, the script becomes much simpler (no merge required). I will try to adapt my script and update this post to use Fork Awesome.

  1. In addition, there is no resource integrity checking in the script tag at the time of writing, so you have to pray for their CDN to stay in good hands… I opened an issue to fix that for all the other users.