Somewhat automated creation of a Font Awesome subset

Published: October 29, 2018

Updated: May 05, 2023

This website is built using Jekyll, with Minimal Mistakes as the base theme. To render icons such as the Mastodon symbol, it is common on today’s web to use custom icon fonts such as Font Awesome. They are supposed to provide a compact, efficient and well-supported method for displaying monochromatic vector graphics.

At this point, some of you may already object to the added weight and complexity of such a non-essential “feature”: for a small page with a few kilobytes of HTML and two icons, the browser has to download a font of several hundred kilobytes, plus about as much CSS, and parse everything. So, for the rest of this post, let’s accept that we want to display some of these icons…

The straw that broke the camel’s back

In a (relatively) recent release of Minimal Mistakes, the author switched from a CSS+webfonts method to a JavaScript+SVG method for using Font Awesome glyphs. As I prefer to avoid JavaScript when it is not required, this annoyed me a bit at first glance. However, displaying these icons is not needed for a decent experience on my site, and I believed that using JS to fetch only the required glyphs was a good way to reduce the size of the resources loaded. After looking at the network details in my browser’s dev tools, I changed my mind:

the JS file probably contains all the glyphs, weighing 960 KB!
by default, the JS is served by Font Awesome’s CDN¹

These two reasons are enough to look closer at the problem and find a lightweight, JavaScript-free and self-hosted option. The simplest way is to download the assets and serve them statically (which is how Minimal Mistakes used to do it).

Creating a subset of FontAwesome

To create a custom font similar to Font Awesome, two parts are required:

the font(s) themselves (.ttf, .woff or .woff2 files)
the piece of CSS which does the magic, basically by adding a ::before content to any item of class fa-*

Subsetting fonts

As my website uses only a few glyphs, it annoys me to send hundreds of kilobytes of fonts and CSS of which only a tiny fraction is actually useful. This is why I started looking for methods to extract a subset of a font.

In fact, many websites do that, and some of them are free, such as fontello. However, I prefer to do it directly with local scripts instead of depending on an online service. The web service itself is not easy to automate (glyphs are selected in a GUI…), but its git repository provides some tools to perform these actions from the command line. Unfortunately, all these tools work by sending HTTP requests to their instance, which creates a hard dependency on it…

I found a blog post from @morsetree, which provides a good overview of the process. He is using pyftsubset, from the Python fonttools, which is available as a package on my distribution but also easy to install using pip for instance.

fonttools really lacks documentation, but it seems pretty powerful. It provides both a Python library and a set of command-line front ends for some high-level operations. Here is an example usage of pyftsubset, which extracts the glyphs associated with the Unicode code points U+F004 and U+F005 from Font Awesome. A font file subset.ttf is created with only these two symbols.

pyftsubset <path/to/fontawesome-webfont.ttf> --unicodes=U+f004,U+f005 \
           --output-file=subset.ttf

Magic gluing CSS

The FontAwesome CSS file contains a generic part (the @font-face directives declaring the custom font and a few style rules to display the symbols correctly) and a symbol-specific part (the Unicode content of each corresponding glyph). For each glyph, the CSS looks like the following:

.fa-mastodon:before {
  content: "\f4f6"; }

To use a subset, the whole generic part is required, but all unused symbols may be removed from the second part. As I am using Sass, I used the SCSS code from the FontAwesome distribution to create a partial file with all the generic content. It imports another partial file, containing the specific style for each desired symbol. Transposing it to pure CSS should be easy: just look for the generic part in the FontAwesome CSS files.

Automatizing the process

Based on this tool, it would be great to have an automated process which discovers the required Unicode glyphs and extracts them from the different Font Awesome files. Hence I wrote a (quite dirty) shell script to do the hard work.

Find the symbols used in the Jekyll build

The basic idea for listing all the symbols used via Font Awesome is simply to grep the HTML, CSS and JS files for possible usages of a Font Awesome glyph. It could be done directly on the Jekyll source files, but that would miss many edge cases, such as symbols used in an external Jekyll theme. In my use case, it was preferable to scan the files built by Jekyll instead. This means building the site once before updating the font resources, running the script, and then building the site a second time. This is not ideal, but good enough for now.

Additional precautions should be taken to avoid considering the generated CSS/SCSS files. The final command to retrieve all the unique glyph names used is this ugly code:

find "${SITE_PATH}" \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
               -not -path "*/${FONTAWESOME_PATH}/*" \
               -not -path "*/${BASENAME}.css" \
               -not -path "*/${BASENAME}.scss" \
               -exec grep -E -o 'fa-[[:alnum:]-]+' '{}' ';' \
              | sort -u
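To see what this command picks up, here is a small self-contained run on a throwaway directory. The directory layout, the file contents and the values given to FONTAWESOME_PATH and BASENAME are made up for the demonstration; only the find pipeline itself comes from the script.

```shell
#!/usr/bin/env bash
# Demo on a temporary directory; every path and file below is hypothetical.
set -eu

SITE_PATH="$(mktemp -d)"
FONTAWESOME_PATH="assets/fontawesome"   # assumed location of the full Font Awesome copy
BASENAME="fa-subset"                    # assumed base name of the generated subset files

mkdir -p "${SITE_PATH}/${FONTAWESOME_PATH}" "${SITE_PATH}/assets/css"
# A page using two icons plus the fa-fw helper class
printf '<i class="fas fa-fw fa-heart"></i> <i class="fab fa-mastodon"></i>\n' \
    > "${SITE_PATH}/index.html"
# These two files must be ignored by the scan
printf '.fa-rss:before { content: "x"; }\n' > "${SITE_PATH}/${FONTAWESOME_PATH}/all.css"
printf '.fa-star:before { content: "x"; }\n' > "${SITE_PATH}/assets/css/${BASENAME}.css"

list_fa_names() {
    find "${SITE_PATH}" \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
        -not -path "*/${FONTAWESOME_PATH}/*" \
        -not -path "*/${BASENAME}.css" \
        -not -path "*/${BASENAME}.scss" \
        -exec grep -E -o 'fa-[[:alnum:]-]+' '{}' ';' \
        | sort -u
}

names="$(list_fa_names)"
printf '%s\n' "${names}"   # prints fa-fw, fa-heart and fa-mastodon, one per line
rm -rf "${SITE_PATH}"
```

Note that fa-fw is listed although it is a sizing helper and not an icon: the scan only collects candidate names, and sorting out the real glyphs is left to the next step.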

Get the corresponding unicode glyphs

With the names of all (possible) symbols, the next step is to find the Unicode value associated with each of them in FontAwesome. Here we exploit the template used in the symbol-specific part of the FontAwesome CSS to get the value from the content: attribute. Of course, some of the fa-xxx names detected in the previous part are not really symbol names… Such false positives have no matching rule in the CSS, so they are simply ignored at this step and left out of our subset.

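# Reads glyph names (one per line) on stdin and prints "U+xxxx," for each name
# found in the full Font Awesome CSS (${fa_fullcss}); unknown names are skipped.
function extract_glyphs() {
    local curfa glyph
    while read -r curfa
    do
        # Take the "content" line following the selector and keep only the hex code
        glyph=$(grep -A 1 "\.${curfa}:before" "${fa_fullcss}" | grep -o '"\\.*"' | sed 's/"\\\(.*\)"/\1/')
        if [[ -n "${glyph}" ]]
        then
            echo -n "U+${glyph},"
        fi
    done
}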
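As a quick illustration of the lookup, the sketch below runs the same logic against a two-rule sample CSS. The sample file and the list of names are invented for the example, and the function is repeated (with printf instead of echo -n) only so that the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Demo with a made-up, two-rule extract of the Font Awesome CSS.
set -eu

fa_fullcss="$(mktemp)"
cat > "${fa_fullcss}" <<'EOF'
.fa-heart:before {
  content: "\f004"; }

.fa-mastodon:before {
  content: "\f4f6"; }
EOF

# Same lookup as in the post: selector line, then the hex code on the next line
extract_glyphs() {
    while read -r curfa
    do
        glyph=$(grep -A 1 "\.${curfa}:before" "${fa_fullcss}" | grep -o '"\\.*"' | sed 's/"\\\(.*\)"/\1/')
        if [ -n "${glyph}" ]; then
            printf 'U+%s,' "${glyph}"
        fi
    done
}

# fa-fw has no ":before" rule in the CSS, so it is silently dropped
unicodes="$(printf 'fa-fw\nfa-heart\nfa-mastodon\n' | extract_glyphs)"
unicodes="${unicodes%,}"      # strip the trailing comma, to be on the safe side
printf '%s\n' "${unicodes}"   # prints U+f004,U+f4f6
rm -f "${fa_fullcss}"
```

The resulting string has the shape expected by the --unicodes option of pyftsubset shown earlier.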

Creating the shortened CSS file

We now have a list of all the used names. First we need to generate the CSS file in which each name is associated with its “before” content, i.e. its glyph. A few lines of shell do the trick, at the cost of some duplication with the previous function:

# Reads glyph names on stdin and prints one CSS rule per known glyph
while read -r curfa
do
    glyph=$(grep -A 1 "\.${curfa}:before" "${fa_fullcss}" | grep -o '"\\.*"' | sed 's/"\\\(.*\)"/\1/')
    if [[ -n "${glyph}" ]]
    then
        echo ".${curfa}:before {
content: \"\\${glyph}\"; }
"
    fi
done

Generating the fonts in TrueType and WOFF2

Creating the font files requires three steps:

subsetting each font file (brands, regular and solid) with the discovered glyphs, using pyftsubset
merging some of the resulting files using pyftmerge (I merge brands and regular as there is no overlap of glyph values, but keep solid as a separate font)
compressing the TTF into a lighter WOFF2 font using woff2_compress
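The three steps above can be sketched as follows. This is not the actual script: the directories, the output file names and the DRY_RUN switch are assumptions made for the example, and the input names merely follow the usual Font Awesome 5 webfont naming. By default the snippet only prints the commands it would run; set DRY_RUN=0 to execute them, which requires fonttools and woff2 to be installed.

```shell
#!/usr/bin/env bash
# Sketch of subset -> merge -> compress. All paths and names are hypothetical.
set -eu

FA_DIR="${FA_DIR:-fontawesome/webfonts}"        # assumed location of the full fonts
OUT_DIR="${OUT_DIR:-assets/fonts}"              # assumed output directory
UNICODES="${UNICODES:-U+f004,U+f005,U+f4f6}"    # e.g. the output of extract_glyphs
DRY_RUN="${DRY_RUN:-1}"                         # 1: only print the commands

# Print the command in dry-run mode, execute it otherwise
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

build_fonts() {
    local style
    run mkdir -p "${OUT_DIR}"
    # 1. Subset each style with the same list of code points
    for style in brands-400 regular-400 solid-900; do
        run pyftsubset "${FA_DIR}/fa-${style}.ttf" \
            --unicodes="${UNICODES}" \
            --output-file="${OUT_DIR}/sub-${style}.ttf"
    done
    # 2. Merge brands and regular; by default pyftmerge writes merged.ttf
    #    in the current directory, so the result is moved afterwards
    run pyftmerge "${OUT_DIR}/sub-brands-400.ttf" "${OUT_DIR}/sub-regular-400.ttf"
    run mv merged.ttf "${OUT_DIR}/fa-subset.ttf"
    # 3. Compress both resulting fonts; woff2_compress writes a .woff2
    #    file next to its input
    run woff2_compress "${OUT_DIR}/fa-subset.ttf"
    run woff2_compress "${OUT_DIR}/sub-solid-900.ttf"
}

build_fonts
```

Passing the same list of code points to every style keeps the script simple: as far as I can tell, pyftsubset ignores requested code points that a given font does not contain, unless told otherwise.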

Final script and limitations

If you are curious, you can check the current shell script in the git repository of my blog. Just be prepared: it is not exactly what we can call “beautiful code”. Use it if you want, or take any part of it for other stupid ideas!

In its current state, the main issues are:

it requires an intermediate step (a first Jekyll build, so that the script can run on the build output)
it will not remove a glyph that is no longer used, because it scans all the CSS/SCSS code and therefore detects the glyphs it placed there itself on a previous run…

Fork Awesome

While writing this post, I discovered Fork Awesome, a community-maintained fork of Font Awesome which started after some changes in the direction of the project. It seems alive and more focused on simplicity than v5 of Font Awesome. As they still provide all the glyphs in a single font file, it makes the script way simpler (no merge required). I will try to adapt my script and update this post to use Fork Awesome.

¹ In addition, there is no resource integrity checking in the script tag at the time of writing, so you have to pray for their CDN to stay in good hands… I opened an issue to fix that for all the other users.