The Fool on the Hill: Reducing the number of colours on a website

By: Simon Brooke :: 19 April 2026

It seems I don't write technical notes about the web much these days.

I'm working with a popular documentation generator which will remain nameless to prevent embarrassing its maintainers. Its standard stylesheet uses one hundred and ten different colour specifications. I wanted to produce 'dark theme' and 'light theme' stylesheet variants, so that I can have my documentation sensitive to the user's preferences in exactly the same way this blog is.

How do you even start to retheme a stylesheet with one hundred and ten different colour specifications? Let me show the ways.

First, extract all the different colour specifications:

$ grep 'color:' default.css |\
	grep -v var |\
    sed 's/^.*://' | sort | uniq > colors.css

And count them...

$ wc -l colors.css 
110 colors.css
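
To see what each stage of that pipeline contributes, here it is run on a few made-up declarations. The sample lines and the name extract_colours are assumptions for illustration, not taken from the real stylesheet:

```shell
# Hypothetical wrapper around the same pipeline, reading CSS on stdin:
#   1. keep only lines that set a colour
#   2. drop the ones that already use a CSS variable
#   3. strip everything up to and including the colon
#   4. reduce to one line per distinct specification
extract_colours() {
    grep 'color:' | grep -v var | sed 's/^.*://' | sort | uniq
}

# Made-up sample input; two of the four lines specify the same colour.
printf '%s\n' \
    '    color: #002080;' \
    '    background-color: #FFFFFF;' \
    '    border-color: var(--border);' \
    '    color: #002080;' |
    extract_colours
```

Each surviving line still carries its leading space and trailing semicolon, which is why a later step has to strip the semicolon.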

Then, write an awk script which will cut the colour space down from 16,777,216 potential different colours to 4,096. Of course 4,096 is still vastly too many, but we've now cut down the problem by a factor of 4,096, which is more than three orders of magnitude. Have that awk script generate a sed script which will auto-edit the original file:

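#!/bin/awk -f

# Needs GNU awk: and() and strtonum() are gawk extensions.

# Zero the four least significant bits of each colour component,
# leaving 16 levels per component (16 * 16 * 16 = 4,096 colours).
function compress_colour( data) {
    return and( data, 0xf0f0f0);
}

# For each line containing a 0x-prefixed six-digit hex colour (either case),
# emit a sed substitution mapping the first field to its compressed value.
/0x[0-9A-Fa-f]{6}/ { printf( "s/%s/0x%6.6x/\n", $1,
			  compress_colour( strtonum( $1))); }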

And run it:

$ cat colors.css | awk -f compress-colours.awk > compress-colours.sed

This generates a sed script whose first few lines look like this:

s/0x000000/0x000000/
s/0x002080/0x002080/
s/0x006318/0x006010/

Which is obviously wrong, bother. It has hex numbers in the format accepted by C (and awk), not CSS colour codes. OK, quick sed fix:

$ cat colors.css |\
	awk -f compress-colours.awk |\
    sed 's/0x/#/g' > compress-colours.sed

And this time it's right:

s/#000000/#000000/
s/#002080/#002080/
s/#006318/#006010/
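
The third line makes a handy check on the arithmetic: masking with 0xf0f0f0 zeroes the low four bits of each component, so 0x63 becomes 0x60 and 0x18 becomes 0x10. The same calculation can be sketched in plain shell arithmetic; compress_one is a made-up name, not part of the scripts here:

```shell
# Hypothetical helper: mask a single CSS hex colour the same way
# compress_colour() does in the awk script.
compress_one() {
    hex=${1#"#"}                              # strip the leading '#'
    printf '#%06x\n' $(( 0x$hex & 0xf0f0f0 )) # zero the low nibble of each component
}

compress_one '#006318'    # prints #006010
compress_one '#002080'    # prints #002080 (already on the coarser grid)
```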

Obviously it would have been more efficient not to generate the lines where the left hand pattern and the right hand pattern are identical, and I could go back and fix the awk script; but life is short. Obviously, also, this hasn't dealt with the colours which have been specified with colour names and those specified with rgba tuples, but fortunately in this case there are few of those.

Oh, damn it. Right, there's a lookup table of colour names to hex codes at htmlcolorcodes.com. If I had been a real hacker I could probably have done this in a one-liner, but I'm not, so I needed another auxiliary awk script:

#!/bin/awk -f

function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s }
function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
function trim(s)  { return rtrim(ltrim(s)); }

$3 ~ /[A-Za-z]+/ { printf( "s/ %s *;/ #%s;/\n", trim( $3), trim( $4)); }

Having written this, I could do

$ wget -O - https://htmlcolorcodes.com/color-names/ |\
    pandoc -f html -t markdown |\
    grep '^|' | sort | uniq |\
    tee lookup.md |\
    awk -F\| -f gen-lookup.awk |\
    tee lookup.sed

...which generates a new sed script which, when applied to the original document, converts the named colours into hex colours which we can then compress as we did before.

But the sed script isn't perfect: pandoc, in converting HTML to Markdown, split some very long names (e.g. LightGoldenrodYellow, MediumSlateGray) in awkward places, and that required some hand editing. Worse, if Red came in the file before MediumVioletRed, the colour codes got mangled. Fortunately all we needed was to sort by line length, longest first, so a quick

$ cat lookup.md |\
	grep '^|' | sort | uniq |\
    awk -F\| -f gen-lookup.awk |\
    tee lookup.tmp |\
    awk '{ print length, $0 }' |\
    sort -rn |\
    cut -d" " -f2- > lookup.sed

fixed that.
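
The length sort is the old decorate, sort, undecorate trick: prefix each line with its length, sort numerically in reverse, then cut the prefix off again. It can be tried on its own with a few sample rules; longest_first is a made-up name, and the three rules are illustrative rather than copied from the generated file:

```shell
# Hypothetical wrapper around the same three stages.
longest_first() {
    awk '{ print length, $0 }' |   # decorate: "<length> <line>"
        sort -rn |                 # longest lines first
        cut -d" " -f2-             # undecorate: drop the length again
}

printf '%s\n' \
    's/ Red *;/ #FF0000;/' \
    's/ MediumVioletRed *;/ #C71585;/' \
    's/ IndianRed *;/ #CD5C5C;/' |
    longest_first
```

The MediumVioletRed rule comes out first and the Red rule last, so the longer names are substituted before the shorter names they contain.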

Finally, the authors of the stylesheet had not been consistent with their capitalisation; so it had both White and white, and similarly for many other colours. So the lookup script needed a lower-case copy of every rule:

cat lookup.sed | tr '[A-Z]' '[a-z]' > ll.sed # create a copy file all lower case
cat ll.sed >> lookup.sed # append it to the original
cat lookup.sed | awk '{ print length, $0 }' |\
	sort -rn |\
    cut -d" " -f2- > ll.sed # and sort again for length
mv ll.sed lookup.sed
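
One side effect worth knowing about: lower-casing the whole sed script also lower-cases the hex codes on the right-hand side. That is harmless, because CSS hex colours are case-insensitive. A quick check on a single sample rule:

```shell
# tr lower-cases the colour name and the hex digits alike;
# the leading "s" command is already lower case.
printf '%s\n' 's/ White *;/ #FFFFFF;/' | tr 'A-Z' 'a-z'
# prints: s/ white *;/ #ffffff;/
```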

And at last, we can now regenerate that compress-colours.sed script:

cat default.css |\
    sed -f lookup.sed |\
    tee hexcolours.css |\
    grep 'color:' |\
    grep -v var |\
    sed 's/^.*://' | sort | uniq |\
    tee colours.css |\
    sed 's/#/0x/g' |\
    sed 's/;//' |\
    tee colours.hex |\
    awk -f compress-colours.awk |\
    sed 's/0x/#/g' > compress-colours.sed
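
The no-op lines mentioned earlier could be dropped at this point without touching the awk script. A minimal sketch, assuming every line has the form s/old/new/ with no other slashes in it; drop_noops is a name made up here:

```shell
# Hypothetical filter: keep only substitutions that actually change something.
# Splitting on '/' makes $2 the pattern and $3 the replacement.
drop_noops() {
    awk -F/ '$2 != $3'
}

printf '%s\n' \
    's/#000000/#000000/' \
    's/#002080/#002080/' \
    's/#006318/#006010/' |
    drop_noops
# prints: s/#006318/#006010/
```

The comparison is case-sensitive, so a rule whose two sides differ only in letter case would be kept.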

Right, let's generate a new stylesheet!

$ cat default.css | sed -f compress-colours.sed > compressed.css

How many colours has this?

$ cat compressed.css | grep 'color:' |\
	grep -v var |\
	sed 's/^.*://' |\
	tr '[a-z]' '[A-Z]' |\
	sort |\
	uniq | wc -l
61

Still 61, so nothing like good enough.

I need to think about better colour compression algorithms.
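
One obvious knob, offered as an untested assumption rather than anything tried here, is a coarser mask. Keeping only the top two bits of each component (0xc0c0c0) leaves at most 4 × 4 × 4 = 64 possible colours, at the cost of much cruder matches. A sketch, with mask_colour as a made-up name:

```shell
# Hypothetical helper: apply an arbitrary mask to one CSS hex colour.
# Usage: mask_colour '#rrggbb' 0xMASK
mask_colour() {
    hex=${1#"#"}                          # strip the leading '#'
    printf '#%06x\n' $(( 0x$hex & $2 ))   # bitwise AND with the given mask
}

mask_colour '#006318' 0xf0f0f0    # four bits per component: #006010
mask_colour '#006318' 0xc0c0c0    # two bits per component:  #004000
```

Whether the result would still look acceptable is a separate question that this sketch does not answer.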

I'll need to visit this again later.

However, what I've reminded myself of today is how powerful the original UNIX toolkit is. Yes, this is engineering of the 1970s, just as Lisp is engineering of the late 1950s and early 1960s. But it's damned good engineering; simple, composable, flexible. And that's why it's still used.

Not all the engineering of those decades is as good, of course; but to get into software systems design in those early days of computing, you had to be very bright indeed. And it shows.
