Every once in a while someone drops into #vim complaining that colour scheme x doesn’t work: it shows all the wrong colours or the wrong background.

The cause of that is usually either a terminal that does not support 256 colours or a TERM that indicates support for no more than eight or 16 colours (e.g. because TERM is set in the shell init files or a broken terminal is used).

This writeup explains how terminal capabilities work, using colours as an example.

This can take a bit of reading, but two basic points have to be explained to grasp how it works: how emulators and applications communicate (escape codes and terminal capabilities) and how colours are represented (CLUTs).

CLUT

CLUT stands for ‘colour lookup table’.

These tables are simple key-value stores that assign a colour to a specific number. Emulators such as xterm or urxvt use them to look up which colour to signal to X (or, more recently, Wayland) after an application has printed the number.

Note: The representation of the number changes with the terminal used to display the output. A terminal emulator in X will use hex codes; a barebone unixoid terminal from the 90s would send the 8bit sequence directly to the VGA card.

The simplest CLUT has only eight colours:

NUMBER  NAME     HEX VALUE
0       BLACK    #000000
1       RED      #FF0000
2       GREEN    #008000
3       YELLOW   #FFFF00
4       BLUE     #0000FF
5       MAGENTA  #FF00FF
6       CYAN     #00FFFF
7       WHITE    #FFFFFF

Note: IBM’s Colour Graphics Adapter (CGA) for the IBM PC could already display 16 colours.

These days at least eight colours are supported even on virtual terminals.

Most modern emulators also allow defining default foreground and background colours, selected with the number 9 (i.e. the codes 39 and 49). This gives the user more control without having to change the actual colours: a user who wants a light scheme can set the default background to a shade of white or light grey and the default foreground to a dark grey, black or red without modifying what the colours in the CLUT really represent. However, what ‘default fg/bg’ means is implementation-defined, so it might as well set foreground to black and background to white.

Each emulator has a default CLUT that can differ from the values above. These are also often modifiable by the user, usually via the .Xresources or .Xdefaults file (see xrdb(1) and your emulator’s man page).
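The eight colour table above can be sketched as a plain array, where the array index is the colour number. This is only an illustration of the lookup idea, not how any particular emulator stores its CLUT; the names clut_entry, clut8 and clut8_lookup are made up for this example.

```c
#include <stddef.h>

// One CLUT entry: a colour name and its RGB value as 0xRRGGBB.
struct clut_entry {
    const char *name;
    unsigned int rgb;
};

// The eight colour table from the text; the index is the colour number.
static const struct clut_entry clut8[8] = {
    { "BLACK",   0x000000 },
    { "RED",     0xFF0000 },
    { "GREEN",   0x008000 },
    { "YELLOW",  0xFFFF00 },
    { "BLUE",    0x0000FF },
    { "MAGENTA", 0xFF00FF },
    { "CYAN",    0x00FFFF },
    { "WHITE",   0xFFFFFF },
};

// Look up a colour number; returns NULL for numbers outside the table.
static const struct clut_entry *clut8_lookup(int number) {
    if (number < 0 || number >= 8)
        return NULL;
    return &clut8[number];
}
```

Changing a colour in the user's configuration then amounts to overwriting one rgb value while the number and the name stay the same.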

Escape codes and the CSI ESC[

ESC[ is a CSI, Control Sequence Introducer, used by applications to signal the emulator that a certain action has to be taken, such as changing a text attribute or moving the cursor.

\e is the C-style escape code for ESC (strictly a GNU extension rather than ANSI C, but widely supported) as well as the character sent by your emulator to the running application when you hit escape. Utilities such as echo and printf accept it, and its octal and hexadecimal equivalents, as a string.

These utilities translate the string representation of an escape sequence into the characters the emulator understands, roughly with a switch-case statement like this:

// take the first argument from argv (argv[0] is the program name)
char const *string = argv[1];
// define a char variable `chr` to work with
unsigned char chr;
// iterate with `chr` over `string`
while ((chr = *string++)) {

    // check if an escape code was signaled and that another character
    // is available
    if (chr == '\\' && *string) {

        // check each escape code and replace the string representation
        // with the actual character
        switch (chr = *string++) {
            case 'a': chr = '\a'; break;
            case 'e': chr = '\e'; break;
            case '0': chr = octal_to_char(&string); break;
            case 'x': chr = hexadecimal_to_char(&string); break;
            // X_to_char() is just a shorthand - it has to pull in and
            // translate as many of the following characters as needed
            // and advance `string` past them
        }

    }

    // print the character
    putchar(chr);

}
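For readers who want to try this, here is a self-contained sketch of the same idea that writes into a buffer instead of printing. It is not the code of echo or printf; decode_escapes and hex_value are hypothetical helpers, and the digit limits (three octal digits after \0, two hexadecimal digits after \x) are an assumption made for this example.

```c
#include <stddef.h>

// Value of a hexadecimal digit, or -1 if c is not one.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode \a, \e, \n, \0NNN and \xHH from `in` into `out`, which must hold
// at least strlen(in) + 1 bytes. Returns the number of bytes written,
// not counting the terminating NUL.
static size_t decode_escapes(const char *in, char *out) {
    size_t n = 0;
    while (*in) {
        unsigned char chr = (unsigned char)*in++;
        if (chr == '\\' && *in) {
            int digits = 0;
            unsigned int value = 0;
            switch (chr = (unsigned char)*in++) {
                case 'a': chr = '\a'; break;
                case 'e': chr = 0x1B; break;  // ESC
                case 'n': chr = '\n'; break;
                case '0':                      // up to three octal digits
                    while (digits < 3 && *in >= '0' && *in <= '7') {
                        value = value * 8 + (unsigned int)(*in++ - '0');
                        digits++;
                    }
                    chr = (unsigned char)value;
                    break;
                case 'x':                      // up to two hexadecimal digits
                    while (digits < 2 && hex_value(*in) >= 0) {
                        value = value * 16 + (unsigned int)hex_value(*in++);
                        digits++;
                    }
                    if (digits)
                        chr = (unsigned char)value;
                    break;
                // anything else is copied without the backslash
            }
        }
        out[n++] = (char)chr;
    }
    out[n] = '\0';
    return n;
}
```

The return value matters because a decoded \0 puts a NUL byte in the middle of the output, so the result cannot always be treated as a C string.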

As an example, the following are equivalent:

ANSI C  Octal  Hexadecimal  Caret
\e      \033   \x1B         ^[
\a      \007   \x07         ^G
\n      \012   \x0A         ^J

Full list with details to C0 and C1

Full table

The caret notation can be used to enter a control code by hand: hold down the control key and press the character that follows the caret (e.g. CTRL-[ for ^[).

This is also why some key combinations in emulators can’t be bound without rebinding others - such as CTRL-[, which is the caret notation for escape, or CTRL-I, which is the caret notation for a horizontal tab. They are indistinguishable to the terminal application, since the emulator sends the same character for them.
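The relation between a caret notation and its control code is plain arithmetic on ASCII values: the control code is the character after the caret with bit 0x40 flipped. A minimal sketch, with caret_to_control being a name invented for this example:

```c
// Map the character written after the caret to the control code it
// stands for. Valid for '@' through '_' (ASCII 0x40-0x5F); '?' maps to
// DEL by convention. Returns -1 for anything else.
static int caret_to_control(char c) {
    if (c == '?')
        return 0x7F;
    if (c >= 'a' && c <= 'z')
        c = (char)(c - 'a' + 'A');  // CTRL-i and CTRL-I give the same code
    if (c < '@' || c > '_')
        return -1;
    return c ^ 0x40;
}
```

caret_to_control('[') gives 0x1B (ESC) and caret_to_control('I') gives 0x09 (horizontal tab), which is exactly why those key combinations cannot be told apart from the escape and tab keys.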

How applications signal colours (and other things) to the emulator

Terminal applications use that CSI and the escape codes to signal to the emulator which action has to be taken - like the previously mentioned colours and cursor movement.

Text attributes are set with ESC[{numbers}m, where {numbers} may be one number or multiple numbers separated by semicolons, and reset with ESC[0m, where the zero may be omitted.

Note: ESC[m resets all text attributes, not only the colours.

Red text on a black background would be signaled with ESC[31;40m and underlined with ESC[4m:

#include <stdio.h>

int main() {

    //    +- begin escape sequence (ANSI C notation)
    //    |  +- foreground colour
    //    |  |+- red
    //    |  || +- background colour
    //    |  || |+- black
    //    |  || ||+- end of text attribute sequence
    puts("\e[31;40m Red text on black background");

    //    +- begin escape sequence (hexadecimal notation)
    //    |    +- underlined text attribute
    //    |    |+- end of text attribute sequence
    puts("\x1B[4m This is red underlined text on black background");

    //    +- begin escape sequence (octal notation)
    //    |    +- end of text attribute sequence
    puts("\033[m This is default");
    //    interpreted as ESC[0m by the emulator
}
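Since the sequence is just ESC[, a semicolon-separated list of numbers and a closing m, it can be assembled from an array of attribute numbers. The following is a minimal sketch; sgr is a hypothetical helper name, not a function of any library.

```c
#include <stdio.h>

// Write ESC[n1;n2;...m for the given attribute numbers into buf.
// With count == 0 this yields ESC[m, the reset sequence.
// Returns the length written, or -1 if buf is too small.
static int sgr(char *buf, size_t size, const int *codes, size_t count) {
    size_t used;
    int n = snprintf(buf, size, "\033[");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;
    for (size_t i = 0; i < count; i++) {
        // every number but the first is preceded by a semicolon
        n = snprintf(buf + used, size - used, "%s%d", i ? ";" : "", codes[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(buf + used, size - used, "m");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

Passing the numbers 31, 40 and 4 produces ESC[31;40;4m, which sets red text, a black background and underlining in a single sequence.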

256 colour terminals

Most modern terminals also support the extended colours - these are signaled with the text attribute codes 38 and 48 for foreground and background respectively in the following form: ESC[{fg,bg};5;{number}m

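The 5; specifies that an index into the 256 colour CLUT follows. The index fits into 8 bits, hence 256 entries, which xterm and most other emulators lay out as follows:

NUMBER   CONTENT
0-15     the eight standard colours and their bright variants
16-231   a 6x6x6 colour cube (16 + 36*r + 6*g + b, with r, g, b from 0 to 5)
232-255  a greyscale ramp with 24 steps

 16 + 6^3 + 24 = 256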

The {number} is the index of a colour in the 256 colour CLUT.

Each of these colours also has an associated name that may be used in some applications, viewable in /usr/share/X11/rgb.txt if your distro’s maintainer put it there.

Example:

#include <stdio.h>

int main() {
    //    +- begin escape code
    //    |  +- signal extended foreground colour sequence
    //    |  |  +- signal 8bit colour sequence
    //    |  |  | +- colour number
    //    |  |  | | +- end of text attribute sequence
    puts("\e[38;5;34m green foreground \e[m");
}
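Why 34 is a green can be worked out from the layout of the 256 colour CLUT. The sketch below assumes the xterm-style defaults (entries 16-231 form a 6x6x6 colour cube, 232-255 a greyscale ramp); other emulators may ship different values and users may redefine entries. cube_index and default_rgb are names made up for this example.

```c
// Index of the colour cube entry for r, g, b levels in the range 0-5.
static int cube_index(int r, int g, int b) {
    return 16 + 36 * r + 6 * g + b;
}

// Default value (0xRRGGBB) for an index in 16-255, assuming xterm's
// defaults. Returns -1 for 0-15, whose values differ between emulators,
// and for anything out of range.
static long default_rgb(int index) {
    static const int level[6] = { 0, 95, 135, 175, 215, 255 };
    if (index >= 16 && index <= 231) {
        int i = index - 16;
        long r = level[i / 36];
        long g = level[(i / 6) % 6];
        long b = level[i % 6];
        return (r << 16) | (g << 8) | b;
    }
    if (index >= 232 && index <= 255) {
        long v = 8 + 10 * (index - 232);  // 24 grey steps
        return (v << 16) | (v << 8) | v;
    }
    return -1;
}
```

With these assumptions cube_index(0, 3, 0) is 34 and default_rgb(34) is 0x00AF00, a medium green.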

True colour / 24bit colour terminals

True colour terminals have the advantage that elaborate colourschemes do not rely on the user having to modify their CLUT to work correctly.

This is a pleasant change, since modifying the CLUT for the colourscheme for one application means that every other application is also affected by the new CLUT.

This is achieved by separately passing the red, green and blue parts to the emulator according to ISO 8613-3: ESC[{fg,bg};2;{red};{green};{blue}m, with each part being signaled by a number between 0 and 255.

However, the specifics are implementation-defined - your mileage may vary.

Example:

#include <stdio.h>

int main() {
    //    +- begin escape code
    //    |  +- signal extended foreground colour sequence
    //    |  |  +- signal 24bit colour sequence
    //    |  |  | +- red part
    //    |  |  | |   +- green part
    //    |  |  | |   |   +- blue part
    //    |  |  | |   |   |  +- end of text attribute sequence
    puts("\e[38;2;255;255;255m White foreground \e[m");
}
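In practice an application rarely hard-codes such a string but builds it from the red, green and blue values of the colourscheme. A minimal sketch of that step follows; truecolour is a hypothetical helper, and it uses the semicolon-separated form shown above, which not every emulator accepts.

```c
#include <stdio.h>

// Write the 24bit colour sequence for an RGB triple into buf.
// background == 0 selects the foreground (38), anything else the
// background (48). Returns the length written, or -1 on error.
static int truecolour(char *buf, size_t size, int background,
                      unsigned r, unsigned g, unsigned b) {
    if (r > 255 || g > 255 || b > 255)
        return -1;
    int n = snprintf(buf, size, "\033[%d;2;%u;%u;%um",
                     background ? 48 : 38, r, g, b);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling truecolour(buf, sizeof buf, 0, 255, 255, 255) fills buf with the same sequence as the example above.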

How applications detect colour (and other) capabilities of terminals

Terminals signal their capabilities using terminfo databases (see terminfo(5) for a deeper explanation), accessible e.g. by tput(1).

An emulator sets the environment variable TERM and executes a command. By default this is the user’s default shell (what SHELL contains, usually set by chsh(1) or on user creation), otherwise whatever was passed to the emulator as the command to execute.

The application then uses the content of TERM to look up the terminal’s capabilities, such as text attributes, in the terminfo databases.

Try it yourself:

$ TERM="screen" tput colors
8
$ TERM="screen-256color" tput colors
256

The screen terminfo is only capable of eight colours, the simplest CLUT. screen-256color on the other hand supports the extended 256 colours. Based on these entries applications such as vim, mutt, newsbeuter and IRC clients decide how and which escape codes to send to the emulator.

Note: A true colour terminal may return 888 as ‘number’ of possible colours, which represents that each part of the colour is 8 bits wide, hence 24 bits of colour. Since each part can be between 0 and 255, true colour results in 256^3 = 16777216 different combinations.

The same goes for all capabilities and terminfos, e.g.:

$ TERM="screen" tput sitm
$ echo $?
1
$ TERM="tmux" tput sitm
$ echo $?
0

sitm is the capability for the italic text attribute (a full list of the capabilities can be found in terminfo(5) mentioned above). When the capability is not a number the exit code has to be checked, since tput(1) sets an exit code of 1 if the capability is not set for that terminfo. It’d be the same if the terminfo signals that it doesn’t support colours at all.

Note: The tmux terminfo is relatively new and included since ncurses 6.0 - if you run a stable distro you can use the rxvt-unicode terminfo to test for sitm instead.

Putting it all together

With the accumulated knowledge on how applications signal colours to the emulator and how the emulator signals its capabilities to applications, it becomes clear how colours work as a whole:

  1. The emulator is started and sets the environment variable TERM according to its capabilities
  2. The shell ‘inside’ the emulator is started (I am quoting ‘inside’, because it is actually not correct and a lot more is going on, but that’ll come another time)
  3. The user starts an application from the interactive shell
  4. The application detects the capabilities and prints escape codes according to those detected capabilities based on TERM

Point 4 also includes that the application checks how many colours the emulator supports - if the terminfo signals that it only supports eight colours the application will only print these eight colours.
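The decision in point 4 can be sketched as a function that takes the number of colours reported by the terminfo entry (the value printed by tput colors) and only emits a sequence the terminal is expected to understand. This is a simplification, not what vim or any other application actually does: fg_sequence is a made-up name and 16 colour terminals are left out.

```c
#include <stdio.h>

// Write the foreground sequence for `colour` into buf, given the number
// of colours the terminfo entry reports. Returns the length written;
// 0 means nothing should be sent because the terminal cannot show it.
static int fg_sequence(char *buf, size_t size, int max_colours, int colour) {
    int n;
    if (size == 0)
        return 0;
    buf[0] = '\0';
    if (colour < 0 || colour >= max_colours)
        return 0;
    if (colour < 8)
        n = snprintf(buf, size, "\033[3%dm", colour);      // basic colours
    else if (max_colours >= 256)
        n = snprintf(buf, size, "\033[38;5;%dm", colour);  // extended colours
    else
        return 0;  // 16 colour terminals are not handled in this sketch
    if (n < 0 || (size_t)n >= size) {
        buf[0] = '\0';
        return 0;
    }
    return n;
}
```

With max_colours set to 8, a request for colour 34 yields nothing, which is the reason a 256 colour scheme looks wrong when TERM points at an eight colour terminfo entry.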

This works fine, since each emulator should set the correct TERM by default and the user should never touch this variable without good reason.

However, terminals in binary distributions often install two versions, an eight colour and a 256 colour version, executable with $emulator and $emulator-256color. Both run the same code with different TERMs, so e.g. urxvt runs with TERM=rxvt-unicode even though the code itself does support 256 colours.

Users with no prior knowledge of how exactly terminals and their capabilities work then set TERM by hand in their shell init files. In the previous example such a user would put export TERM=rxvt-unicode-256color in e.g. .bash_profile.

This breaks as soon as the user runs an emulator other than urxvt, a multiplexer, or an application that detects capabilities not with the terminfo database but by the content of TERM (TERM-sniffing, almost as bad as user agent sniffing in web browsers).

Why modifying the CLUT for a specific colourscheme can be bad

Colourschemes that require the user to modify the CLUT usually also require using that colourscheme for all applications that support colours in some way.

The default colours are chosen in a way that they are distinct enough not to cause confusion, but a modified CLUT can change that, resulting in indistinguishable colours in some applications.

Modifying the CLUT in a way that the colours are completely different is bad too:

Most applications allow the user to use names instead of colour codes. Assume a user changed the eight colour CLUT so that the first colour, 0, is red (#FF0000) and the second colour, 1, is a shade of blue (#0000F8):

NUMBER  NAME   ORIGINAL VALUE  NEW VALUE
0       BLACK  #000000         #FF0000
1       RED    #FF0000         #0000F8

Within an application the following configuration is applied:

color.fg = red
color.bg = black

These are translated by the application to:

color.fg = color1
color.bg = color0

Since red is colour 1 and black is colour 0 in a traditional CLUT.

That results in an escape code \e[31;40m which, with an unmodified CLUT, actually would print a red foreground and a black background - but since the CLUT was changed it now prints a blueish foreground and a red background.
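The mismatch can be reproduced with two small tables: the application resolves a name to a number using the traditional order, while the emulator resolves the number to a colour using whatever CLUT is configured. A minimal sketch, with all identifiers invented for this example:

```c
#include <string.h>

// The standard colour names in index order, as applications assume them.
static const char *const names[8] = {
    "black", "red", "green", "yellow", "blue", "magenta", "cyan", "white"
};

// Application side: resolve a colour name to its number, -1 if unknown.
static int name_to_index(const char *name) {
    for (int i = 0; i < 8; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    return -1;
}

// Emulator side: the first two entries of the original and the
// modified CLUT from the table above, as 0xRRGGBB.
static const unsigned original_clut[2] = { 0x000000, 0xFF0000 };
static const unsigned modified_clut[2] = { 0xFF0000, 0x0000F8 };
```

Here modified_clut[name_to_index("red")] is 0x0000F8: the application asked for red by name and still sends 31, but the emulator paints blue.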