A Geometric Calculator Inside a Neural Network

  • A Geometric Calculator Inside a Neural Network

    The way LLMs perform numerical arithmetic with circles and spirals is fascinating. This page explores that topic using Llama 3.1 8B.

    Language models use a group of circles in activation space to represent a single number. Each circle corresponds to the number modulo a second number, i.e., the remainder after division.[1] For example, the number 17 would be represented as a 1 on the mod-2 circle, 2 on the mod-5 circle, 7 on the mod-10 circle, and 17 on the mod-100 circle.[2] Several prior works have established that circular features exist across multiple different LLMs [...]
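    To make the "group of circles" idea concrete, here is a minimal sketch of the encoding described above. It is only an illustration of the arithmetic, not the model's actual features: the periods are the ones from the quoted example, and the names `PERIODS`, `residues`, `encode`, and `decode_residue` are hypothetical helpers made up for this sketch.

```python
import math

# Periods taken from the quoted example (mod 2, 5, 10, 100).
# Assumption: chosen for illustration; a real model may use other periods.
PERIODS = (2, 5, 10, 100)


def residues(n, periods=PERIODS):
    """Return n modulo each period, i.e. the position on each circle."""
    return {T: n % T for T in periods}


def encode(n, periods=PERIODS):
    """Place n on each circle as a (cos, sin) point at angle 2*pi*(n mod T)/T."""
    points = {}
    for T in periods:
        angle = 2 * math.pi * (n % T) / T
        points[T] = (math.cos(angle), math.sin(angle))
    return points


def decode_residue(point, T):
    """Recover the residue from a point on the mod-T circle."""
    x, y = point
    angle = math.atan2(y, x) % (2 * math.pi)  # map into [0, 2*pi)
    return round(angle * T / (2 * math.pi)) % T
```

    For 17, `residues(17)` gives 1, 2, 7, and 17 for the four circles, matching the example in the quote, and `decode_residue` reads the same values back from the points produced by `encode(17)`.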

    Using a bunch of circles to represent a number probably seems like an alien solution, but it is a common mathematical technique known as a Fourier decomposition (see the paper for more detail).

    Each of the inputs and the output of the addition module is represented using such a set of circles, and the circuitry within the module works by doing computations over these circles.
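    One way to see why circles suit addition: adding two numbers modulo T is the same as adding their angles on the mod-T circle, which is a rotation. The sketch below shows that identity using the angle-addition formulas. It is an assumption-laden illustration of the mathematics only, not a reconstruction of the circuit found in the model, and the helper names (`to_circle`, `add_on_circle`, `from_circle`, `add_via_circles`) are hypothetical.

```python
import math


def to_circle(n, T):
    """Point on the mod-T circle for integer n."""
    angle = 2 * math.pi * (n % T) / T
    return (math.cos(angle), math.sin(angle))


def add_on_circle(p, q):
    """Add the angles of two unit-circle points (angle-addition formulas)."""
    (px, py), (qx, qy) = p, q
    return (px * qx - py * qy, py * qx + px * qy)


def from_circle(point, T):
    """Read the residue back off a point on the mod-T circle."""
    x, y = point
    angle = math.atan2(y, x) % (2 * math.pi)  # map into [0, 2*pi)
    return round(angle * T / (2 * math.pi)) % T


def add_via_circles(a, b, periods=(2, 5, 10, 100)):
    """Compute (a + b) mod T on each circle without adding a and b directly."""
    return {
        T: from_circle(add_on_circle(to_circle(a, T), to_circle(b, T)), T)
        for T in periods
    }
```

    For instance, `add_via_circles(17, 38)` yields residues 1, 0, 5, and 55 on the mod-2, mod-5, mod-10, and mod-100 circles, which are exactly the residues of 55.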

    Tags: llms language arithmetic maths calculation fourier circles