A language model that boots as firmware

Most of the AI conversation assumes a data center on the other end of a wire. Atome asks a different question: how small does a language model have to be before it can live entirely inside a two-dollar chip?

The target

A bare microcontroller, the kind that costs about $2 retail and shows up in toys, sensors, and appliances. No operating system. No network. No floating-point unit you can rely on. In its default 60K-parameter configuration, Atome compiles to roughly 20 KB of flash and runs as the device's firmware.

Ternary all the way down

Atome's weights are ternary, so the core matrix multiply needs no floating-point multiplication on the hot path — just adds and sign flips. That's what makes inference viable on hardware that has no business running a language model. The same three-value idea runs through the whole Tilelli line; here it's pushed to the hardware floor.
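To make "adds and sign flips" concrete, here is a minimal sketch of a ternary matrix-vector product in C99. It is an illustration, not Atome's actual kernel: the function name `ternary_matvec`, the one-weight-per-`int8_t` storage, and the integer activation types are all assumptions made for readability. A real engine would likely pack several weights per byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the Atome source.
 * Computes y = W * x where every weight is -1, 0, or +1.
 * W is row-major, one weight per int8_t (an assumption for clarity). */
void ternary_matvec(const int8_t *w, const int16_t *x, int32_t *y,
                    int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    for (int r = 0; r < rows; r++) {
        int32_t acc = 0;
        for (int c = 0; c < cols; c++) {
            int8_t t = w[r * cols + c];
            if (t > 0) {
                acc += x[c];      /* weight +1: add the activation */
            } else if (t < 0) {
                acc -= x[c];      /* weight -1: subtract it (sign flip) */
            }
            /* weight 0: the term is skipped entirely */
        }
        y[r] = acc;
    }
}
```

With W = [[1, 0, -1], [-1, 1, 1]] and x = [5, 7, 2], this yields y = [3, 4]. The inner loop contains no multiply, which is the property that matters on a core with no dependable floating-point unit.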

The claim we actually make

After a prior-art audit, we narrowed the headline to something we can defend: the first ternary LM with bit-exact Python ↔ C99 parity in a zero-heap C99 engine on $2 microcontrollers. "Bit-exact" is the load-bearing word: the reference Python and the on-chip C produce the same outputs, as verified by tests. We also confirmed the match on an ARM Cortex-M3 (under QEMU), where the outputs agree down to FP32 epsilon. The engine performs no heap allocation anywhere.
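One way to test a claim like this is to compare outputs by their raw bit patterns rather than with a tolerance. The sketch below shows that idea in C99, using only storage with static duration. The names `sketch_out`, `f32_bits`, and `outputs_bit_exact` are hypothetical, and this is not the project's actual test harness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size output buffer: static storage duration,
 * sized at compile time, so no malloc/free is ever needed. */
enum { SKETCH_OUT_LEN = 4 };
float sketch_out[SKETCH_OUT_LEN];

/* Return the raw 32-bit pattern of a float.
 * memcpy avoids undefined behavior from pointer type punning. */
uint32_t f32_bits(float v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof u);
    return u;
}

/* Return 1 if both arrays match bit for bit, 0 otherwise.
 * Stricter than ==: for example, 0.0f and -0.0f compare equal
 * with == but have different bit patterns. */
int outputs_bit_exact(const float *ref, const float *got, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (f32_bits(ref[i]) != f32_bits(got[i])) {
            return 0;
        }
    }
    return 1;
}
```

In a parity test of this kind, `ref` would hold values exported from the Python reference and `got` the values produced by the C engine for the same input.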

What it isn't

It isn't a chatbot in your pocket. A 60K-parameter model is a demonstration of a deployment floor, not a frontier assistant. We did not hand-roll exotic cryptography or claim capabilities the size doesn't support. The interesting thing is the engineering envelope: a real, reproducible LM that fits where one supposedly can't.

It's open under Apache-2.0 at github.com/TilelliLab/atome-lm, with the product page at atomelm.com.

Published 29 May 2026 · Corrections: hello@tilelli.tech