Why 1-Bit Transformers Will Change the World

First published on HackerNoon

Thomas Cherickal
2 min read · Oct 11, 2024


Welcome to the New World!

Transformers, once the domain of companies and research organizations with 8-figure budgets, are undergoing the biggest disruptive change in their (short) history.

Once upon a time, state-of-the-art research work required incredibly expensive clusters of A100 and V100 graphics processing units (GPUs) that cost millions of dollars to buy, run, and maintain.

However, things have changed.

Welcome to the age of 1-bit transformers, which run on desktop CPUs and low-end GPUs without sacrificing performance, capability, or inference capacity!

The Scenario

The 1-bit Transformer was first introduced by Kim et al. (2020) as a way to reduce the memory footprint and computational complexity of the original Transformer architecture.

The key idea behind 1-bit Transformers is to quantize the weights and activations of the model to 1-bit values, i.e., -1 or 1.
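Here is a minimal sketch of that idea in Python, assuming a simple sign-based scheme with a single per-tensor scaling factor (the absolute-mean scale used here is one common choice, not necessarily the exact recipe from the paper):

```python
import numpy as np

def binarize_weights(W: np.ndarray):
    """Quantize a float weight matrix to {-1, +1} plus one scalar scale.

    The scale alpha (mean absolute value) lets the binary matrix
    approximate the original magnitudes: W is roughly alpha * sign(W).
    """
    alpha = np.abs(W).mean()           # per-tensor scaling factor
    W_bin = np.where(W >= 0, 1, -1)    # 1-bit weights: -1 or +1
    return W_bin.astype(np.int8), alpha

# Example: a small random weight matrix
W = np.random.randn(4, 4).astype(np.float32)
W_bin, alpha = binarize_weights(W)
print(W_bin)                               # entries are only -1 or +1
print(np.abs(W - alpha * W_bin).mean())    # average approximation error
```

Each weight now needs a single bit of storage instead of 16 or 32, which is where the dramatic memory savings come from.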

This quantization process not only reduces the memory requirements of the model but also enables the use of simple binary operations (multiplying by -1 or +1 is just a sign flip), which are significantly faster than full floating-point multiplications.
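To see why this is cheaper, here is a toy forward pass where the usual multiply-accumulate collapses into additions and subtractions; the function name and shapes are illustrative, not taken from any particular library:

```python
import numpy as np

def binary_linear(x: np.ndarray, W_bin: np.ndarray, alpha: float) -> np.ndarray:
    """Compute alpha * (W_bin @ x) with no multiplications in the inner sum.

    Because every weight is -1 or +1, each product w * x_j is either
    +x_j or -x_j, so the dot product is just additions and subtractions.
    """
    out = np.empty(W_bin.shape[0], dtype=np.float32)
    for i, row in enumerate(W_bin):
        acc = 0.0
        for w, xj in zip(row, x):
            acc += xj if w > 0 else -xj    # sign flip instead of multiply
        out[i] = acc
    return alpha * out                      # single scale applied at the end

# Tiny demo with hand-made 1-bit weights
W_bin = np.array([[1, -1, 1], [-1, -1, 1]], dtype=np.int8)
x = np.array([0.5, 2.0, -1.0], dtype=np.float32)
print(binary_linear(x, W_bin, alpha=0.1))
# Matches the ordinary floating-point matmul:
print(0.1 * (W_bin.astype(np.float32) @ x))
```

On real hardware these sign flips and additions map onto very cheap integer or bitwise instructions, which is why inference becomes feasible on ordinary CPUs.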
