Attention Is All You Need — The Transformer
Google researchers published the Transformer architecture, replacing sequential processing with attention mechanisms and enabling the AI revolution that followed.
In June 2017, a team of Google researchers — Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, and Illia Polosukhin — published "Attention Is All You Need," introducing the Transformer architecture. Unlike previous models that processed text sequentially (word by word), the Transformer used "self-attention" mechanisms to process all words simultaneously, understanding how each word relates to every other word in a sentence. This parallelism enabled dramatic scaling. The Transformer became the foundation for virtually every major AI model that followed: BERT, GPT-2, GPT-3, GPT-4, Claude, Gemini, LLaMA, and the entire large language model ecosystem. It is arguably the most consequential machine learning paper ever published.
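The core operation the paper names is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V: every query scores every key at once, and the softmax weights mix the value vectors. A minimal pure-Python sketch (dependency-free; the toy matrices are illustrative, not from the paper) shows why this is trivially parallel, since each token's output depends only on a weighted sum, not on a recurrence over previous tokens:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of equal-length float vectors. Each query attends
    to every key simultaneously -- there is no sequential recurrence,
    so all rows could be computed in parallel.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # Dot-product similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output is a convex combination of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy self-attention: 3 "tokens" of dimension 2, with Q = K = V.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
```

In the full architecture this operation is run in several parallel "heads" over learned linear projections of the input, then stacked into encoder and decoder layers; the sketch above covers only the single-head core.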