2017Peer-reviewedWell documented
The Transformer
On the timeline · around 2017 ·
What happened
Researchers at Google published 'Attention Is All You Need,' introducing the Transformer — a neural network architecture built entirely on 'attention' mechanisms, dispensing with the recurrence used before. It trained faster and handled long-range context far better.
Why it matters
The Transformer became the foundation of virtually every modern large language model, from BERT to GPT — arguably the single most consequential AI architecture of the era.
Sources
- Vaswani et al., arXiv. Attention Is All You Need (2017) · Peer-reviewed