2018 · Peer-reviewed · Well documented

BERT and the Rise of Large Language Models

On the timeline · around 2018

What happened

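In October 2018, Google researchers released BERT (Bidirectional Encoder Representations from Transformers), a Transformer-based model pre-trained on large amounts of unlabeled text. Instead of reading a sentence only left to right, BERT learns by predicting hidden words from the context on both sides at once. On release it set new state-of-the-art results on eleven language tasks, including question answering and natural language inference.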
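To make "context from both directions" concrete, here is a minimal sketch of the masking step used in BERT-style pre-training: some tokens are hidden, and the model is trained to recover them from the words on either side. The function name mask_tokens, the toy vocabulary, and the per-token coin flip are illustrative assumptions, not the authors' code; the 15% rate and the 80/10/10 split follow the figures reported in the BERT paper.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Corrupt a token list for masked-word prediction (hypothetical helper).

    Returns (corrupted, labels). labels[i] is the original token where the
    model must make a prediction, and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        # Simplification: an independent coin flip per token, rather than
        # sampling exactly 15% of the positions.
        if rng.random() < mask_prob:
            labels.append(tok)  # target: recover the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)                # 10%: leave unchanged
        else:
            labels.append(None)  # not a prediction target
            corrupted.append(tok)
    return corrupted, labels

# Usage: a high mask_prob so the effect is visible on a short sentence.
sentence = "the cat sat on the mat".split()
corrupted, labels = mask_tokens(sentence, vocab=sentence, mask_prob=0.5)
print(corrupted)
print(labels)
```

Because a hidden word can sit anywhere in the sentence, the model has to use the words before and after it to fill the gap, which is what separates this setup from a purely left-to-right language model.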

Why it matters

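BERT showed that a single large model, pre-trained once on unlabeled text, could be fine-tuned with little task-specific engineering and still beat systems built specially for each task. That pre-train-then-adapt recipe would go on to drive the explosion of large language models.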

Sources

Devlin et al., arXiv. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018) · Peer-reviewed