Pre-Training
-
Masked Language Model
Predicts randomly masked tokens from context; primary pre-training objective for bidirectional encoders like BERT.
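As a concrete illustration, a minimal sketch of masked-token prediction, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is specified by this entry):

```python
# Minimal masked language modeling demo (assumes `transformers` is installed).
from transformers import pipeline

# The fill-mask pipeline wraps a masked language model: it scores vocabulary
# items for the [MASK] position using the surrounding (bidirectional) context.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("The capital of France is [MASK]."):
    # Each candidate comes with the filled-in token and its probability score.
    print(prediction["token_str"], round(prediction["score"], 3))
```

The same snippet doubles as a cloze-style demo: the [MASK] slot is a blank that must be filled from the context on both sides.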
-
Fine-Tuning
Adapting a pre-trained model to a downstream task by training on task-specific data; standard approach in modern NLP.
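A hedged sketch of the fine-tuning recipe, assuming the Hugging Face transformers and datasets libraries; the IMDB dataset, checkpoint, and hyperparameters are illustrative choices, not prescribed by this entry:

```python
# Fine-tuning a pre-trained encoder for binary sentiment classification.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

# Start from the pre-trained weights and attach a fresh classification head.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    # A small subset keeps the sketch cheap to run; real runs use the full split.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```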
-
Cloze Task
Fill-in-the-blank task in which words removed from a passage must be inferred from the surrounding context; BERT's masked language modeling objective is a cloze-style task.
-
BERT
Bidirectional Encoder Representations from Transformers; a transformer encoder pre-trained with masked language modeling (and next-sentence prediction), then fine-tuned for a wide range of downstream NLP tasks.
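A minimal sketch of what the pre-trained encoder produces before any fine-tuning: one contextual vector per token, computed from both left and right context. PyTorch and the transformers library are assumed:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT reads left and right context at once.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional hidden vector per token for the base model.
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```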