Pre-Training
-
Masked Language Model
Predicts randomly masked tokens from context; primary pre-training objective for bidirectional encoders like BERT.
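As a concrete illustration, a minimal sketch of masked-token prediction, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is specified by this entry):

```python
# Minimal masked language modeling demo (assumes `transformers` is installed).
from transformers import pipeline

# The fill-mask pipeline wraps a masked language model: it scores vocabulary
# items for the [MASK] position using the surrounding (bidirectional) context.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("The capital of France is [MASK]."):
    # Each candidate comes with the filled-in token and its probability score.
    print(prediction["token_str"], round(prediction["score"], 3))
```

The same snippet doubles as a cloze-style demo: the [MASK] slot is a blank that must be filled from the context on both sides.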
-
Fine-Tuning
Adapting a pre-trained model to a downstream task by training on task-specific data; standard approach in modern NLP.
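A hedged sketch of the fine-tuning recipe, assuming the Hugging Face transformers and datasets libraries; the IMDB dataset, checkpoint, and hyperparameters are illustrative choices, not prescribed by this entry:

```python
# Fine-tuning a pre-trained encoder for binary sentiment classification.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

# Start from the pre-trained weights and attach a fresh classification head.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    # A small subset keeps the sketch cheap to run; real runs use the full split.
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```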
-
Cloze Task
Fill-in-the-blank task in which words removed from a passage must be inferred from the surrounding context; BERT's masked language modeling objective is a cloze-style task.
-
BERT
Bidirectional Encoder Representations from Transformers; a transformer encoder pre-trained with masked language modeling (and next-sentence prediction), then fine-tuned for a wide range of downstream NLP tasks.
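A minimal sketch of what the pre-trained encoder produces before any fine-tuning: one contextual vector per token, computed from both left and right context. PyTorch and the transformers library are assumed:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT reads left and right context at once.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional hidden vector per token for the base model.
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```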