Language Models
-
Zero-Shot Learning
Model performs a task without task-specific training examples; relies on pre-training and a natural-language task description.
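A minimal sketch of what a zero-shot prompt looks like in code; the task wording and the `complete` function are hypothetical stand-ins for any text-completion API:

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in for a real language-model completion call."""
    raise NotImplementedError("plug in an actual LLM client here")

# Zero-shot: the prompt carries only a task description, no worked examples.
prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)
# answer = complete(prompt)  # the model must rely on pre-training alone
```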
-
Prompt Engineering
Crafting input text to elicit desired behavior from language models without retraining; a critical skill for working with modern LLMs.
-
Perplexity
Exponentiated negative average log probability; measures how well a language model predicts a sample. Lower is better.
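Written out, for a sample of N tokens: PPL = exp(-(1/N) * sum_i log p(x_i | x_<i)). A minimal Python sketch with made-up per-token log-probabilities:

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """Exponentiated negative mean log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Made-up natural-log probabilities for a 4-token sample.
print(perplexity([-0.5, -1.2, -0.3, -2.0]))  # ~2.718; lower is better
```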
-
N-Gram Language Model
Language model estimating token probabilities from observed n-gram counts; foundation of statistical NLP before neural methods.
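A toy bigram (n = 2) estimator built from raw counts, to make the idea concrete; real systems add smoothing so unseen n-grams do not get zero probability:

```python
from collections import Counter, defaultdict

def train_bigram(corpus: list[list[str]]) -> dict:
    """Estimate p(next | prev) from raw bigram counts (no smoothing)."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for prev, followers in counts.items()
    }

model = train_bigram([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(model["the"])  # {'cat': 0.5, 'dog': 0.5}
```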
-
Language Model
Probability distribution over sequences of tokens; predicts next token given context. Foundation of NLP from n-grams to large language models.
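The chain-rule factorization this definition refers to, written out as a formula:

```latex
% A language model factorizes sequence probability into per-token
% conditionals, each conditioned on the preceding context.
P(x_1, \dots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \dots, x_{t-1})
```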
-
Instruction Tuning
Fine-tuning language models on diverse (instruction, response) pairs to improve generalization and the ability to follow natural-language instructions.
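A sketch of what one training record might look like; the field names are illustrative, not a fixed standard:

```python
# One (instruction, response) pair; field names are illustrative only.
example = {
    "instruction": "Summarize the following paragraph in one sentence.",
    "input": "Language models assign probabilities to token sequences ...",
    "response": "A language model estimates how likely a token sequence is.",
}
# Fine-tuning maximizes the likelihood of `response` given the instruction
# (and optional input), across many such pairs from diverse tasks.
```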
-
Hallucination
Generating plausible-sounding but factually incorrect content; a key limitation of language models, especially on knowledge-intensive tasks.
-
Grounding
Connecting model outputs to verifiable external sources; reduces hallucination by anchoring generation in retrieved facts or documents.
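A minimal retrieve-then-generate sketch; `search` and `complete` are hypothetical stand-ins for a document retriever and an LLM client:

```python
def search(query: str, k: int = 3) -> list[str]:
    """Hypothetical retriever returning k relevant passages."""
    raise NotImplementedError

def complete(prompt: str) -> str:
    """Hypothetical language-model completion call."""
    raise NotImplementedError

def grounded_answer(question: str) -> str:
    # Anchor generation in retrieved text rather than parametric memory alone.
    passages = search(question)
    prompt = (
        "Answer using only the sources below; reply 'unknown' if they do not "
        "contain the answer.\n\nSources:\n" + "\n".join(passages)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
    return complete(prompt)
```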
-
GPT
Generative Pre-trained Transformer; family of autoregressive decoder-only models for text generation and language understanding, released by OpenAI from 2018 onward.
-
Few-Shot Learning
Model generalizes from a small number of examples given in the prompt, without retraining; an ability enabled by scale in large language models.
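The contrast with zero-shot is visible in the prompt itself: the examples live in the context, and the weights never change. A sketch (task, labels, and format are illustrative; `complete` stands in for any LLM API):

```python
# Few-shot: two worked examples precede the query; no gradient updates occur.
prompt = (
    "Review: Great sound quality. -> positive\n"
    "Review: Stopped working in a week. -> negative\n"
    "Review: The screen is bright and sharp. ->"
)
# answer = complete(prompt)  # the model infers the pattern in-context
```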
-
Context Window
Maximum number of tokens a language model can process in one pass; determines how much context the model sees. Typical values range from 512 to 128k tokens.
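One practical consequence: inputs longer than the window must be truncated or chunked. A sketch using a whitespace split as a stand-in for a real subword tokenizer:

```python
def truncate_to_window(text: str, window: int = 512) -> str:
    """Keep only the most recent `window` tokens (whitespace split as a
    stand-in for a real tokenizer)."""
    tokens = text.split()
    return " ".join(tokens[-window:])  # keep the tail: the freshest context

print(truncate_to_window("a b c d e", window=3))  # 'c d e'
```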
-
Causal Language Model
Predicts the next token from all previous tokens; the autoregressive objective behind generative models like GPT, enabling free-form text generation.
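The objective amounts to shifting the sequence by one position: every token is the training target for the prefix before it. A dependency-free sketch of the resulting (context, target) pairs:

```python
def causal_pairs(tokens: list[str]):
    """Yield (context, next-token) pairs for the autoregressive objective:
    predict token t from tokens 1..t-1."""
    for t in range(1, len(tokens)):
        yield tokens[:t], tokens[t]

for context, target in causal_pairs(["the", "cat", "sat"]):
    print(context, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat
```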
-
BERT
Bidirectional Encoder Representations from Transformers; encoder-only transformer pre-trained with masked language modeling; foundational for many NLP tasks.
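A simplified sketch of the masked-language-modeling objective: select roughly 15% of tokens (the rate used in the BERT paper) and train the model to recover them from both left and right context. The corruption here is simplified; the paper's scheme also sometimes keeps or randomizes selected tokens instead of always masking:

```python
import random

def mask_tokens(tokens: list[str], rate: float = 0.15, seed: int = 0):
    """Simplified MLM corruption: replace ~rate of tokens with [MASK];
    targets are the original tokens at the masked positions."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            corrupted.append("[MASK]")
            targets[i] = tok  # must be predicted from both directions
        else:
            corrupted.append(tok)
    return corrupted, targets

# High rate used here only so the toy demo visibly masks something.
print(mask_tokens("the cat sat on the mat".split(), rate=0.5))
```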