Part-of-Speech Tagging

What it is

Part-of-speech (POS) tagging is a sequence labeling task that assigns each word in a text a grammatical category (noun, verb, adjective, pronoun, etc.). The tags feed downstream parsing, entity recognition, and linguistic analysis. Most modern POS taggers are neural, often built on a pretrained encoder such as BERT, and reach 97%+ token accuracy on standard English benchmarks.

[illustrate: Text with POS tags above each token: “The/DT cat/NN sat/VBD on/IN the/DT mat/NN”]

How it works

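Tag sets: Standard schemes include the Penn Treebank tag set (48 tags: 36 word-level tags plus 12 for punctuation and symbols) and Universal POS (17 tags)
- Universal POS: NOUN, VERB, ADJ, ADV, PRON, DET, ADP, CCONJ, SCONJ, PUNCT, etc.
- Penn Treebank (finer-grained): NN, NNS, VBD, VBZ, JJ, DT, IN, etc.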
Tagging approaches:
- Rule-based: Hand-crafted rules (rare now)
- Statistical: HMM, CRF with hand-engineered features
- Neural: BiLSTM taggers or fine-tuned Transformer encoders
Modern approach:
- Encode tokens with BERT or a similar pretrained encoder
- Classify each token’s contextual embedding into a POS tag
- Often trained jointly with related tasks (NER, lemmatization)
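
To make the statistical approach concrete, here is a minimal sketch of Viterbi decoding for an HMM tagger. The three-tag inventory and every probability in the tables are invented for illustration (a real tagger estimates them from an annotated corpus), and the names TRANS, EMIT, and viterbi are hypothetical. The point is only to show how transition probabilities let context decide between the noun and verb readings of an ambiguous word.

```python
import math

# Toy model: every number below is made up for illustration,
# not estimated from any corpus.
TAGS = ["DET", "NOUN", "VERB"]

# P(tag | previous tag); "<s>" marks the start of the sentence.
TRANS = {
    "<s>":  {"DET": 0.6,  "NOUN": 0.3, "VERB": 0.1},
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1,  "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5,  "NOUN": 0.4, "VERB": 0.1},
}

# P(word | tag); "bank" and "banks" are ambiguous between NOUN and VERB.
EMIT = {
    "DET":  {"the": 0.9, "a": 0.1},
    "NOUN": {"bank": 0.4, "banks": 0.2, "pilot": 0.3, "plane": 0.1},
    "VERB": {"bank": 0.3, "banks": 0.5, "sat": 0.2},
}

UNSEEN = 1e-6  # crude floor for events missing from the tables


def log_p(table, key):
    """Log probability of key, falling back to the UNSEEN floor."""
    return math.log(table.get(key, UNSEEN))


def viterbi(words):
    """Return the most probable tag sequence for words under the toy HMM."""
    if not words:
        return []
    # best[i][tag] = (log prob of the best path ending in tag at i, previous tag)
    best = [{t: (log_p(TRANS["<s>"], t) + log_p(EMIT[t], words[0]), None)
             for t in TAGS}]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            # Pick the predecessor tag that maximizes the path score.
            score, prev = max(
                (best[-1][p][0] + log_p(TRANS[p], tag), p) for p in TAGS
            )
            column[tag] = (score + log_p(EMIT[tag], word), prev)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


print(viterbi(["the", "bank"]))
# ['DET', 'NOUN']
print(viterbi(["the", "pilot", "banks", "the", "plane"]))
# ['DET', 'NOUN', 'VERB', 'DET', 'NOUN']
```

A neural tagger replaces these hand-built tables with scores computed from learned contextual representations, but the output has the same shape: one tag per token.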

Example

Sentence: "The quick brown fox jumps over the lazy dog"

POS tags (Universal POS):
The/DET quick/ADJ brown/ADJ fox/NOUN jumps/VERB over/ADP the/DET lazy/ADJ dog/NOUN

Penn Treebank (more fine-grained):
The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN
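
The two taggings above differ only in granularity, which a short sketch can make explicit. The code below parses the word/TAG notation used in this example and maps Penn Treebank tags to Universal POS. The mapping table is deliberately partial and coarse (it covers only a handful of tags, and real conversions treat cases such as auxiliaries and subordinating conjunctions more carefully); the names PTB_TO_UPOS, parse_tagged, and to_upos are hypothetical.

```python
# Partial Penn Treebank -> Universal POS mapping. It covers only a few
# tags and is a simplification: for instance, IN can also correspond to
# SCONJ, and auxiliary verbs are tagged AUX in Universal POS.
PTB_TO_UPOS = {
    "DT": "DET",
    "JJ": "ADJ",
    "NN": "NOUN",
    "NNS": "NOUN",
    "VBD": "VERB",
    "VBZ": "VERB",
    "IN": "ADP",
}


def parse_tagged(text):
    """Split 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in text.split():
        # rpartition splits on the last slash, so words containing "/" survive.
        word, _, tag = item.rpartition("/")
        pairs.append((word, tag))
    return pairs


def to_upos(pairs):
    """Map Penn Treebank tags to Universal POS; unknown tags become 'X'."""
    return [(word, PTB_TO_UPOS.get(tag, "X")) for word, tag in pairs]


ptb = "The/DT quick/JJ brown/JJ fox/NN jumps/VBZ over/IN the/DT lazy/JJ dog/NN"
print(" ".join(f"{w}/{t}" for w, t in to_upos(parse_tagged(ptb))))
# The/DET quick/ADJ brown/ADJ fox/NOUN jumps/VERB over/ADP the/DET lazy/ADJ dog/NOUN
```

The mapping is many-to-one: several Penn Treebank tags (NN and NNS, VBD and VBZ) collapse into a single Universal POS tag, so the conversion cannot be reversed without extra information.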

Variants and history

POS tagging dates to the 1960s, when the first systems were rule-based. HMM taggers (1980s–90s) introduced probabilistic models trained on annotated corpora. CRF models (2000s) improved on them by scoring the whole tag sequence with rich, overlapping features. Neural taggers (BiLSTM, mid-2010s) and BERT-based taggers (2018+) brought accuracy on standard benchmarks close to human agreement levels. Because a word’s tag depends on its context (for example, “bank” can be a noun or a verb), context on both sides of the token is crucial.

When to use it

Use POS tagging for:

Parsing and syntax analysis
Lemmatization (the correct lemma can depend on the tag; stemmers typically do not need it)
Named entity recognition
Information extraction
Text analysis and corpus linguistics
Language learning systems

POS tagging is typically a preprocessing step, not an end task. Many modern NLP pipelines predict POS tags together with related annotations (such as NER labels and lemmas) from a shared encoder for efficiency.
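
As an illustration of tags serving as a preprocessing step, the sketch below extracts simple noun phrases from already-tagged text by matching the tag pattern DET? ADJ* (NOUN|PROPN)+. The function name noun_phrases and the pattern itself are assumptions chosen for brevity; practical chunkers and parsers handle far more constructions.

```python
def noun_phrases(tagged):
    """Collect spans matching DET? ADJ* (NOUN|PROPN)+ over Universal POS tags.

    tagged is a list of (word, tag) pairs.
    """
    phrases = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        # Optional determiner.
        if tagged[j][1] == "DET":
            j += 1
        # Any number of adjectives.
        while j < n and tagged[j][1] == "ADJ":
            j += 1
        # At least one noun is required for a match.
        first_noun = j
        while j < n and tagged[j][1] in ("NOUN", "PROPN"):
            j += 1
        if j > first_noun:
            phrases.append(" ".join(word for word, _ in tagged[i:j]))
            i = j
        else:
            i += 1  # no phrase starts here; move on
    return phrases


sentence = [
    ("The", "DET"), ("quick", "ADJ"), ("brown", "ADJ"), ("fox", "NOUN"),
    ("jumps", "VERB"), ("over", "ADP"),
    ("the", "DET"), ("lazy", "ADJ"), ("dog", "NOUN"),
]
print(noun_phrases(sentence))
# ['The quick brown fox', 'the lazy dog']
```

The rule never looks at the words themselves, only at their tags, which is why tagging quality directly limits the quality of downstream steps like this one.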