Lancaster Stemmer

What it is

The Lancaster Stemmer is the same algorithm as the Paice/Husk Stemmer. The name comes from Lancaster University, where Chris Paice developed the algorithm and published it in 1990. Both names appear in the literature and in library documentation; the institutional name tends to dominate in Python tooling, while “Paice/Husk” is more common in IR research papers.

The most widely encountered use of this name is in NLTK, which exposes the algorithm as nltk.stem.LancasterStemmer:

from nltk.stem import LancasterStemmer

stemmer = LancasterStemmer()
print(stemmer.stem("computational"))  # prints: comput

Aside from the name, there is no algorithmic difference. The underlying rule table, loop-back control structure, and aggressive stripping behaviour are identical to what the IR literature describes as the Paice/Husk Stemmer.
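As a rough illustration of the loop-back control structure mentioned above, the following toy sketch applies a handful of suffix rules repeatedly until no rule fires or a rule says to stop. It is not NLTK's implementation and not the real rule table: the names Rule, TOY_RULES, acceptable, and toy_stem are hypothetical, the four rules are a hand-picked subset chosen for this example, and the acceptability check is a simplified assumption about how over-short stems are rejected.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One suffix rule in a toy, simplified format (hypothetical)."""
    ending: str        # suffix the word must end with
    remove: int        # number of trailing characters to strip
    append: str        # string to add after stripping (may be empty)
    intact_only: bool  # apply only if the word has not been changed yet
    loop: bool         # True: re-run the rules on the result; False: stop


# Illustrative subset only; the real rule table is far larger.
TOY_RULES = [
    Rule("al", 2, "", False, True),
    Rule("ion", 3, "", False, True),
    Rule("at", 2, "", False, True),
    Rule("um", 2, "", True, False),
]


def acceptable(stem: str) -> bool:
    """Simplified guard against stripping a word down to almost nothing."""
    if not stem:
        return False
    if stem[0] in "aeiou":
        return len(stem) >= 2
    return len(stem) >= 3 and any(c in "aeiouy" for c in stem[1:])


def toy_stem(word: str, rules=TOY_RULES) -> str:
    intact = True  # becomes False once any rule has modified the word
    while True:
        for rule in rules:
            if not word.endswith(rule.ending):
                continue
            if rule.intact_only and not intact:
                continue
            candidate = word[: len(word) - rule.remove] + rule.append
            if not acceptable(candidate):
                continue
            word = candidate
            intact = False
            if rule.loop:
                break  # loop back: scan the rules again on the shorter word
            return word  # rule marked as terminal
        else:
            return word  # no rule applied on this pass


if __name__ == "__main__":
    print(toy_stem("computational"))  # comput (-al, then -ion, then -at)
    print(toy_stem("maximum"))        # maxim (-um removed because the word is intact)
```

The repeated passes are what make the stemmer aggressive: "computational" loses three suffixes in a row before the rules run out, whereas a single-pass stripper would stop after the first.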

For a full explanation of how the algorithm works, its rule notation, and guidance on when to use it, see the complete Paice/Husk Stemmer entry.
