    Artificial Intelligence

    Stemming

    Updated: 2/10/2026

    Rule-based reduction of words to their stem by removing suffixes.

    Quick Summary

    Stemming reduces words to their stems with simple suffix-stripping rules. It is widely used in search engines and text retrieval, and it is fast but less accurate than lemmatization.

    Explanation

    Stemming strips word endings by rule: "running" → "run", "computers" → "comput". It is fast but imprecise: the resulting stem does not have to be a real word, it only has to be the same for related word forms.
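
    The idea can be sketched in a few lines of Python. This is a toy suffix stripper written for illustration, not the Porter algorithm; the suffix list, the minimum stem length and the name toy_stem are assumptions made for this example.

```python
# Toy suffix-stripping stemmer: illustrative only, NOT the Porter algorithm.
# The suffix list below is a hypothetical rule set, checked in this order.
SUFFIXES = ("ers", "ing", "er", "ed", "es", "s")


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            stem = word[: -len(suffix)]
            # Undo consonant doubling ("runn" -> "run"), but keep ll, ss, zz.
            if stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word  # no rule applied


print(toy_stem("running"))    # run
print(toy_stem("computers"))  # comput
print(toy_stem("computing"))  # comput
```

    Note that "comput" is not an English word. What matters is that "computers" and "computing" end up with the same stem.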

    Marketing Relevance

    Stemming is used in search engines and information retrieval to normalize text, so that a query such as "computers" also matches content containing "computer" or "computing".
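
    A minimal sketch of how this normalization might work in a search index, assuming a tiny in-memory inverted index. The names simple_stem, build_index and search, the suffix rules and the sample documents are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def simple_stem(word):
    # Hypothetical suffix rules for this sketch; not a real stemmer.
    for suffix in ("ers", "ing", "er", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[simple_stem(token)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every stemmed query term."""
    terms = [simple_stem(t) for t in query.lower().split()]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "computing prices dropped",
    2: "new computers shipped",
    3: "garden tools",
}
index = build_index(docs)
print(search(index, "computer"))  # {1, 2}
```

    The same stemmer is applied at index time and at query time. Without that symmetry, the stems stored in the index and the stems derived from the query would not line up.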

    Common Pitfalls

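    Over-stemming: Words with different meanings are reduced to the same stem, for example "universe" and "university" both becoming "univers", which produces false matches. Under-stemming: Related forms are not reduced to the same stem, for example "alumnus" and "alumni", so relevant matches are missed.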
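
    Both pitfalls can be reproduced with a deliberately naive stemmer. The sketch below groups words by their stem; the rule list and the names naive_stem and group_by_stem are assumptions for this example and do not correspond to any real stemming library.

```python
from collections import defaultdict


def naive_stem(word):
    # Hypothetical rule list; not a real stemmer.
    for suffix in ("ity", "es", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(words):
    """Collect words that share a stem, as a search index would."""
    groups = defaultdict(list)
    for word in words:
        groups[naive_stem(word)].append(word)
    return dict(groups)


groups = group_by_stem(["universe", "university", "alumnus", "alumni"])
for stem, members in groups.items():
    print(stem, members)
# univers ['universe', 'university']   <- over-stemming: unrelated words merged
# alumnu ['alumnus']                   <- under-stemming: related forms
# alumni ['alumni']                       stay in separate groups
```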

    Origin & History

    Martin Porter published the Porter stemmer in 1980, and it remains the best-known stemming algorithm. In 2001 he followed it with Snowball, a framework for writing stemmers that includes a revised English stemmer (often called Porter2) and stemmers for further languages. With the rise of LLMs, stemming is losing importance but remains relevant in classical search systems.

    Comparisons & Differences

    Stemming vs. Lemmatization

    Stemming strips suffixes using rules and may return non-words; lemmatization uses linguistic knowledge (a dictionary and part-of-speech information) and returns real dictionary forms, called lemmas.
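
    The contrast can be illustrated with a rough sketch that places a rule-based stemmer next to a dictionary lookup. Real lemmatizers rely on much larger lexicons and on part-of-speech information; LEMMA_TABLE, toy_lemmatize and toy_stem are hypothetical stand-ins invented for this example.

```python
# Hypothetical mini-lexicon standing in for a real lemmatizer's dictionary.
LEMMA_TABLE = {
    "studies": "study",
    "ran": "run",
    "mice": "mouse",
}


def toy_lemmatize(word):
    """Look the word up; fall back to the word itself if it is unknown."""
    return LEMMA_TABLE.get(word, word)


def toy_stem(word):
    """Strip a suffix by rule, with no knowledge of the vocabulary."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for word in ("studies", "ran", "mice"):
    print(word, toy_stem(word), toy_lemmatize(word))
# studies studi study
# ran ran run
# mice mice mouse
```

    The stemmer needs no dictionary and never fails on unknown words, but it cannot handle irregular forms such as "ran" or "mice". The lookup handles them, but only for words it knows.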

    Stemming vs. Subword Tokenization

    Stemming normalizes for retrieval; subword tokenization splits for neural models – different goals and methods.
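
    To make the difference concrete, here is a small sketch of greedy longest-match subword splitting, loosely modelled on WordPiece-style tokenizers. The vocabulary, the "##" continuation marker as used here and the function name subword_tokenize are assumptions for illustration; real tokenizers learn their vocabulary from data.

```python
# Hypothetical vocabulary; real subword vocabularies are learned from a corpus.
VOCAB = {"comput", "run", "##ers", "##ing", "##n", "##s"}


def subword_tokenize(word, vocab=VOCAB, unk="[UNK]"):
    """Split a word into the longest vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


print(subword_tokenize("computers"))  # ['comput', '##ers']
print(subword_tokenize("running"))    # ['run', '##n', '##ing']
```

    Unlike a stemmer, the tokenizer discards nothing: the suffix pieces stay in the output, so the model still sees the information that stemming would throw away.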
