
    Lottery Ticket Hypothesis

    Also known as:
    Winning Ticket
    Sparse Subnetwork Theory
    Lucky Initialization
    Updated: 2/9/2026

    The hypothesis that every large neural network contains a small subnetwork ("winning ticket") that, trained alone with the same initialization, can achieve the full performance of the large network.

    Quick Summary

    The Lottery Ticket Hypothesis states that every large network contains a small subnetwork that, when trained in isolation from its original initialization, matches the full network's performance. This idea is a foundation for efficient pruning.

    Explanation

    Frankle & Carbin showed in 2018 that 90%+ of a network's weights can often be removed, provided the surviving weights are reset to their original initial values before retraining. This reshaped the understanding of pruning and sparsity.
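    The procedure behind this finding is iterative magnitude pruning with rewinding: train, remove the smallest-magnitude weights, reset the survivors to their initial values, and repeat. A minimal numpy sketch of that loop on a toy linear model is shown below; the model, function names, and hyperparameters are illustrative assumptions, not the paper's actual code.

    ```python
    import numpy as np

    def train(w, mask, X, y, lr=0.1, steps=200):
        """Gradient descent on a masked linear least-squares model.

        Pruned weights (mask == 0) receive no gradient and stay frozen.
        """
        for _ in range(steps):
            grad = X.T @ (X @ (w * mask) - y) / len(y)
            w = w - lr * grad * mask
        return w

    def find_winning_ticket(X, y, rounds=2, prune_frac=0.5, seed=0):
        """Iterative magnitude pruning with rewinding (toy sketch).

        Each round: train, prune the smallest-magnitude surviving weights,
        then "rewind" the survivors to their original initialization.
        """
        rng = np.random.default_rng(seed)
        w_init = rng.normal(size=X.shape[1])  # the "lottery ticket" initialization
        mask = np.ones_like(w_init)
        for _ in range(rounds):
            w = train(w_init.copy(), mask, X, y)
            # threshold at the prune_frac quantile of surviving magnitudes
            alive = np.abs(w[mask == 1])
            threshold = np.quantile(alive, prune_frac)
            mask = mask * (np.abs(w) > threshold)
        # rewinding: the winning ticket is the mask applied to the ORIGINAL init
        return w_init * mask, mask
    ```

    Retraining the returned sparse subnetwork from `w_init * mask` then tests the hypothesis: if it is a winning ticket, it reaches the dense model's accuracy despite most weights being gone.
    
    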

    Marketing Relevance

    The hypothesis provides theoretical foundations for more efficient AI: Why are large models needed if small subnetworks suffice? Potential for drastic cost reduction.

    Example

    Researchers find a "winning ticket" in BERT with only 10% of the parameters: it achieves 98% of the original accuracy and runs inference 5x faster.

    Common Pitfalls

    Finding winning tickets requires expensive iterative pruning; not all architectures yield clear winning tickets; and transfer of tickets between tasks is not guaranteed.

    Origin & History

    Jonathan Frankle and Michael Carbin (MIT) published "The Lottery Ticket Hypothesis" in 2018. It won a Best Paper Award at ICLR 2019 and inspired hundreds of follow-up works on sparsity.

    Comparisons & Differences

    Lottery Ticket Hypothesis vs. Pruning

    Conventional pruning removes weights after training; the Lottery Ticket Hypothesis shows that the right subnetworks already exist at initialization, before training.

    Lottery Ticket Hypothesis vs. Neural Architecture Search

    NAS searches for new architectures; Lottery Ticket finds optimal substructures in existing networks.

