Lottery Ticket Hypothesis
The hypothesis that every large neural network contains a small subnetwork (a "winning ticket") that, when trained in isolation from the same initialization, can match the performance of the full network.
The Lottery Ticket Hypothesis states that every sufficiently large network contains a small subnetwork that, trained alone with its original initial weights, achieves the same performance as the full network. It provides a foundation for efficient pruning.
Explanation
Frankle & Carbin showed in 2018 that over 90% of a network's weights can often be removed, provided the surviving weights are reset to their original initial values before retraining. This finding reshaped the understanding of pruning and sparsity.
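The procedure behind this finding is iterative magnitude pruning (IMP) with weight rewinding: train, remove the smallest-magnitude weights, reset the survivors to their initial values, and repeat. A minimal NumPy sketch on a toy linear model; the model, data, prune schedule, and all hyperparameters here are illustrative assumptions, not the original experimental setup:

```python
import numpy as np

# Toy least-squares task with a sparse ground-truth weight vector,
# so a small "winning" subnetwork genuinely exists.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
true_w = rng.normal(size=50) * (rng.random(50) < 0.2)  # ~20% nonzero
y = X @ true_w

w_init = rng.normal(size=50) * 0.1   # saved initialization (the "ticket")
mask = np.ones(50)                   # 1 = weight kept, 0 = pruned

def train(w, mask, steps=300, lr=0.01):
    """Gradient descent on masked weights; pruned weights stay zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = w - lr * grad * mask
    return w

prune_frac = 0.2                     # remove 20% of surviving weights per round
w = w_init.copy()
for _ in range(5):
    w = train(w, mask)
    # Prune the smallest-magnitude weights that are still alive.
    alive = np.flatnonzero(mask)
    k = int(len(alive) * prune_frac)
    drop = alive[np.argsort(np.abs(w[alive]))[:k]]
    mask[drop] = 0.0
    w = w_init.copy()                # rewind survivors to their initial values

w_final = train(w, mask)
sparsity = 1 - mask.mean()
loss = np.mean((X @ (w_final * mask) - y) ** 2)
print(f"sparsity={sparsity:.2f}, loss={loss:.4f}")
```

The rewind step (`w = w_init.copy()`) is the key difference from ordinary pruning: the subnetwork is retrained from its original initialization rather than fine-tuned from the trained weights.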
Marketing Relevance
The hypothesis provides a theoretical foundation for more efficient AI: why train and serve large models if small subnetworks suffice? This opens the potential for drastic cost reductions in training and inference.
Example
Researchers find a "winning ticket" in BERT with only 10% of the parameters: it reaches about 98% of the original accuracy and runs roughly 5x faster at inference.
Common Pitfalls
Finding winning tickets requires expensive iterative pruning and retraining. Not all architectures yield clear winning tickets, and transferring tickets between tasks is not guaranteed.
Origin & History
Jonathan Frankle and Michael Carbin (MIT) published "The Lottery Ticket Hypothesis" in 2018. It won the Best Paper Award at ICLR 2019 and inspired hundreds of follow-up works on sparsity.
Comparisons & Differences
Lottery Ticket Hypothesis vs. Pruning
Classic pruning removes weights after training; the Lottery Ticket Hypothesis shows that suitable subnetworks already exist at initialization, before training begins.
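The distinction can be sketched in a few lines: both approaches use the same magnitude-based mask, but classic pruning applies it to the trained weights, while a lottery ticket applies it to the initial weights and retrains from there. The arrays and the median threshold below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
w_init = rng.normal(size=8)                  # weights at initialization
w_trained = w_init + rng.normal(size=8)      # stand-in for trained weights

# Classic pruning: keep the larger half of the *trained* weights.
mask = (np.abs(w_trained) >= np.median(np.abs(w_trained))).astype(float)
pruned_after = w_trained * mask              # final sparse model, no retraining

# Lottery ticket: reuse the SAME mask, but rewind to the initialization;
# the surviving weights are then trained from scratch as a subnetwork.
ticket_start = w_init * mask
print(mask, ticket_start)
```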
Lottery Ticket Hypothesis vs. Neural Architecture Search
NAS searches over new architectures; the Lottery Ticket Hypothesis finds trainable sparse substructures within an existing network.