Mixup
Data augmentation technique that creates new training examples by linearly interpolating between two existing examples.
Mixup blends both the inputs and the labels of two training examples, a simple augmentation that improves generalization and reduces overfitting and overconfidence.
Explanation
Mixup combines both inputs and labels: x_new = λ·x1 + (1−λ)·x2, y_new = λ·y1 + (1−λ)·y2, where λ ∈ [0, 1] is drawn from a Beta(α, α) distribution; the hyperparameter α controls how strongly the two examples are blended.
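A minimal NumPy sketch of this interpolation; the one-hot label encoding and the default α = 0.2 are assumptions, since α is a tunable hyperparameter:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two examples; y1 and y2 are one-hot label vectors."""
    lam = np.random.beta(alpha, alpha)       # lambda ~ Beta(alpha, alpha)
    x_new = lam * x1 + (1.0 - lam) * x2      # interpolate inputs
    y_new = lam * y1 + (1.0 - lam) * y2      # interpolate labels identically
    return x_new, y_new

# Usage: two fake 8x8 "images" with 3 classes
x1, x2 = np.random.rand(8, 8), np.random.rand(8, 8)
y1, y2 = np.eye(3)[0], np.eye(3)[2]
x_mix, y_mix = mixup(x1, y1, x2, y2)
print(y_mix)  # e.g. [0.73 0.   0.27] when lambda came out as 0.73
```

In practice the two examples are usually paired by shuffling a mini-batch and mixing it with itself, rather than drawing pairs individually.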
Practical Relevance
Mixup improves generalization, calibration, and robustness against adversarial examples with minimal implementation complexity.
Common Pitfalls
An overly large α blends examples so strongly that class boundaries blur and training targets become ambiguous, as the sketch below illustrates. Mixup is also not suitable for every data type: discrete inputs such as text cannot be meaningfully interpolated in input space.
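To see why large α blurs boundaries, this sketch samples λ for several α values (the values themselves are illustrative, not recommendations):

```python
import numpy as np

rng = np.random.default_rng(0)
for alpha in (0.1, 0.2, 1.0, 4.0):                 # illustrative values only
    lam = rng.beta(alpha, alpha, size=100_000)
    strong = np.mean(np.abs(lam - 0.5) < 0.25)     # draws that blend heavily
    print(f"alpha={alpha}: {strong:.0%} of lambdas fall in [0.25, 0.75]")
```

Small α keeps most λ near 0 or 1 (mild mixing); large α pushes λ toward 0.5, producing heavily blended, boundary-blurring examples.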
Origin & History
Introduced in 2017 by Zhang, Cisse, Dauphin & Lopez-Paz (Facebook AI Research) in "mixup: Beyond Empirical Risk Minimization". CutMix (2019) and Manifold Mixup (2019) extended the concept to the spatial and latent domains, respectively.
Comparisons & Differences
Mixup vs. CutMix
Mixup blends entire images; CutMix instead cuts a rectangular patch from one image, pastes it into another, and mixes the labels in proportion to the patch area (sketched below). Because every pixel comes from exactly one image, CutMix preserves local structures better.
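A rough NumPy sketch of the CutMix idea, simplified relative to the paper's exact bounding-box sampling; the patch placement and the α = 1.0 default here are illustrative:

```python
import numpy as np

def cutmix(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Paste a random patch of x2 into x1; mix labels by surviving area."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = x1.shape[:2]
    lam = rng.beta(alpha, alpha)
    ph = int(h * np.sqrt(1.0 - lam))          # patch covers ~(1 - lam) of the area
    pw = int(w * np.sqrt(1.0 - lam))
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    x_new = x1.copy()
    x_new[top:top + ph, left:left + pw] = x2[top:top + ph, left:left + pw]
    lam_adj = 1.0 - (ph * pw) / (h * w)       # label weight = kept area fraction
    return x_new, lam_adj * y1 + (1.0 - lam_adj) * y2
```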
Mixup vs. Label Smoothing
Mixup creates new inputs and labels through interpolation; label smoothing leaves the inputs unchanged and only redistributes a small amount of label mass uniformly across all classes.
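A small sketch contrasting the two resulting target distributions; the ε and λ values are picked purely for illustration:

```python
import numpy as np

y = np.eye(5)[1]                     # one-hot target: class 1 of 5
eps = 0.1                            # illustrative smoothing factor

y_smooth = (1 - eps) * y + eps / 5   # label smoothing: mass spread uniformly
print(y_smooth)                      # [0.02 0.92 0.02 0.02 0.02]

lam = 0.7                            # a fixed mixup coefficient for illustration
y_mix = lam * y + (1 - lam) * np.eye(5)[3]   # mixup: mass shared with one class
print(y_mix)                         # [0.  0.7 0.  0.3 0. ]
```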