    Artificial Intelligence
    (Stratifizierte Stichprobe)

    Stratified Sampling

    Also known as:
    Stratified Split
    Proportional Sampling
    Updated: 2/10/2026

    A sampling method that keeps each class's or group's share of the sample equal to its share of the full dataset.

    Quick Summary

    Stratified sampling preserves the class distribution when splitting data. This is essential under class imbalance, so that rare classes are represented in every split.

    Explanation

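    Stratified sampling divides the data into strata (for classification, usually one stratum per class label) and samples from each stratum in proportion to its size. This is especially important under class imbalance: it prevents rare classes from being under- or over-represented in test or validation sets.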
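
    As a rough illustration of the idea, the following sketch splits a toy label list per class using only the Python standard library. The function name `stratified_split`, the `test_fraction` parameter, and the toy labels are assumptions made for this example, not part of any library.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class at roughly test_fraction."""
    rng = random.Random(seed)

    # Group example indices by class label (one stratum per class).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        # For classes with 2+ examples, keep at least one example on each side.
        if len(indices) > 1:
            n_test = min(max(n_test, 1), len(indices) - 1)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy data: 95 "neg" and 5 "pos" labels (a 5% minority class).
labels = ["neg"] * 95 + ["pos"] * 5
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts, test_counts)
```

    With these toy labels, the test set holds 19 "neg" and 1 "pos" example, so the 5% minority share carries over to both sides of the split.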

    Marketing Relevance

    Stratified sampling is standard in train/test splits and K-Fold CV to ensure representative evaluations.
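
    For cross-validation, the same idea can be sketched by dealing each class's examples round-robin into k folds. This is a simplified standard-library illustration; the name `stratified_kfold` and the toy data are assumptions, and library implementations balance fold sizes more carefully.

```python
import random
from collections import Counter, defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)

    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the shuffled members of each class round-robin into the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy data: 90 "neg" and 10 "pos" labels.
labels = ["neg"] * 90 + ["pos"] * 10
fold_counts = [
    Counter(labels[i] for i in test_idx)
    for _, test_idx in stratified_kfold(labels, k=5)
]
print(fold_counts)
```

    Here every test fold contains 18 "neg" and 2 "pos" examples. In scikit-learn, `StratifiedKFold(n_splits=k).split(X, y)` and `train_test_split(..., stratify=y)` provide library versions of these splits.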

    Common Pitfalls

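    Stratification breaks down with very rare classes: a class with fewer examples than there are splits cannot appear in every split. Multi-label data requires specialized stratification methods, because each example can belong to several classes at once.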
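
    One way to catch the rare-class problem early is to count class sizes before splitting. The helper below is a minimal sketch for single-label data only; the name `classes_too_rare` and the toy labels are assumptions for illustration.

```python
from collections import Counter


def classes_too_rare(labels, n_splits):
    """Return {class: count} for classes with fewer examples than n_splits."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < n_splits}


# Hypothetical toy data: class "c" has only 3 examples.
labels = ["a"] * 50 + ["b"] * 8 + ["c"] * 3
too_rare = classes_too_rare(labels, n_splits=5)
print(too_rare)  # {'c': 3}
```

    A non-empty result suggests reducing the number of splits, merging classes, or collecting more data before stratifying.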

    Origin & History

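    The method comes from survey statistics (Neyman 1934). In ML, it became standard through scikit-learn: StratifiedKFold stratifies by construction, while train_test_split stratifies only when the labels are passed through its stratify parameter (by default it performs a plain random split).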

    Comparisons & Differences

    Stratified Sampling vs. Random Sampling

    Simple random sampling can, by chance, leave a rare class out of a sample entirely; stratified sampling keeps each class at approximately its original proportion.
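
    The risk with simple random sampling can be quantified. Assuming sampling without replacement, the chance that a sample of size n misses all K rare items among N total is C(N-K, n) / C(N, n). The sketch below evaluates this for hypothetical numbers chosen for illustration.

```python
from math import comb


def prob_class_missing(n_total, n_rare, n_sample):
    """Probability that a simple random sample (without replacement) has no rare items."""
    return comb(n_total - n_rare, n_sample) / comb(n_total, n_sample)


# Hypothetical setup: 1000 examples, 10 of them rare, a 10% sample of 100.
p_missing = prob_class_missing(n_total=1000, n_rare=10, n_sample=100)
print(f"{p_missing:.3f}")
```

    For these numbers the rare class is missing from roughly one random sample in three, whereas a proportional stratified sample would allocate it 1 of the 100 slots.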

    Stratified Sampling vs. Oversampling

    Stratified sampling preserves proportions; oversampling intentionally changes them to strengthen minority classes.
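
    To make the contrast concrete, here is a minimal sketch of random oversampling, which duplicates minority examples until every class matches the majority count. The function name `random_oversample` and the toy labels are assumptions; practical pipelines typically apply this only to the training split.

```python
import random
from collections import Counter


def random_oversample(labels, seed=0):
    """Return indices in which every class is resampled up to the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    indices = list(range(len(labels)))
    for label, n in counts.items():
        members = [i for i, current in enumerate(labels) if current == label]
        # Draw duplicates with replacement until the class reaches `target`.
        indices.extend(rng.choices(members, k=target - n))
    return indices


# Hypothetical toy data: 95 "neg" and 5 "pos" labels.
labels = ["neg"] * 95 + ["pos"] * 5
balanced = Counter(labels[i] for i in random_oversample(labels))
print(balanced)
```

    The original 95:5 ratio becomes 95:95, which is the deliberate change in proportions that stratified sampling avoids.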
