
    Data Parallelism

    Also known as:
    Distributed Data Parallel
    DDP
    Data-Parallel Training
    Replicated Training
    Updated: 2/11/2026

The simplest form of distributed training: each GPU holds a complete copy of the model and processes a different slice of each batch – gradients are synchronized after every step.

    Quick Summary

    Data parallelism replicates the model on every GPU and distributes data – simplest multi-GPU strategy with near-linear speedup.

    Explanation

Each GPU processes its own mini-batch and computes gradients locally; the gradients are then averaged across all GPUs via AllReduce, so every model copy applies the identical update and stays synchronized. Scaling is near-linear until communication becomes the bottleneck. PyTorch DistributedDataParallel (DDP) is the standard implementation.
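The cycle above – local gradients, AllReduce averaging, identical updates – can be sketched in plain Python. This is an illustrative simulation, not real DDP (which launches one process per GPU and uses `torch.distributed` for the AllReduce); the toy linear model and function names are assumptions for the example.

```python
def local_gradient(w, shard):
    # Toy model: predict y = w*x, squared-error loss.
    # Mean gradient over this replica's shard: d/dw (w*x - y)^2 = 2*x*(w*x - y)
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.1):
    """One synchronous data-parallel SGD step.

    Each 'GPU' computes a gradient on its own shard, the gradients are
    averaged (the AllReduce step), and every replica applies the same update.
    """
    grads = [local_gradient(w, shard) for shard in shards]   # per-replica, local
    avg = sum(grads) / len(grads)                            # AllReduce (average)
    return w - lr * avg                                      # identical update everywhere

# Two "GPUs", each with half of a batch fitting y = 2x:
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w_parallel = data_parallel_step(0.0, shards)
w_single = data_parallel_step(0.0, [shards[0] + shards[1]])  # one GPU, full batch
```

With equal-sized shards, the averaged step is mathematically identical to a single-GPU step on the full batch – which is why data parallelism changes throughput, not the optimization trajectory.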

    Marketing Relevance

    Data parallelism is the default for multi-GPU training when the model fits on one GPU – simple, efficient, near-linear speedup.

    Example

Fine-tuning a 7B-parameter LLM on 4 A100 GPUs: each GPU holds the full model (about 14 GB of weights in FP16) and processes a per-GPU batch size of 8, for an effective batch size of 32. Training runs close to 4x faster than on a single GPU.
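A back-of-the-envelope check of the numbers in the example (note the 14 GB covers weights only; gradients, optimizer state, and activations add substantially more):

```python
params = 7e9               # 7B parameters
bytes_per_param = 2        # FP16 = 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9   # weights only, per GPU

gpus = 4
per_gpu_batch = 8
effective_batch = gpus * per_gpu_batch        # global batch seen per step

print(weights_gb, effective_batch)
```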

    Common Pitfalls

The model must fit entirely on each GPU, and memory use is redundant (N full copies of weights, gradients, and optimizer state). Communication overhead grows with the number of GPUs. For very large models, sharded approaches such as FSDP or ZeRO are needed.

    Origin & History

Data parallel training has existed since the 1990s. PyTorch DataParallel (DP) was the first simple implementation. PyTorch DDP (2019) improved efficiency through bucketed AllReduce overlapped with the backward pass. Horovod (Uber, 2018) popularized ring AllReduce for efficient gradient synchronization.

    Comparisons & Differences

    Data Parallelism vs. Model Parallelism

Data parallel: the whole model on each GPU, with the data distributed. Model parallel: the model itself is split across GPUs – needed when the model does not fit on a single GPU.

    Data Parallelism vs. FSDP / ZeRO

DDP holds complete model copies on every GPU; FSDP/ZeRO shard parameters, gradients, and optimizer state across GPUs – large memory savings at comparable throughput, at the cost of extra communication.
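The memory difference can be made concrete with rough per-GPU parameter-memory arithmetic (weights only; this hypothetical helper ignores gradients and optimizer state, which full ZeRO-3 sharding also splits):

```python
def param_memory_gb(params, bytes_per_param, gpus, sharded):
    """Per-GPU memory for the parameters alone, in GB.

    sharded=False models DDP (every GPU holds all parameters);
    sharded=True models full parameter sharding (FSDP / ZeRO-3).
    """
    total = params * bytes_per_param / 1e9
    return total / gpus if sharded else total

# 7B model in FP16 across 8 GPUs:
ddp_gb = param_memory_gb(7e9, 2, 8, sharded=False)   # full replica per GPU
fsdp_gb = param_memory_gb(7e9, 2, 8, sharded=True)   # 1/8th of the weights per GPU
```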
