DP-SGD (Differentially Private SGD)
A training algorithm that integrates Differential Privacy into Stochastic Gradient Descent through gradient clipping and calibrated noise.
DP-SGD makes deep learning private: gradient clipping plus calibrated noise guarantee that no single data point's presence can be reliably inferred from the trained model.
Explanation
DP-SGD bounds the influence of each individual data point by clipping per-example gradients to a fixed norm and adding Gaussian noise to the aggregated gradient before each update. The privacy budget (epsilon) accumulates over training steps and must be tracked by a privacy accountant.
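A minimal NumPy sketch of one DP-SGD step for logistic regression, assuming illustrative hyperparameters (this is not the reference implementation from Abadi et al.):

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(w, X_batch, y_batch, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD update: per-example clipping, then calibrated Gaussian noise."""
    batch_size = len(X_batch)
    clipped_sum = np.zeros_like(w)
    for x, y in zip(X_batch, y_batch):
        # Per-example gradient of the logistic loss.
        p = 1.0 / (1.0 + np.exp(-x @ w))
        g = (p - y) * x
        # Clip to norm at most clip_norm: bounds this example's influence.
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)
        clipped_sum += g
    # Gaussian noise scaled to the clipping norm (the sensitivity of the sum).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * (clipped_sum + noise) / batch_size
```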
Marketing Relevance
The standard method for privacy-compliant deep learning: models can be trained on sensitive data without memorizing individual data points.
Example
A healthcare startup trains a diagnostic model with DP-SGD (ε = 8): the model learns general patterns but cannot reproduce individual patient records.
Common Pitfalls
Small epsilon values cause noticeable accuracy loss. Hyperparameter tuning (clipping norm, noise multiplier) is difficult. Privacy accounting must be carefully managed.
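Accounting is usually delegated to a library rather than done by hand. A short sketch using Opacus's RDPAccountant, assuming an illustrative noise multiplier, sampling rate, and step count:

```python
from opacus.accountants import RDPAccountant

# Track the cumulative privacy loss across training steps.
accountant = RDPAccountant()
for _ in range(10_000):  # number of SGD steps (illustrative)
    accountant.step(noise_multiplier=1.1, sample_rate=256 / 60_000)

epsilon = accountant.get_epsilon(delta=1e-5)
print(f"Privacy spent after training: epsilon = {epsilon:.2f}")
```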
Origin & History
Abadi et al. (2016, Google) formalized DP-SGD together with the moments accountant. Opacus (Meta, 2020) and TensorFlow Privacy made it practically usable. Rényi DP and privacy loss distributions later improved the accounting.
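For orientation, a hedged sketch of how Opacus wraps a standard PyTorch setup for DP-SGD; the model, optimizer, data, and hyperparameters below are stand-ins, not a recommended configuration:

```python
import torch
from opacus import PrivacyEngine

# Stand-in model, optimizer, and data loader.
model = torch.nn.Linear(20, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(1000, 20), torch.randint(0, 2, (1000,))
    ),
    batch_size=64,
)

# PrivacyEngine rewires all three for DP-SGD: per-example clipping
# at max_grad_norm, Gaussian noise scaled by noise_multiplier.
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.1,
    max_grad_norm=1.0,
)
```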
Comparisons & Differences
DP-SGD (Differentially Private SGD) vs. Differential Privacy
DP is the mathematical framework; DP-SGD is a concrete implementation of it for training neural networks.
DP-SGD (Differentially Private SGD) vs. Federated Learning
FL decentralizes training; DP-SGD privatizes the gradients. Combined, they provide stronger privacy than either alone.