Reparameterization Trick
The reparameterization trick enables backpropagation through stochastic sampling operations by treating randomness as an external input. By separating the noise source from the gradient path, it made VAEs, and with them much of modern generative AI, trainable end to end.
Explanation
Sampling directly from z ~ N(μ, σ²) is not differentiable with respect to μ and σ. Instead, z = μ + σ · ε is computed with ε ~ N(0, 1): the gradient flows through μ and σ, while the randomness is confined to the external variable ε. This made end-to-end training of VAEs possible for the first time.
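The mechanics can be sketched in a few lines of NumPy (hypothetical variable names, no autograd library assumed): the noise ε is drawn once and held fixed, so z = μ + σ · ε is an ordinary differentiable function of μ and σ, and the analytic gradient can be checked against a finite difference.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_z(mu, sigma, eps):
    # Reparameterized sample: the randomness lives entirely in eps ~ N(0, 1),
    # so z is a deterministic, differentiable function of mu and sigma.
    return mu + sigma * eps

mu, sigma = 0.5, 1.2
eps = rng.standard_normal()      # external noise, fixed for the gradient check
z = sample_z(mu, sigma, eps)

loss = z ** 2                    # toy loss L(z) = z^2
dL_dz = 2 * z
dL_dmu = dL_dz * 1.0             # chain rule: dz/dmu = 1
dL_dsigma = dL_dz * eps          # chain rule: dz/dsigma = eps

# Finite-difference check with the SAME eps. Holding the noise fixed is
# the whole point: the sampling path becomes differentiable.
h = 1e-6
fd_mu = (sample_z(mu + h, sigma, eps) ** 2 - z ** 2) / h
print(abs(dL_dmu - fd_mu) < 1e-4)
```

Had z been drawn directly from N(μ, σ²), there would be no such path: perturbing μ would change which random number is drawn, and the finite-difference check above would be meaningless.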
Marketing Relevance
Without the reparameterization trick, there would be no VAEs, no latent diffusion, and no modern generative AI like Stable Diffusion.
Example
A VAE encoder outputs μ and σ. Instead of z = sample(N(μ, σ²)), compute z = μ + σ · ε with ε ~ N(0, 1). The gradient then flows through μ and σ back into the encoder.
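A minimal sketch of this sampling step, assuming (as is common in VAE implementations, though not stated above) that the encoder emits log σ² rather than σ directly:

```python
import numpy as np

rng = np.random.default_rng(42)

def reparameterize(mu, log_var, rng):
    # Exponentiating half the log-variance recovers sigma and guarantees
    # sigma > 0; this parameterization is a common convention, not the only one.
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(mu.shape)  # eps ~ N(0, I), external to the graph
    return mu + sigma * eps              # z ~ N(mu, sigma^2), differentiable in mu, log_var

# Hypothetical encoder outputs: batch of 4 samples, latent dimension 2
mu = np.array([[0.0, 1.0], [0.5, -0.5], [1.0, 0.0], [-1.0, 2.0]])
log_var = np.zeros_like(mu)              # sigma = 1 everywhere
z = reparameterize(mu, log_var, rng)
print(z.shape)                           # one latent sample per input
```

In a real VAE the same `reparameterize` step sits between the encoder and decoder, and gradients from the reconstruction loss flow through it into μ and log σ².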
Common Pitfalls
The trick only applies to distributions with a differentiable sampling path (e.g., Gaussian); discrete distributions require relaxations such as Gumbel-Softmax. Very small σ can cause numerical instability, which is one reason encoders typically output log σ² rather than σ.
Origin & History
Kingma & Welling (2013) and Rezende et al. (2014) independently introduced the trick. It was the key innovation enabling VAEs. The concept was extended to Gumbel-Softmax (discrete variables) and normalizing flows.
Comparisons & Differences
Reparameterization Trick vs. REINFORCE / Score Function Estimator
Reparameterization has low variance but needs differentiable sampling paths; REINFORCE works for discrete distributions but has high variance.
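The variance gap is easy to demonstrate empirically. The sketch below (a toy setup, not from the source) estimates the gradient of E[z²] with respect to μ for z ~ N(μ, σ²), whose true value is 2μ, using both estimators:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.0, 1.0, 100_000

# Objective: d/dmu E[z^2] for z ~ N(mu, sigma^2); the true gradient is 2*mu.
eps = rng.standard_normal(n)
z = mu + sigma * eps

# Reparameterization (pathwise) estimator: differentiate through z = mu + sigma*eps.
grad_reparam = 2 * z                          # dL/dz * dz/dmu = 2z * 1

# REINFORCE (score function) estimator: f(z) * d log p(z) / dmu.
grad_reinforce = z ** 2 * (z - mu) / sigma ** 2

print(grad_reparam.mean(), grad_reinforce.mean())   # both approximate 2*mu
print(grad_reparam.var() < grad_reinforce.var())    # pathwise variance is lower
```

Both estimators are unbiased, but the score-function estimator's variance is several times larger here, which is why reparameterization is preferred whenever a differentiable sampling path exists.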
Reparameterization Trick vs. Straight-Through Estimator
Reparameterization is mathematically exact for continuous distributions; straight-through is a heuristic for discrete operations.