Posterior Collapse
Posterior collapse occurs in variational autoencoders (VAEs) when the approximate posterior collapses to the prior: the encoder ignores its input, the latent variables carry no information, and yet the decoder can still produce plausible outputs.
Explanation
When the decoder is powerful enough to model the data on its own (for example, an autoregressive decoder), it learns to ignore the latent code, and the approximate posterior q(z|x) collapses to the prior N(0, I). The KL term of the ELBO drops to zero, but the latents carry no information about the input. Common countermeasures are KL annealing, free bits, and reweighting the KL term as in β-VAE; the first two are sketched below.
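A minimal sketch of a VAE loss combining KL annealing and free bits. The function name, the linear annealing schedule, and the 0.5-nat threshold are illustrative assumptions, not a reference implementation:

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, step, anneal_steps=10_000, free_bits=0.5):
    """ELBO with two collapse countermeasures (illustrative sketch).

    - KL annealing: the KL weight ramps from 0 to 1 over `anneal_steps`,
      giving the encoder time to learn informative codes before the
      prior-matching pressure kicks in.
    - Free bits: each latent dimension gets `free_bits` nats of KL "for
      free"; only KL above that threshold is penalized, so dimensions are
      not pushed all the way down to the prior.
    """
    recon = F.mse_loss(recon_x, x, reduction="sum")

    # Per-dimension KL(q(z|x) || N(0, I)), averaged over the batch.
    kl_per_dim = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean(dim=0)

    # Free bits: clamp each dimension's KL at the threshold from below.
    kl = torch.clamp(kl_per_dim, min=free_bits).sum()

    beta = min(1.0, step / anneal_steps)  # linear annealing schedule
    return recon + beta * kl
```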
Marketing Relevance
Posterior collapse undermines VAE-based generative tools: the latent variables no longer control anything, so latent interpolation, attribute steering, and controlled generation stop working.
Example
A text VAE generates fluent sentences, but moving through latent space has no effect on the output: everything is produced by the autoregressive decoder alone. The sketch below makes this check concrete.
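One quick way to observe the symptom is to decode several very different latent codes and measure how much the outputs differ. `decoder` and `latent_dim` here are hypothetical stand-ins for any trained VAE decoder; a spread near zero means the decoder is ignoring z:

```python
import torch

@torch.no_grad()
def latent_sensitivity(decoder, latent_dim=32, n_samples=8):
    z = torch.randn(n_samples, latent_dim)  # very different latent codes
    outputs = decoder(z)                    # shape: (n_samples, ...)
    # Mean pairwise distance between outputs; near zero => z is ignored.
    flat = outputs.flatten(start_dim=1)
    return torch.cdist(flat, flat).mean().item()
```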
Common Pitfalls
Diagnosing posterior collapse from a low KL divergence alone is unreliable, since a well-regularized model can also have a small KL term; per-dimension diagnostics such as the active-units count sketched below are more informative. Conversely, weighting the KL term too aggressively degrades reconstruction quality.
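A complementary diagnostic is the active-units count of Burda et al. (2016): a latent dimension counts as active if the encoder's posterior mean actually varies across inputs. In this sketch, `encoder` returning (mu, logvar) and the 0.01 variance threshold are assumptions for illustration:

```python
import torch

@torch.no_grad()
def count_active_units(encoder, data_loader, threshold=0.01):
    means = []
    for x in data_loader:
        mu, _logvar = encoder(x)  # assumes encoder returns (mu, logvar)
        means.append(mu)
    mu_all = torch.cat(means, dim=0)
    # Variance of the posterior mean across the dataset, per dimension;
    # collapsed dimensions have near-constant means and tiny variance.
    activity = mu_all.var(dim=0)
    return int((activity > threshold).sum().item())
```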
Origin & History
Bowman et al. (2016) first identified posterior collapse in text VAEs with autoregressive decoders and proposed KL annealing (together with word dropout) as a countermeasure; the free-bits objective was introduced by Kingma et al. (2016). Higgins et al. (2017) introduced β-VAE, which explicitly weights the KL term. The problem remains an active area of research.
Comparisons & Differences
Posterior Collapse vs. Mode Collapse (GAN)
Mode collapse: a GAN generator covers only a few modes of the data distribution, so samples lack diversity. Posterior collapse: a VAE encoder ignores its inputs and the posterior matches the prior, so the latents lack information.
Posterior Collapse vs. Overfitting
Overfitting means the model memorizes its training data; posterior collapse means the latent structure is ignored entirely, even when the model otherwise generalizes well.