
    Posterior Collapse

    Also known as:
    KL Vanishing
    Latent Variable Collapse
    VAE Posterior Collapse
    Updated: 2/11/2026

    Posterior collapse occurs in variational autoencoders (VAEs) when the encoder learns to copy the prior instead of producing informative latent representations.

    Quick Summary

    Posterior collapse = the VAE encoder ignores its input and copies the prior; the latent variables become useless even though the decoder still produces good outputs.

    Explanation

    When the decoder is powerful enough to model the data on its own (e.g., an autoregressive decoder), it learns to ignore the latent code, and the approximate posterior q(z|x) collapses to the prior N(0, I). The KL term of the ELBO goes to zero, but the latent variables then carry no information about the input. Common countermeasures: KL annealing, free bits, and β-VAE, sketched below.
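    A minimal sketch (PyTorch assumed) of how these three countermeasures modify the KL term of the loss; all names and hyperparameter values here are illustrative, not from the original article:

        import torch

        def kl_per_dim(mu, logvar):
            # Analytic KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior,
            # kept per latent dimension: shape (batch, latent_dim).
            return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar)

        def vae_loss(recon_loss, mu, logvar, step, anneal_steps=10_000,
                     beta=1.0, free_bits=0.25):
            kl = kl_per_dim(mu, logvar)
            # Free bits: stop penalizing a dimension once its KL drops below a
            # floor, so collapsing it all the way to zero gains nothing.
            kl = torch.clamp(kl, min=free_bits).sum(dim=1).mean()
            # KL annealing: ramp the KL weight from 0 to beta early in training,
            # letting the decoder learn to use z before the penalty kicks in.
            # beta > 1 gives the beta-VAE objective; beta < 1 weakens the penalty.
            weight = beta * min(1.0, step / anneal_steps)
            return recon_loss + weight * kl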

    Marketing Relevance

    Posterior collapse makes VAE-based generative tools useless for controlled generation: the latent variables no longer control anything about the output.

    Example

    A text VAE generates fluent sentences, but moving through latent space has no effect on the output; everything comes from the autoregressive decoder. A quick check for this symptom is sketched below.
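    A hypothetical diagnostic sketch for this symptom: decode pairs of very different latent codes and compare the outputs. `decoder` and `latent_dim` are placeholders for your own model (here assumed to return a continuous tensor, e.g., token logits):

        import torch

        @torch.no_grad()
        def latent_sensitivity(decoder, latent_dim, n_pairs=32, scale=3.0):
            # Decode pairs of very different latent codes; near-identical outputs
            # across wildly different z is a strong sign the decoder ignores z.
            z1 = torch.randn(n_pairs, latent_dim) * scale
            z2 = torch.randn(n_pairs, latent_dim) * scale
            return (decoder(z1) - decoder(z2)).abs().mean().item()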

    Common Pitfalls

    Diagnosing posterior collapse from low KL divergence alone: a low KL can also mean the model is simply well regularized, so check whether the latents actually vary with the input (see the sketch below). Conversely, weighting the KL term too aggressively to avoid collapse hurts reconstruction quality.
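    A sketch of a more informative check than the total KL: count the active latent units, i.e., dimensions whose posterior mean actually varies across inputs. `encode` is assumed to return the posterior mean and log-variance of q(z|x); the function name and loader are illustrative:

        import torch

        @torch.no_grad()
        def active_units(encode, data_loader, threshold=1e-2):
            # A latent dimension is "active" if its posterior mean varies across
            # inputs; the 0.01 variance threshold follows Burda et al. (2016).
            means = [encode(x)[0] for x in data_loader]  # posterior means, (batch, dim)
            variances = torch.cat(means, dim=0).var(dim=0)
            return int((variances > threshold).sum().item())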

    Origin & History

    Bowman et al. (2016) first identified posterior collapse in text VAEs and proposed KL annealing as a countermeasure; the free-bits technique followed in Kingma et al. (2016). Higgins et al. (2017) introduced β-VAE, which explicitly weights the KL term. The problem remains actively researched.

    Comparisons & Differences

    Posterior Collapse vs. Mode Collapse (GAN)

    Mode collapse: a GAN generator covers only a few modes of the data distribution, producing low-diversity samples. Posterior collapse: a VAE encoder ignores its input, and the approximate posterior falls back to the prior.

    Posterior Collapse vs. Overfitting

    Overfitting means memorizing the training data at the cost of generalization; posterior collapse means the model stops using its latent variables entirely, however well it fits the data.
