ResNet
A CNN architecture with skip connections (residual connections) that enables training of very deep networks.
Introduced in 2015, ResNet is still the standard backbone for transfer learning in computer vision.
Explanation
ResNets mitigate the vanishing gradient problem through residual connections: instead of learning a target mapping H(x) directly, each block learns the residual F(x) = H(x) − x and outputs F(x) + x, with the identity shortcut passing the input straight to the block's output. Because gradients flow unimpeded along these shortcuts, networks with 100+ layers became trainable for the first time.
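To make the mechanism concrete, here is a minimal sketch of a basic residual block in PyTorch. It assumes equal input and output channels, so the identity shortcut needs no projection; the blocks in the full ResNet also handle strides and channel changes.

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Minimal residual block: output = ReLU(F(x) + x).

    Simplifying assumption: in_channels == out_channels and stride 1,
    so the identity shortcut needs no projection.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x                          # skip connection: keep the input
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))       # F(x)
        out = out + residual                  # F(x) + x
        return self.relu(out)

block = BasicResidualBlock(64)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 64, 56, 56])
```

The key line is `out = out + residual`: during backpropagation the gradient passes through this addition unchanged, which is what keeps very deep stacks trainable.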
Marketing Relevance
ResNet is the most widely used backbone for transfer learning in computer vision – from feature extraction to fine-tuning.
Example
A pre-trained ResNet-50 is used as a feature extractor for a product image search engine.
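A minimal sketch of this setup with PyTorch and torchvision (the weights enum assumes torchvision ≥ 0.13; "product.jpg" is a placeholder path): the classification head is dropped and the pooled 2048-dimensional features serve as embeddings for similarity search.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# Load a ResNet-50 pre-trained on ImageNet and drop its classification head,
# keeping everything up to the global average pooling layer.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
feature_extractor = nn.Sequential(*list(resnet.children())[:-1])
feature_extractor.eval()

# Standard ImageNet preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# "product.jpg" is a placeholder for one catalog image.
image = Image.open("product.jpg").convert("RGB")
batch = preprocess(image).unsqueeze(0)  # shape: (1, 3, 224, 224)

with torch.no_grad():
    embedding = feature_extractor(batch).flatten(1)  # shape: (1, 2048)
```

Embeddings produced this way can be indexed and compared with cosine similarity to surface visually similar products.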
Common Pitfalls
ResNet is often oversized for simple tasks, and deeper variants (e.g., ResNet-152) are not always better than shallower ones such as ResNet-18 or ResNet-50. On small datasets it is prone to overfitting without data augmentation (see the sketch below).
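A common remedy for the last two pitfalls is to pick a smaller variant and fine-tune only the classification head with light augmentation. A minimal sketch, again assuming torchvision ≥ 0.13 and a hypothetical 10-class task:

```python
import torch.nn as nn
from torchvision import models, transforms

# Light augmentation for fine-tuning; the exact choices depend on the task.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# A smaller variant (ResNet-18) is often sufficient for simple tasks.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False                  # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 10)   # 10 = placeholder class count
```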
Origin & History
Kaiming He et al. (Microsoft Research) published ResNet in 2015 and won the ImageNet Challenge (ILSVRC 2015) with a 152-layer network, reaching a top-5 error of 3.57%, below the commonly cited estimate of human error. The paper, "Deep Residual Learning for Image Recognition", became one of the most cited in AI history.
Comparisons & Differences
ResNet vs. VGG
VGG stacks plain convolutional layers (at most 19 in VGG-19); making it deeper causes accuracy to degrade. ResNet's skip connections avoid this degradation and let it scale to 100+ layers.
ResNet vs. Vision Transformer (ViT)
ResNet is CNN-based and processes images with local convolutional filters; ViT splits the image into patches and applies global self-attention. ViT typically needs more training data but scales better with it.