Convolutional Neural Networks
- CNN
- ConvNet
- ResNet
Best for: Images, spatial data Aliases: CNN, ConvNet, ResNet
How it works
$$Y(i,j)=b+\sum_m\sum_n W(m,n)\,X(i-m,j-n)$$Each convolutional layer computes a stack of feature maps via local filters, $Y(i,j)=b+\sum_m\sum_n W(m,n)\,X(i-m,j-n)$ (cross-correlation), sharing weights across spatial positions to gain translation equivariance. Stacking convolutions with nonlinearities (ReLU) and pooling builds a hierarchy from edges to object parts, with a large receptive field at low parameter cost. Modern variants like ResNet add skip connections $y=F(x)+x$ to ease gradient flow in very deep stacks, and the whole network trains end-to-end by backpropagation on the task loss.
When to use
Spatial data with local, translation-invariant patterns — classic image classification, detection, and feature extraction.
Watch out
Lose global context vs. transformers on long ranges; need strong augmentation; large models overfit small datasets.
Common fields
Computer vision · medical imaging · defect detection · satellite imagery