Siamese / Contrastive Networks
Best for: Similarity learning
How it works
Two or more identical encoder branches with tied weights map inputs to embedding vectors $z=f_\theta(x)$, so that a similarity measure such as cosine similarity or the dot product reflects semantic closeness. Training optimises a contrastive objective that pulls the embeddings of positive pairs together and pushes negatives apart, e.g. the InfoNCE loss $\mathcal{L}_{ij}=-\log\frac{\exp(\mathrm{sim}(z_i,z_j)/\tau)}{\sum_k\exp(\mathrm{sim}(z_i,z_k)/\tau)}$, where $(i,j)$ is a positive pair, $\tau>0$ is a temperature, and $k$ ranges over the candidates compared with $z_i$ (the positive $j$ together with the negatives). The result is a metric space that is useful for verification and retrieval, and the trained encoder can serve as a self-supervised pre-trained backbone.
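As a rough illustration of the InfoNCE loss, the sketch below computes it over a batch with NumPy. It rests on assumptions not stated in the text: row `i` of `anchors` and row `i` of `positives` form the positive pair, every other row of `positives` is treated as a negative (the common "in-batch negatives" setup), similarity is cosine, and the function name `info_nce` and the default `tau=0.1` are illustrative choices only.

```python
import numpy as np

def info_nce(anchors, positives, tau=0.1):
    """Mean InfoNCE loss over a batch (illustrative sketch).

    Assumption: anchors[i] and positives[i] are a positive pair, and
    positives[k] for k != i serve as in-batch negatives for anchors[i].
    """
    # L2-normalise so that the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # logits[i, k] = sim(z_i, z_k) / tau
    logits = (a @ p.T) / tau

    # Subtract the row maximum for numerical stability (log-softmax is unchanged).
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


# Well-separated embeddings with correct pairing give a small loss;
# pairing each anchor with the wrong row gives a large one.
z = np.eye(4)
aligned = info_nce(z, z)
misaligned = info_nce(z, np.roll(z, 1, axis=0))
```

When all embeddings are identical the softmax is uniform, so the loss equals $\log N$ for a batch of $N$ pairs. In a real Siamese setup both inputs would come from the same encoder $f_\theta$, with gradients flowing through both branches; that requires an autodiff framework rather than plain NumPy.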
Common fields
Face verification · product matching · semantic search