CS 5330 · Assessment #2

Learning Without Labels

An interactive walkthrough of contrastive self-supervised learning via SimCLR. Explore how embedding spaces form, how augmentation creates positive pairs, and how the NT-Xent loss works.

Chen et al. 2020 — arXiv:2002.05709
Demo 01
Embedding Space Training
Watch how contrastive learning organizes representations. Each colored cluster is a different class. At epoch 0 embeddings are random — by epoch 50 they've separated into tight clusters without a single label.
What to look for: Points of the same color (class) pull together while different classes push apart. This happens purely from the positive/negative pair signal — no class labels are used.
Epoch 0 / 50
NT-Xent Loss
Legend
How it works: Each point is a 2D embedding. In this demo, same-class points act as positive pairs, so the loss pulls them together; all cross-class pairs are negatives, so they get pushed apart.
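The pull/push dynamic the demo visualizes can be sketched in a few lines of NumPy. This is a toy caricature, not real SimCLR: the class labels here only define which pairs count as positive (as in the demo), the "pull" step moves each point toward its positive-pair cluster, and the "push" step spreads clusters apart. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 classes × 10 points, random 2D embeddings (epoch 0).
labels = np.repeat(np.arange(3), 10)
z = rng.normal(size=(30, 2))

def mean_intra_class_dist(z, labels):
    """Average distance between points that share a class (cluster tightness)."""
    d = np.linalg.norm(z[:, None] - z[None, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)
    return d[same].mean()

before = mean_intra_class_dist(z, labels)

# Crude contrastive update over 50 "epochs", as in the demo slider:
# positives pull each point toward its cluster mean, negatives push
# every point away from the global mean.
for _ in range(50):
    for c in range(3):
        mask = labels == c
        z[mask] += 0.1 * (z[mask].mean(axis=0) - z[mask])  # pull together
    z += 0.02 * (z - z.mean(axis=0))                       # push apart

after = mean_intra_class_dist(z, labels)
assert after < before  # same-class clusters have tightened
```

The pair signal alone is enough to separate the clusters; nothing in the update uses the labels except to decide which pairs are positive.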
Demo 02
Augmentation Playground
Toggle augmentations on/off and see how two different views of the same image form a positive pair. The model must learn that these two views — despite looking different — represent the same underlying image.
SimCLR key finding: Crop + Color Jitter together is the strongest combination. No single augmentation suffices on its own; composing them is what forces the encoder to learn meaningful structure (without color jitter, the model can shortcut by matching color statistics across crops).
View A — x̃ᵢ
View B — x̃ⱼ
Both views are a positive pair — same image, different augmentations. The encoder must produce similar embeddings for both.
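A minimal sketch of how two stochastic views of one image form a positive pair, assuming a 32×32 grayscale stand-in image and using a random crop plus a brightness multiplier as a simple proxy for color jitter (the real SimCLR pipeline also flips, grayscales, and blurs):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in image: 32×32 grayscale gradient.
img = np.linspace(0, 1, 32 * 32).reshape(32, 32)

def augment(img, rng):
    """One stochastic view: random 24×24 crop + brightness jitter."""
    y, x = rng.integers(0, 32 - 24, size=2)
    crop = img[y:y + 24, x:x + 24]
    jitter = rng.uniform(0.8, 1.2)          # crude color-jitter stand-in
    return np.clip(crop * jitter, 0.0, 1.0)

# Two different augmentations of the SAME image → a positive pair.
view_a = augment(img, rng)   # x̃ᵢ
view_b = augment(img, rng)   # x̃ⱼ

assert view_a.shape == view_b.shape == (24, 24)
assert not np.allclose(view_a, view_b)  # the views differ pixel-wise
```

The encoder never sees that the two views share a source; it only sees that their embeddings are supposed to match, which is exactly the positive-pair signal.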
Demo 03
NT-Xent Loss Explorer
Adjust temperature τ and batch size N to see how the loss landscape changes. The chart shows loss values across different positive-pair similarity scores.
Temperature τ 0.07
Controls sharpness. Lower τ → model focuses on hardest negatives.
Batch Size N 256
Larger batches = more negatives = harder problem = better learning.
Positive Similarity 0.80
sim(zᵢ, zⱼ) — cosine similarity of the positive pair.
ℓ(i,j) = −log [
  exp(sim(zᵢ,zⱼ)/τ)
  ————————————————————————
  Σₖ₌₁²ᴺ 𝟙[k≠i] exp(sim(zᵢ,zₖ)/τ)
]
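The per-pair loss can be computed directly in NumPy. This sketch follows the formula above for a single anchor: the denominator sums over the positive plus every in-batch negative (all k ≠ i). The function name and argument layout are illustrative, not from the demo's code.

```python
import numpy as np

def nt_xent_pair(z_i, z_j, z_negs, tau=0.07):
    """ℓ(i,j) for one positive pair (z_i, z_j).

    z_negs is an (M, d) array of the other in-batch embeddings,
    which all act as negatives for anchor z_i.
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    pos = np.exp(cos(z_i, z_j) / tau)
    # Denominator: every k ≠ i, i.e. the positive plus all negatives.
    denom = pos + sum(np.exp(cos(z_i, z_k) / tau) for z_k in z_negs)
    return -np.log(pos / denom)

rng = np.random.default_rng(0)
z_negs = rng.normal(size=(6, 8))   # 6 negatives in 8-D
z = rng.normal(size=8)

easy = nt_xent_pair(z, z, z_negs, tau=0.5)    # sim(zᵢ,zⱼ) = 1.0
hard = nt_xent_pair(z, -z, z_negs, tau=0.5)   # sim(zᵢ,zⱼ) = −1.0
assert easy < hard  # higher positive similarity → lower loss
```

This also makes the two sliders' effects concrete: shrinking τ sharpens the exponentials so the hardest negatives dominate the denominator, and growing the batch adds terms to the denominator, making the positive harder to pick out.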
Loss ℓ(i,j)
Negatives
Numerator
Denominator
Loss vs. Positive Pair Similarity — shaded region shows current τ's effect
Loss vs. Batch Size (at current τ and similarity)