CS 5330 · Assessment #2
Learning Without Labels
An interactive walkthrough of contrastive self-supervised learning via SimCLR. Explore how embedding spaces form, how augmentation creates positive pairs, and how the NT-Xent loss works.
Chen et al. 2020 — arXiv:2002.05709
Demo 02
Augmentation Playground
Toggle augmentations on/off and see how two different views of the same image form a positive pair. The model must learn that these two views — despite looking different — represent the same underlying image.
SimCLR key finding: random crop + color jitter is the strongest combination. No single augmentation suffices on its own — it is their composition that forces the encoder to learn meaningful structure rather than shortcut features like color histograms.
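To make the two-view idea concrete, here is a minimal NumPy sketch of generating a positive pair. The `two_views` function, the crop size, and the brightness-jitter range are illustrative assumptions, not the playground's actual pipeline — SimCLR's full recipe also includes flips, grayscale conversion, and Gaussian blur.

```python
import numpy as np

def two_views(img, rng, crop=24, out=32):
    """Return two independently augmented views of one image (a positive pair).

    Sketch only: random crop + brightness jitter stand in for SimCLR's
    full augmentation pipeline. `img` is an H x W x 3 float array in [0, 1].
    """
    def augment(x):
        h, w, _ = x.shape
        # Random crop: pick a crop x crop window at a random position...
        top = rng.integers(0, h - crop + 1)
        left = rng.integers(0, w - crop + 1)
        patch = x[top:top + crop, left:left + crop]
        # ...then nearest-neighbor resize back to out x out.
        idx = np.arange(out) * crop // out
        patch = patch[idx][:, idx]
        # Color jitter: each view gets its own random brightness scale.
        return np.clip(patch * rng.uniform(0.6, 1.4), 0.0, 1.0)

    return augment(img), augment(img)
```

Because each call to `augment` draws its own crop position and brightness factor, the two returned views differ even though they come from the same image — exactly the pair the encoder must map to nearby embeddings.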
Both views are a positive pair — same image, different augmentations. The encoder must produce similar embeddings for both.