Facial Keypoints Detection

ResNet predictions on Kaggle test faces

Model: ResNet (12 blocks) · RMSE: 2.10 · Keypoints: 15 per face · Input: 96×96 grayscale


Detected Keypoints (15 points)

Architecture

6-stage ResNet with 12 residual blocks (32→512 channels), BatchNorm, and 1×1 conv skip connections. ~4.2M parameters.
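As a rough illustration of the building block described above, here is a minimal PyTorch sketch of one residual block with BatchNorm and a 1×1 convolution on the skip path. The class name `ResidualBlock`, the 3×3 kernel size, the ReLU activations, and the stride handling are assumptions made for the example; the exact layer layout of the actual model is not given here.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical residual block: two 3x3 convs plus a 1x1 conv skip path."""

    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Main path: conv -> BN -> ReLU -> conv -> BN (kernel size is assumed).
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Skip path: a 1x1 conv matches channel count and spatial size.
        self.skip = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the two paths, then apply the final activation.
        return self.act(self.body(x) + self.skip(x))


# Example: a batch of two 96x96 grayscale faces mapped to 32 channels.
block = ResidualBlock(in_ch=1, out_ch=32, stride=2)
features = block(torch.zeros(2, 1, 96, 96))
```

With `stride=2` the block halves the spatial resolution while changing the channel count, which is one common way to move between stages; whether the model downsamples this way is not stated.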

Training

Adam optimizer (lr=1e-4) with StepLR scheduling (step=5, gamma=0.1). Early stopping with patience=5 on validation loss.
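To make the schedule concrete, the following framework-free sketch reproduces the arithmetic of a step decay with step=5 and gamma=0.1, plus a simple patience counter. The names `step_lr` and `EarlyStopping` are hypothetical helpers written for this example, and the strict "lower is better" comparison is an assumption; the actual training loop may differ in detail.

```python
def step_lr(epoch: int, base_lr: float = 1e-4, step_size: int = 5, gamma: float = 0.1) -> float:
    """Learning rate at a given epoch: multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


class EarlyStopping:
    """Hypothetical helper: signal a stop after `patience` epochs without improvement."""

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, val_loss: float) -> bool:
        # Assumption: any strictly lower validation loss counts as improvement.
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Example: the learning rate drops tenfold at epochs 5 and 10.
rates = [step_lr(e) for e in (0, 4, 5, 10)]

# Example: the loss stops improving after the second epoch.
stopper = EarlyStopping(patience=5)
stops = [stopper.update(v) for v in (1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99)]
```

Under these settings the rate is 1e-4 for epochs 0 to 4, 1e-5 for epochs 5 to 9, and so on, so with patience=5 a stalled run is typically stopped within one or two decay steps.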

Key Innovation

A NaN-aware MSE loss masks missing labels, so training uses all 7,049 samples instead of discarding the ~70% with incomplete annotations.
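The masking idea can be sketched in a few lines. The version below uses NumPy for illustration and assumes missing coordinates are stored as NaN in the target array; the function name `nan_aware_mse` and the choice to average over labeled entries only are assumptions, and the same pattern would carry over to framework tensors.

```python
import numpy as np


def nan_aware_mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean squared error over labeled entries only (NaN targets are ignored)."""
    mask = ~np.isnan(target)          # True where a label exists
    n_valid = int(mask.sum())
    if n_valid == 0:
        return 0.0                    # Assumption: an unlabeled batch contributes no loss
    # Zero out the error at missing labels so NaN never reaches the sum.
    diff = np.where(mask, pred - target, 0.0)
    return float((diff ** 2).sum() / n_valid)


# Example: two samples with two coordinates each, one label missing.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, np.nan], [5.0, 4.0]])
loss = nan_aware_mse(pred, target)    # squared errors 0, 4, 0 over 3 labeled entries
```

Dividing by the number of labeled entries rather than the full array size keeps the loss scale comparable between fully and partially annotated samples.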

Output

30 values (x, y for each of 15 keypoints) covering the eyes, eyebrows, nose tip, mouth corners, and lips. Predictions are shown as red dots overlaid on each face.
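For plotting, the flat output vector has to be regrouped into points. The sketch below assumes the 30 values are interleaved as x0, y0, x1, y1, and so on, which is not stated here; `to_keypoints` is a hypothetical helper name.

```python
def to_keypoints(values):
    """Group a flat 30-value prediction into 15 (x, y) pairs.

    Assumes interleaved ordering: x0, y0, x1, y1, ...
    """
    if len(values) != 30:
        raise ValueError(f"expected 30 values, got {len(values)}")
    return list(zip(values[0::2], values[1::2]))


# Example with placeholder numbers standing in for a model prediction.
points = to_keypoints(list(range(30)))
```

Each resulting pair can then be passed to a scatter call to draw one dot over the 96×96 image.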
