What it covers
A benchmarked comparison of four deep learning approaches on a 74-class fine-grained dog breed dataset (12,891 images):
- CNN from scratch — VGG-style stacked convolutions + max pooling
- Tuned CNN — adds Dropout regularization and an additional conv block
- Transfer learning with InceptionV3 — frozen ImageNet backbone with a trainable head
- Data augmentation — flips and contrast on top of the scratch CNN
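The transfer-learning variant above can be sketched in a few lines of Keras. This is a minimal illustration, not the project's exact code: the 74-class head comes from the dataset description, while the 299×299 input size (InceptionV3's native resolution), the pooling layer, and the Dropout rate are assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 74          # breed classes in the dataset
IMG_SIZE = (299, 299)     # InceptionV3's native input size (assumed here)

# Frozen ImageNet backbone: only the new classification head trains.
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,)
)
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),  # rate is illustrative
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

Freezing the backbone means gradient updates touch only the head, which is what makes training tractable on a ~13k-image dataset.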
Headline result
- Scratch CNN: ~5-7% validation accuracy (barely above chance on 74 classes)
- Tuned CNN with Dropout: ~11% validation accuracy
- InceptionV3 transfer learning (frozen backbone, trainable head): ~96% validation accuracy
Takeaway
On a 13k-image dataset with 74 classes, transfer learning isn't an optimization: it's the only viable approach. With only ~175 images per class, the scratch CNN simply doesn't see enough data to learn useful visual representations on its own. ImageNet pretraining gives InceptionV3 the visual prior it needs to adapt quickly to the target task.
Data augmentation on the scratch CNN actually made things worse: the model was too shallow to benefit from the added distributional variety, so augmentation only made the training distribution harder to fit. A cautionary tale about the "always augment" reflex.
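The flips-and-contrast augmentation described above maps onto Keras preprocessing layers. A minimal sketch; the contrast factor and the random batch standing in for real images are assumptions:

```python
import tensorflow as tf

# Augmentation stage matching the description: random flips + contrast.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomContrast(0.2),  # factor is illustrative
])

# Dummy batch standing in for real dog images.
images = tf.random.uniform((4, 180, 180, 3))
out = augment(images, training=True)  # augmentation is active only in training
```

Note `training=True`: these layers are identity functions at inference time, so augmentation cost is paid only during training.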
Stack
- Framework: TensorFlow / Keras
- Data pipeline: tf.data.Dataset
- Environment: Google Colab with GPU
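The tf.data pipeline pattern typically looks like the sketch below. The shuffle/batch/prefetch chain is the standard idiom; the synthetic tensors, batch size, and image size are assumptions standing in for the project's on-disk loading:

```python
import tensorflow as tf

# Synthetic stand-ins for decoded images and integer breed labels.
images = tf.random.uniform((32, 299, 299, 3))
labels = tf.random.uniform((32,), maxval=74, dtype=tf.int32)

ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(32)                      # shuffle within a buffer
    .batch(8)                         # batch size is illustrative
    .prefetch(tf.data.AUTOTUNE)       # overlap preprocessing with GPU compute
)
```

`prefetch(AUTOTUNE)` matters most on Colab GPUs: it keeps the input pipeline from starving the accelerator between steps.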