Week 3: Image Classification with PyTorch

This week introduces students to neural networks for machine vision with PyTorch. The focus is on using pretrained models to make predictions and on evaluating network performance. Students are also introduced to the Hugging Face Hub as a source of vision models.

Focus

Use pretrained models for image classification
Access and run image models from Hugging Face Hub
Understand how image data is represented and processed
Learn how CNNs work at a high level
Visualize predictions and monitor model performance
Understand class imbalance, precision/recall trade-offs, and bias in ML models
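
To make the last focus item concrete, here is a minimal sketch in plain Python of why accuracy alone can mislead on imbalanced data. The labels are made up for illustration (95 negatives, 5 positives), and the helper names `accuracy` and `precision_recall` are our own, not part of any library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the `positive` class (0.0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up, imbalanced ground truth: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Classifier A never predicts the positive class.
always_negative = [0] * 100

# Classifier B raises 8 false alarms but finds 4 of the 5 positives.
more_sensitive = [1] * 8 + [0] * 87 + [1] * 4 + [0] * 1

for name, y_pred in [("A", always_negative), ("B", more_sensitive)]:
    p, r = precision_recall(y_true, y_pred)
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f}")
```

Classifier A scores higher on accuracy even though it never detects a positive case, while classifier B gives up some precision to gain recall. That is the trade-off students should be able to discuss.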

Hands-On Activities

Load and run a pretrained model (e.g., ResNet18) from torchvision.models
Load and run a Hugging Face vision model (e.g., ViT) for image classification
Preprocess and classify images using PyTorch and Hugging Face
Plot training and validation accuracy/loss
Read and discuss the Amazon résumé screening case and how bias can emerge from data

Learning Outcomes

By the end of this week, students will be able to:

Use a pretrained PyTorch model to classify new images
Use Hugging Face Hub to explore and run vision models
Understand the input/output structure of vision models
Compare evaluation metrics for vision models (e.g., accuracy, precision, recall)
Describe the basic architecture and function of a CNN
Identify signs of overfitting from loss/accuracy plots
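
As an illustration of the last outcome, the sketch below flags the usual visual signature of overfitting: validation loss stops improving while training loss keeps falling. The loss values are invented, and the `patience` rule is a simple heuristic chosen for this example, not a standard definition.

```python
def overfitting_epoch(train_loss, val_loss, patience=2):
    """Return the epoch index with the lowest validation loss if the curves
    then diverge (no validation improvement for at least `patience` epochs
    while training loss keeps falling); otherwise return None."""
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    epochs_since_best = len(val_loss) - 1 - best
    train_still_falling = train_loss[-1] < train_loss[best]
    if epochs_since_best >= patience and train_still_falling:
        return best
    return None


# Invented curves: validation loss bottoms out at epoch 2, then rises.
train = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18]
val = [1.10, 0.80, 0.65, 0.66, 0.72, 0.80]

print(overfitting_epoch(train, val))
```

On a loss plot this corresponds to the point where the two curves start to separate; training past that epoch mostly memorizes the training set.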

Resources

Instructor Notes

Focus on inference to keep things practical and engaging. Emphasize that most real-world ML work relies on pretrained models. Keep runtime short (e.g., use small datasets and few epochs).

We need to find out how many students have GPUs on their local machines, since this may be a limitation. Consider using the fashion3k dataset to limit runtime.
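
One quick way to survey hardware is to have each student run a short device check like the sketch below. The helper name `pick_device` is ours, and the Apple `mps` branch is an addition beyond what these notes mention; it is guarded so the code still runs on PyTorch builds without that backend.

```python
import torch


def pick_device() -> torch.device:
    """Prefer a CUDA GPU, then Apple's MPS backend, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(f"Running on: {device}")

# Models and tensors are moved to the chosen device the same way in all cases.
x = torch.zeros(2, 3, device=device)
```

Code written against `device` runs unchanged on CPU-only machines, just more slowly, which is another reason to keep datasets and epoch counts small.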