Week 3: Image Classification with PyTorch
This week introduces students to neural networks for machine vision with PyTorch. The focus is on using pretrained models to make predictions and on evaluating network performance. Students will also be introduced to the Hugging Face model hub as a source of vision models.
Focus
- Use pretrained models for image classification
- Access and run image models from Hugging Face Hub
- Understand how image data is represented and processed
- Learn how CNNs work at a high level
- Visualize predictions and monitor model performance
- Understand class imbalance, precision/recall trade-offs, and bias in ML models
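To make the class imbalance and precision/recall points concrete, here is a minimal sketch in plain Python. The labels are made up for illustration (95 negatives, 5 positives), and the helper names `precision_recall` and `accuracy` are hypothetical, not part of any library. It shows how a model that always predicts the majority class can score higher accuracy than a model that actually finds most of the positives.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Define the ratio as 0.0 when the denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up imbalanced data: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Model A always predicts the majority (negative) class.
always_negative = [0] * 100

# Model B raises 6 false alarms but finds 4 of the 5 positives.
y_pred = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

print("A:", accuracy(y_true, always_negative), precision_recall(y_true, always_negative))
print("B:", accuracy(y_true, y_pred), precision_recall(y_true, y_pred))
```

Model A reaches 0.95 accuracy with zero recall, while Model B reaches 0.93 accuracy with precision 0.4 and recall 0.8, which is why accuracy alone can mislead on imbalanced data.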
Hands-On Activities
- Load and run a pretrained model (e.g., ResNet18) from torchvision.models
- Load and run a Hugging Face vision model (e.g., ViT) for image classification
- Preprocess and classify images using PyTorch and Hugging Face
- Plot training and validation accuracy/loss
- Read and discuss the Amazon résumé screening case and how bias can emerge from data
Learning Outcomes
By the end of this week, students will be able to:
- Use a pretrained PyTorch model to classify new images
- Use Hugging Face Hub to explore and run vision models
- Understand the input/output structure of vision models
- Interpret common evaluation metrics for vision models (e.g., accuracy, precision, recall)
- Describe the basic architecture and function of a CNN
- Identify signs of overfitting from loss/accuracy plots
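As one way to make the overfitting outcome concrete, the sketch below applies a simple heuristic to per-epoch loss values: validation loss has stopped improving for a few epochs while training loss keeps falling. The function names, the `patience` parameter, and the loss values are all made up for illustration. This is a rule of thumb, not a standard definition.

```python
def best_epoch(val_losses):
    """0-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


def looks_overfit(train_losses, val_losses, patience=2):
    """True if validation loss has not improved for `patience` epochs
    while training loss has continued to fall since the best epoch."""
    best = best_epoch(val_losses)
    stalled = (len(val_losses) - 1 - best) >= patience
    train_still_falling = train_losses[-1] < train_losses[best]
    return stalled and train_still_falling


# Made-up curves: training loss keeps dropping, validation loss turns upward.
train = [1.20, 0.80, 0.55, 0.40, 0.28, 0.19]
val = [1.25, 0.90, 0.70, 0.68, 0.74, 0.83]

print("best epoch:", best_epoch(val))
print("overfitting suspected:", looks_overfit(train, val))
```

On a plot of these two curves, the same pattern appears as a widening gap after the epoch where validation loss bottoms out.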
Resources
- torchvision.models documentation
- Hugging Face Vision Models
- Intro to Hugging Face
- Reuters article on bias at Amazon
Instructor Notes
Focus on inference to keep things practical and engaging. Emphasize that most real-world ML use involves using pretrained models. Consider keeping runtime short (e.g., use small datasets and few epochs).
Check how many students have GPUs on their local machines, since a lack of GPUs may limit what can be run in class. Consider using the fashion3k dataset to limit runtime.
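Since GPU availability may vary across student machines, notebooks can select the device at runtime and fall back to the CPU. The following is a minimal sketch; `pick_device` is a hypothetical helper name, and the random tensor only stands in for a batch of images.

```python
import torch


def pick_device() -> torch.device:
    """Prefer a CUDA GPU when PyTorch can see one, otherwise use the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()
# Stand-in for a batch of four RGB images, moved to the chosen device.
batch = torch.rand(4, 3, 224, 224).to(device)
print("running on:", device)
```

A model needs to be moved the same way (`model.to(device)`) so that it and its inputs live on the same device.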