Image Segmentation and Classification with fastai

Rajath Nag Nagaraj (Raj)
4 min read · Feb 27, 2024

A practical example of building image classification and segmentation models using fastai v2.

Image Credits: OpenAI. (2024). ChatGPT [Large language model]. /g/g-5GgNWdBMI-image-creator-generator-mid-journey-v6

Introduction to Image Segmentation and Classification:

Image segmentation goes beyond identifying which objects (cats, dogs, tables, and so on) appear in an image: it outlines their shapes at the pixel level by assigning every pixel to a class. This separates foreground objects from the background, which makes it possible to focus on the relevant image content.
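To make "pixel level" concrete, here is a minimal sketch of what a segmentation result looks like: a grid holding one class ID per pixel. The 4x5 mask and the two class codes are invented for illustration and do not come from any dataset.

```python
# Toy 4x5 segmentation mask: one class ID per pixel (values invented for illustration).
CODES = ["background", "cat"]  # hypothetical class codes

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

def pixel_counts(mask, codes):
    """Count how many pixels belong to each class."""
    counts = {name: 0 for name in codes}
    for row in mask:
        for class_id in row:
            counts[codes[class_id]] += 1
    return counts

def foreground_coords(mask, background_id=0):
    """Return the (row, col) position of every non-background pixel."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, class_id in enumerate(row)
        if class_id != background_id
    ]

counts = pixel_counts(mask, CODES)
coords = foreground_coords(mask)
print(counts)
print(coords)
```

A real segmentation model produces the same kind of grid, only at full image resolution and usually with many more classes.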

Why is image segmentation necessary?

Image segmentation plays a critical role in various applications, a few of which are listed below:

  • Medical Imaging: Identifying a tumor or other region of interest requires precise segmentation of that area.
  • Home Survey: Measuring key property features, such as tree overhang on roofs or roof surface area.
  • Self-driving cars: Segmenting pedestrians and other objects is necessary for autonomous driving.

Image classification is the process of assigning a category or class, such as cat vs. dog, to an entire image or to part of an image.

Why is image classification necessary?

  • Photo Organization: Imagine having to manually sort images of nickel coins from images of silver coins.
  • Content Moderation: Popular social media sites use content moderation models to catch images with sensitive or inappropriate content.
  • Medical Diagnosis: Distinguish tumor scans from non-tumor scans.

Part A: Understanding fastai and its ability to perform deep learning.

Fastai is a popular deep-learning library for training and deploying machine-learning models. It was developed by Jeremy Howard and Sylvain Gugger and is built on top of PyTorch, with the goal of making deep learning easier and more efficient for researchers and ML practitioners. Its high-level abstractions handle much of the complexity, so users can focus on their specific needs.

The fastai library is built with three goals in mind:

  • Accessibility: Make deep learning easier for practitioners.
  • Simplicity: Simplify the complex process of deep learning modeling and of building architectures.
  • Efficiency: Enable faster experimentation and shorter timelines for building deep learning models.

Part B: Exploring the code:

A step-by-step example of a CNN learner for image classification:

Install Necessary Packages:

pip install torch torchvision fastai

Step 1: Import Necessary Libraries:

from fastai.vision.all import *

Step 2: Load the Data:

path = untar_data(URLs.PETS)/'images'

Step 3: Prepare the Data:

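Prepare the data for training with fastai's ImageDataLoaders, a convenience layer over the more flexible DataBlock API. The call below labels each image from its file name, holds out 20% of the images as a validation set (valid_pct=0.2, with a fixed seed for reproducibility), and resizes every image to 224x224 pixels: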

def is_cat(x):
    # In this dataset, cat breeds have file names that start with an uppercase letter.
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
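As a quick sanity check of the labelling rule, the sketch below applies the same is_cat function to a few file names written in the style of the pets dataset. The specific file names are illustrative assumptions; only the uppercase-first-letter convention matters here.

```python
from pathlib import PurePosixPath

def is_cat(filename):
    """Same rule as above: cat breeds start with an uppercase letter."""
    return filename[0].isupper()

# File names in the style of the pets dataset (chosen for illustration).
samples = [
    PurePosixPath("images/Abyssinian_1.jpg"),
    PurePosixPath("images/beagle_32.jpg"),
    PurePosixPath("images/Bengal_7.jpg"),
    PurePosixPath("images/pug_100.jpg"),
]

# The label function sees only the file name, not the full path.
labels = {p.name: is_cat(p.name) for p in samples}

for name, label in labels.items():
    print(f"{name}: {'cat' if label else 'dog'}")
```

Because the function receives only the file name, the lowercase "images" directory does not affect the result.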

Step 4: Create the Model:

Create a convolutional neural network (CNN) model with fastai's vision_learner (named cnn_learner in older releases). A wide range of architectures is available to choose from, including the ResNet family; custom PyTorch models can generally be wrapped in a plain Learner instead.

# cnn_learner has been renamed to vision_learner in recent fastai releases
learn = vision_learner(dls, resnet34, metrics=error_rate)
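The error_rate metric passed to the learner is simply the fraction of validation images that were misclassified, that is, one minus accuracy. A minimal plain-Python sketch of the idea follows; it is not fastai's implementation, and the predictions and targets are made-up values.

```python
def error_rate(predictions, targets):
    """Fraction of predictions that differ from the true labels (1 - accuracy)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)

# Made-up class IDs for eight validation images (1 = cat, 0 = dog).
preds = [1, 0, 1, 1, 0, 0, 1, 0]
targs = [1, 0, 0, 1, 0, 1, 1, 0]

rate = error_rate(preds, targs)
print(f"error rate: {rate:.2f}, accuracy: {1 - rate:.2f}")
```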

Step 5: Train the Model:

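Train the model with the fit_one_cycle method. A pretrained learner starts with its body frozen, so this call trains only the newly added head. To train the head for a few epochs first and then unfreeze and train the whole network, use fine_tune instead and set its freeze_epochs argument.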

learn.fit_one_cycle(4)
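The "one cycle" in fit_one_cycle refers to the learning-rate schedule: the rate warms up to a peak and then anneals down to a very small value. The sketch below reproduces that shape in plain Python, assuming the commonly documented defaults (div=25, div_final=1e5, pct_start=0.25) and an illustrative peak of 1e-3. It is an approximation for intuition, not fastai's code, and it leaves out the momentum schedule that runs alongside it.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1 - math.cos(math.pi * pos)) * (end - start) / 2

def one_cycle_lr(pos, lr_max=1e-3, div=25.0, div_final=1e5, pct_start=0.25):
    """Learning rate at training progress pos in [0, 1] (assumed defaults)."""
    if pos < pct_start:
        # Warm-up phase: lr_max/div up to lr_max.
        return cos_anneal(lr_max / div, lr_max, pos / pct_start)
    # Annealing phase: lr_max down to lr_max/div_final.
    return cos_anneal(lr_max, lr_max / div_final, (pos - pct_start) / (1 - pct_start))

schedule = [one_cycle_lr(i / 10) for i in range(11)]
for i, lr in enumerate(schedule):
    print(f"{i * 10:3d}% of training: lr = {lr:.2e}")
```

The short warm-up followed by a long decay is why a single call with a modest number of epochs often trains well without manual learning-rate tuning.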

Step 6: Evaluate the Model:

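After training, you can evaluate the model's performance by viewing sample predictions next to their true labels and by plotting a confusion matrix. The training and validation losses are printed for each epoch during training.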

learn.show_results()
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

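The ClassificationInterpretation class gives a more detailed view of model performance, including the confusion matrix, the most frequently confused class pairs (most_confused), the highest-loss examples (plot_top_losses), and a per-class precision, recall, and F1 report (print_classification_report).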
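For intuition about what the interpretation tools report, here is a minimal plain-Python sketch of a confusion matrix and a per-class F1 score. It is independent of fastai, and the labels and predictions are made-up values.

```python
def confusion_matrix(actual, predicted, n_classes):
    """matrix[a][p] counts samples with true class a that were predicted as p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def f1_score(matrix, cls):
    """F1 for one class, computed from the confusion matrix."""
    tp = matrix[cls][cls]
    fp = sum(row[cls] for row in matrix) - tp  # predicted cls, actually another class
    fn = sum(matrix[cls]) - tp                 # actually cls, predicted another class
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Made-up validation results (0 = dog, 1 = cat).
actual    = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
predicted = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]

matrix = confusion_matrix(actual, predicted, n_classes=2)
cat_f1 = f1_score(matrix, cls=1)
print(matrix)
print(f"F1 for class 1: {cat_f1:.3f}")
```

Rows are true classes and columns are predicted classes, so off-diagonal cells show which classes the model confuses.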

A step-by-step example of a U-Net learner for image segmentation:

from fastai.vision.all import *

path = untar_data(URLs.CAMVID_TINY)

dls = SegmentationDataLoaders.from_label_func(
    path, bs=8, fnames=get_image_files(path/"images"),
    label_func=lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
    codes=np.loadtxt(path/'codes.txt', dtype=str)
)
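The label_func above tells fastai where to find the mask for each image: same file stem plus a _P suffix, inside the labels folder. The sketch below shows that mapping on its own with pathlib. The dataset root and the image file name are hypothetical placeholders, not paths read from the actual download.

```python
from pathlib import PurePosixPath

def label_path(root, image_path):
    """Map an image path to its mask path: labels/<stem>_P<suffix>."""
    return root / "labels" / f"{image_path.stem}_P{image_path.suffix}"

root = PurePosixPath("camvid_tiny")             # hypothetical dataset root
image = root / "images" / "0006R0_f02910.png"   # hypothetical image file name

mask = label_path(root, image)
print(image)
print(mask)
```

Each pixel value in the mask file is an index into the class names loaded from codes.txt.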

learn = unet_learner(dls, resnet34)

learn.fit_one_cycle(4)

learn.show_results(max_n=6, figsize=(7,8))

learn.unfreeze()

learn.fit_one_cycle(2, lr_max=slice(1e-6,1e-4))
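Passing a slice as lr_max requests discriminative learning rates: early layers get the small rate, the final layers get the large one, and groups in between are spaced multiplicatively. The sketch below illustrates that spreading in plain Python. It is an approximation of the behaviour rather than fastai's code, and the count of three parameter groups is an assumption made for the example.

```python
def spread_learning_rates(lr_slice, n_groups):
    """Spread slice(start, stop) over n_groups with even multiplicative steps."""
    start, stop = lr_slice.start, lr_slice.stop
    if n_groups == 1:
        return [stop]
    if start is None:
        # slice(stop) only: earlier groups get stop/10, the last group gets stop.
        return [stop / 10] * (n_groups - 1) + [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]

# Assumed: the unfrozen model exposes three parameter groups.
lrs = spread_learning_rates(slice(1e-6, 1e-4), n_groups=3)
for i, lr in enumerate(lrs):
    print(f"group {i}: lr = {lr:.1e}")
```

Keeping the pretrained early layers on a much smaller rate lets them adjust gently while the task-specific later layers change faster.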

Part C: Best Practices:

General Best Practices:

  • Data Preparation: Use data augmentation to improve model generalization. To load data efficiently, use ImageDataLoaders.from_folder for classification and SegmentationDataLoaders.from_label_func for segmentation.
  • Model Selection: Use cnn_learner or vision_learner for classification modeling. Use unet_learner or Learner to load segmentation models.
  • Learning Rate Finder: learn.lr_find() suggests a reasonable learning rate to start from.
  • Training: Use fit_one_cycle() for a quick training run and to inspect the losses. When starting from a pretrained model, fine_tune() first trains the head with the body frozen, then unfreezes and trains all layers.
  • Regularization and Avoiding Overfitting: Adjust weight decay (wd), the learning rate (lr), the number of frozen epochs (freeze_epochs), and image transformations as needed.
  • Evaluation and Interpretation: Use ClassificationInterpretation for classification and SegmentationInterpretation for segmentation.

Part D: Conclusion:

  • Fastai Simplifies Deep Learning: Fastai offers an accessible, high-level API for deep learning, making advanced tasks like image segmentation and classification more approachable for coders of all levels.
  • Vision Learner: The Vision Learner in Fastai provides a framework for approaching image classification tasks, streamlining the process from data preparation to model evaluation with practical examples.
  • UNet Learner: The UNet Learner is highlighted as a key tool for image segmentation projects, enabling precise pixel-wise classification and supporting a range of applications from medical imaging to autonomous driving.
  • Practical Guidance Provided: The post includes step-by-step guides and code snippets to help readers understand and implement their own models for both classification and segmentation tasks.

Written by Rajath Nag Nagaraj (Raj)

I go by "Raj" and I am a Senior Applied Gen-AI Scientist at a Fortune 100 Company.
