Decoding DINOv2: Meta’s Next-Gen Computer Vision Model and Its Integration into FastAI’s Learner

2 min read · Feb 28, 2024

A simple review of the model and a few ways to use it in modeling tasks.

Introduction:

DINOv2 is a computer vision (CV) model from Meta that uses self-supervised learning to match or beat other standard CV modeling approaches. It can be trained on any collection of images without metadata, learning directly from the image data supplied to it. The eye-catching fact is that its features can be used as-is, without fine-tuning.

DINOv2 produces high-performance features that can be fed directly to classifiers. Reported experiments show strong results on tasks such as classification and segmentation. The paper from Meta attributes this performance to the combination of self-supervised features and task-specific modules.
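One simple way to use frozen features without fine-tuning is a nearest-neighbor classifier over the embeddings. The sketch below assumes the embeddings have already been computed by a backbone such as DINOv2; the toy 3-dimensional vectors, the labels, and the helper name `knn_predict` are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_predict(query, bank, labels, k=3):
    """Majority vote among the k stored embeddings most similar to the query."""
    ranked = sorted(
        range(len(bank)),
        key=lambda i: cosine_similarity(query, bank[i]),
        reverse=True,
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical embeddings standing in for real backbone outputs.
feature_bank = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
]
bank_labels = ["forest", "forest", "urban", "urban"]

prediction = knn_predict([0.85, 0.15, 0.05], feature_bank, bank_labels, k=3)
print(prediction)
```

With real data, the feature bank would hold one embedding per labeled image, and the backbone itself would stay frozen.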

A well-known application listed on the Meta site comes from the World Resources Institute, which used AI to map forests tree by tree. DINOv2 helped the application map accurately and generalize.

Dataset:

DINOv2 was trained on publicly available datasets and web-scraped data, curated through a pipeline inspired by LASER. The final dataset contains 142 million images, selected from 1.2 billion source images.

Research Paper Link: https://arxiv.org/pdf/2304.07193.pdf

Key Highlights:

DINOv2 uses a technique called self-supervised learning, which means the model is trained on images without image labels.
The model can be trained without spending significant effort, time, or resources on labeling data.
Because it is trained directly on images, the model can derive meaningful representations of the data, which also lend themselves to visualization.
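To make "training without labels" concrete, here is a minimal sketch of a DINO-style self-distillation objective: a student network is trained to match a teacher network's output distribution on the same image, so no label is involved. This is a simplification under stated assumptions. DINOv2's full recipe combines several objectives, and multi-crop augmentation and the teacher's moving-average update are omitted. The function names are hypothetical, and the temperatures are the commonly cited DINO defaults.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def self_distillation_loss(student_logits, teacher_logits, center,
                           student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the centered, sharpened teacher output and the student output."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


# Toy logits for one image seen by both networks (hypothetical values).
teacher_out = [2.0, 0.0, -1.0]
center = [0.0, 0.0, 0.0]

loss_agree = self_distillation_loss([2.0, 0.0, -1.0], teacher_out, center)
loss_disagree = self_distillation_loss([-1.0, 0.0, 2.0], teacher_out, center)
print(loss_agree < loss_disagree)
```

The loss is small when the student agrees with the teacher and large when it does not, which is the only training signal needed.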

Model Training using FastAI:

For an introduction to FastAI, please follow my article on Medium: https://medium.com/@rnagara1/image-segmentation-and-classification-with-fastai-e3fa58d44a56

How to load DINO v1 into a FastAI Learner?

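from fastai.vision.all import *
import torch
import timm  # fastai uses timm to resolve string architecture names

# Directory where fastai stores artifacts (saved models, exports) for this learner
model_path = Path.home() / "training_artifact" / "dino-v1"
model_path.mkdir(exist_ok=True, parents=True)

# `dls` is assumed to be an existing fastai DataLoaders built for your image dataset
dino_v1 = vision_learner(
    dls=dls,
    arch="vit_base_patch16_224.dino",  # timm ViT-B/16 with DINO v1 weights
    pretrained=True,
    path=str(model_path),
    metrics=[],
)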

How to load DINOv2 into a FastAI Learner?

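# Load the DINOv2 ViT-B/14 backbone; its forward pass returns one embedding per image
dinov2_backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")

# A timm ViT has no `backbone` attribute, so assigning one would leave its forward pass
# (and the DINO v1 weights) unchanged. Chain the backbone with a linear head instead.
# Assumes `dls` is a classification DataLoaders, so `dls.c` is the number of classes,
# and that image sides are multiples of the 14-pixel patch size (e.g. 224).
dinov2_model = nn.Sequential(
    dinov2_backbone,
    nn.Linear(dinov2_backbone.embed_dim, dls.c),
)

dinov2_learner = Learner(
    dls=dls,
    model=dinov2_model,
    path=str(model_path),
    metrics=[],
)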
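The two checkpoint names used in this article imply different patch sizes: `vit_base_patch16_224` splits an image into 16-pixel patches, while `dinov2_vitb14` uses 14-pixel patches, so the DINOv2 reference implementation expects image sides that are multiples of 14. The helpers below are a small sketch with hypothetical names that compute the patch grid and round an arbitrary size down to a valid one.

```python
def patch_grid(image_size, patch_size):
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError(
            f"image size {image_size} is not a multiple of patch size {patch_size}"
        )
    side = image_size // patch_size
    return side, side * side


def nearest_valid_size(image_size, patch_size=14):
    """Round down to the nearest multiple of patch_size, keeping at least one patch."""
    return max(patch_size, (image_size // patch_size) * patch_size)


print(patch_grid(224, 16))      # ViT-B/16 at 224 px: (14, 196)
print(patch_grid(224, 14))      # ViT-B/14 at 224 px: (16, 256)
print(nearest_valid_size(512))  # 504
```

A 224-pixel input works for both models, which is why it is a convenient resize target when building the `DataLoaders`.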

References:

DINOv2: State-of-the-art computer vision models with self-supervised learning (ai.meta.com)

DINOv2: Learning Robust Visual Features without Supervision (arxiv.org): https://arxiv.org/pdf/2304.07193.pdf

fastai - Welcome to fastai (docs.fast.ai)