Assignment 1

In this assignment you will set up your system and refresh basic probability theory or linear algebra concepts such as Singular Value Decomposition (SVD). You must use libraries in the PyTorch namespace, such as torch.linalg and torch.rand, and in general anything under torch.xyz, but not derived or other third-party libraries. The idea is to implement the following from scratch, without reimplementing every low-level component such as random number generators.
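As a quick refresher on the SVD mentioned above, the following is a minimal sketch that stays inside the torch namespace. The 4x3 matrix size and the fixed seed are arbitrary choices for illustration, not part of the assignment.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable (illustrative choice)

# A random 4x3 matrix with entries drawn uniformly from [0, 1).
A = torch.rand(4, 3)

# Reduced SVD: A = U @ diag(S) @ Vh, with U of shape 4x3, S of length 3, Vh of shape 3x3.
U, S, Vh = torch.linalg.svd(A, full_matrices=False)

# Rebuild A from its factors and measure the reconstruction error.
A_rebuilt = U @ torch.diag(S) @ Vh
max_error = (A - A_rebuilt).abs().max().item()

print(f"singular values: {S.tolist()}")
print(f"max reconstruction error: {max_error:.2e}")
```

The singular values come back sorted in descending order, and the reconstruction error should be on the order of float32 rounding error.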

Points:


Development Environment Setup

Ubuntu and macOS users

Install Docker on your system, along with the VSCode Docker and Remote extensions.

Windows users

  1. Install WSL2.

  2. Follow this tutorial to set up VSCode properly, so that VSCode can access the WSL2 filesystem and work with remote Docker containers.

  3. If you have an NVIDIA GPU in your system, ensure you have enabled it.
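
Once the container and virtual environment are running, one way to confirm that the GPU is visible to PyTorch is a check along these lines. This is only a sketch: it assumes a CUDA-enabled PyTorch build inside the container and falls back to the CPU otherwise.

```python
import torch

# True only when PyTorch was built with CUDA support and a GPU is visible.
gpu_available = torch.cuda.is_available()
device = torch.device("cuda" if gpu_available else "cpu")

if gpu_available:
    print(f"GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU visible; falling back to the CPU.")

# Allocate a small tensor on the chosen device as a smoke test.
x = torch.ones(2, 2, device=device)
print(x.device)
```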

All Users

Follow the instructions on the course site for the course Docker container:

  1. Install Docker on your machine.
  2. Clone the repo (Windows users: make sure you clone it on the WSL2 filesystem). Show this with a screenshot below of the terminal where you cloned the repo.
  3. Build and launch the Docker container inside your IDE of choice (if you haven't used an IDE before, you can start with VSCode).
  4. Set up the virtual environment with rye sync inside the container, then show a screenshot of your IDE and the terminal with the (your virtual env) prefix.
  5. Select the kernel of your virtual environment (.venv folder) and execute the following code. Save the output of all cells of this notebook before submitting.

Getting to know the Torchvision library

Numerical computations must be very efficient in any real-time system, such as a computer vision pipeline. PyTorch is a popular deep learning library that provides a powerful tensor library for numerical computation. The aim here is to learn and demonstrate the basic operations of PyTorch tensors.
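
To give a flavor of what "basic operations" means here, the sketch below runs a few common tensor operations on a small 2x2 example. The values are arbitrary and chosen only so the results are easy to check by hand.

```python
import torch

# Create tensors from Python data and from a factory function.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)

elementwise_sum = a + b      # element-wise addition
matrix_product = a @ b       # matrix multiplication
row_sums = a.sum(dim=1)      # reduce across columns, one value per row
first_column = a[:, 0]       # slicing selects the first column
flattened = a.reshape(-1)    # reshape to a 1-D tensor

# Broadcasting: the length-2 vector is stretched across both rows of a.
scaled = a * torch.tensor([10.0, 100.0])

print(matrix_product)
print(scaled)
```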

Learn the basics

Use this notebook to learn the basics of PyTorch tensors, following along with the video. You can also use this excellent book chapter.

Torchvision

Torchvision is a package that provides popular datasets, model architectures, and common image transformations for computer vision. It is a part of the PyTorch project and is widely used in the deep learning community for tasks such as image classification, object detection, and segmentation. Here we will touch upon the basics of Torchvision.

Transforming image tensors

Process the video. Note that you can sample the video to generate 1000 images total. You can use yt-dlp or other tools for downloading the video.
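
One way to pick which frames to keep is to spread the 1000 samples evenly over the video. The sketch below only computes the frame indices; decoding the frames themselves is left to whichever video tool you choose. The helper name sample_frame_indices and the frame count of 54,000 are hypothetical, used purely for illustration.

```python
import torch


def sample_frame_indices(total_frames: int, num_samples: int = 1000) -> torch.Tensor:
    """Return evenly spaced frame indices covering the whole video."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    # Evenly spaced positions from the first to the last frame, rounded to integers.
    positions = torch.linspace(
        0, total_frames - 1, steps=num_samples, dtype=torch.float64
    )
    return positions.round().long()


# Hypothetical video length: 54,000 frames (e.g. 30 minutes at 30 fps).
indices = sample_frame_indices(total_frames=54_000, num_samples=1000)
print(indices[:5], indices[-1])
```

Even spacing is only one option; sampling at a fixed time interval or at scene changes would be equally valid for this task.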

Create a custom Torchvision dataset that loads the images and applies transformations.

Understand the transformations and why we need to apply them using this image as an example and the Transforms V2 page.

Resize each image to 224x224 and normalize it using the mean and standard deviation of the dataset's images.

Visualizing the dataset

Using the FiftyOne library documentation, load the dataset into FiftyOne's internal MongoDB as a FiftyOne dataset and visualize the Torchvision dataset with the FiftyOne library. Note that the FiftyOne app UI can be launched in a notebook cell.
