Deep Learning for Computer Vision

Instructed by Pantelis Monogioudis, Ph.D., and staff.

What this course is all about

This course projects the vast field of statistical learning with differentiable deep neural architectures onto the computer vision application space. Beginning with the fundamentals of computer vision, it offers extensive coverage of essential topics such as object detection and semantic segmentation using Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). The course then builds on these fundamentals and treats computer vision in multimodal and generative settings, enabling applications such as image captioning, visual question answering, and scene generation with state-of-the-art models like Neural Radiance Fields (NeRF) and diffusion models. By the end of the course, students are well equipped to design and deploy vision systems capable of complex tasks, from tracking and identifying objects in video streams to generating interactive responses based on visual prompts.

Logistics

NJIT Spring 2025: Tuesdays 6:00pm - 8:45pm, Jersey City Campus - 1J2

Communication

We use Discord for all communication and questions related to lectures and projects. Info has already been sent to you via Canvas and Brightspace. Please install Discord on your smartphones as well.

Office Hours

Professor office hours are coordinated via Discord. The process is simple: send the professor a direct message and arrange a 30-minute slot. After we agree on the slot, please send me a Google Calendar invitation that includes Google Meet info (no Zoom, please). Please include in your invitation the questions or issues you face so that we can have a productive meeting.

Grading

Midterm (15%)

Final (25%)

Project (30%)

Assignments (30%)

Staff (TAs)

TBP

Support

Support is handled through a Discord ticketing system. Open a ticket to address a grading issue or any other issue that requires a staff or professor response. All tickets are private between the student and the staff/professor.
