Syllabus
Course site
Your course site is available from the drop-down menu of this landing page, which you should bookmark: https://pantelis.github.io/
Books
THRUN - Probabilistic Robotics, by Sebastian Thrun, Wolfram Burgard, and Dieter Fox, 2005. This book is the bible that many current roboticists grew up with - especially those who started the self-driving car revolution. It is required, but it is not free to download.
LYNCH - Modern Robotics: Mechanics, Planning, and Control, by Kevin M. Lynch and Frank C. Park, 2017. This book is oriented towards robotic manipulation but contains foundational material on general motion algebra. It is free to download.
CORKE - Robotics, Vision and Control: Fundamental Algorithms in Python, by Peter Corke, 3rd edition, 2023. This is a hands-on book that complements THRUN and LYNCH with great illustrations produced by implementations of various robotics algorithms. This book is not required and it is not free to download. See also this repo for the code.
After completing this course, students will be able to
- Design the various subsystems involved in robotic agents with egomotion.
- Design and implement perception using sensor fusion, combining computer vision with other sensing streams (such as LIDAR) available to the robot designer.
- Implement planning algorithms for long-term tasks such as path planning and for short-term tasks such as motion or trajectory planning.
- Teach robots a control policy using just a world-model simulator and transfer the simulated policy to the real world.
- Instruct robots with natural language.
- Program all of the above in an industry-standard setting built around the ROS2 framework.
Planned Schedule
Please note that the Fall 2025 version of the course emphasizes mobile robots rather than manipulation. As such, we cover the following topics:
- Part I: Robotic Perception (Lectures 1-5)
- Part II: Localization and Mapping (Lectures 6-7)
- Part III: Task, Global and Local Planning (Lectures 8-9)
- Part IV: Reinforcement Learning, Instruction following (Lectures 10-12)
Lecture | Content |
---|---|
Lecture 1 | We start with an introduction to AI and robotics from a systems perspective, with emphasis on the autonomous vehicles application domain. We explain the various systems that need to be engineered to allow safe and efficient self-driving and review prerequisites on programming (Python) as well as linear algebra and probability theory. With the help of the TAs and tutorial videos, we also ensure that students have set up the programming environment necessary for the projects and assignments of the course. Reading: Course site. |
Lecture 2 | The perception subsystem is the first stage of the processing chain in robotics. It processes and fuses multi-modal sensory inputs and is implemented using deep neural networks. Our focus here is to understand how prediction can be engineered by taking the maximum likelihood optimization principle and its associated cross-entropy loss function and applying it to neural networks - in this lecture focusing on fully connected and subsequently convolutional architectures for supervised regression and classification tasks (a minimal cross-entropy sketch appears after this table). Reading: Course site. |
Lecture 3 | Naturally, the two tasks of classification and regression will be combined to form object detectors. Initially, we cover MaskRCNN and YOLO as representatives of detection architectures and introduce the engineering aspects of building object detection models with pre-trained feature extractors and using transfer learning for making predictions outside the scope of the pretrained models - a typical setting in robotic applications. Reading: Course site. |
Lecture 4 | In this lecture, we expand on object detection and provide the agent with the additional capabilities of semantic and instance segmentation that are essential for completing planning tasks in complex scenes. Reading: Course site. |
Lecture 5 | This lecture introduces probabilistic models that process the perceptive predictions over time and allow the agent to track and update its time-varying belief about the state of the environment. This is achieved with recursive state estimation algorithms acting on a problem setting represented with dynamic Bayesian networks. The lecture introduces Bayesian filters in discrete and continuous state spaces (Kalman filters); a minimal Kalman filter sketch appears after this table. All robots employ such filters. Reading: THRUN Chapters 2, 3. |
Lecture 6 | In Part I, we built a static agent that can track moving objects - we now turn to agents that move themselves (egomotion). We present well-established probabilistic motion models, such as the velocity and odometry models, and introduce localization as the problem of estimating the agent’s pose in the environment. We evaluate various algorithms that can solve such an estimation problem. Reading: THRUN Chapters 5, 7. CORKE Chapter 4. |
Lecture 7 | Up to this point, the mobile agent faced a localization task assuming knowledge of the environment’s map. We now turn to agents that can perform Simultaneous Localization and Mapping (SLAM). Because of the plethora of SLAM methods, we focus here on one that brings together Parts I and II: Visual SLAM, demonstrating how even monocular cameras can be used to move the agent safely in a dynamic environment. Reading: THRUN Chapters 9, 10. CORKE Chapter 6. |
Lecture 8 | After the successive enhancements of Parts I and II, the agent now has a clear view of the environment state, such as what and where the surrounding objects are, is able to track them as they move, and knows its location as it moves. In this lecture, we introduce the agent’s short- and longer-term goals and show how planning under uncertainty produces the best sequence of actions to reach the goal state. Although our final destination is optimal action-taking in dynamic stochastic environments, we start with global planning under the assumption that the goal state is feasible and the environment is known and deterministic. In this case, we can use search algorithms such as A*, D*, RRT*, and PRM (a minimal A* sketch appears after this table). Readings: Course site. |
Lecture 9 | We now relax our assumptions: the utility of the agent depends on a sequence of decisions, and the stochastic environment offers a feedback signal to the agent called the reward. We review how the agent’s policy, the sequence of actions, can be calculated when it fully observes its current state (MDP) and when it can only partially do so (POMDP). The discussion here is on fundamentals and works through the Bellman expectation backup and optimality equations, following David Silver’s (DeepMind) lecture series at UCL; a minimal value iteration sketch appears after this table. Readings: Course site. |
Lecture 10 | The prediction and control algorithms that approximate the optimal MDP policies using either a world model or a model-free approach are known as Reinforcement Learning (RL). This lecture establishes the connection between MDP and RL and dives into deep learning, focusing on Deep RL algorithms applicable in robotics - mostly model-free methods. Reading: Course site |
Lecture 11 | Instruction following requires the robot to collaborate with human operators, either through natural language or from demonstrations. Here we review the neural architectures that model language and multimodal reasoning via Vision Language Models (VLMs). |
Lecture 12 | The concluding lecture builds on VLMs and introduces Vision Language Action (VLA) models: end-to-end trainable models that parse instructions, perceive the environment, and act based on both instruction and perception. |
Lecture 13 | Review - last lecture before the final exam. |
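Companion sketches

The short sketches below accompany individual lectures in the schedule above. They are minimal illustrations, not course-provided code; all function names, array shapes, and parameter values in them are assumptions made for the examples.

For Lecture 2, this sketch shows the link between maximum likelihood and the cross-entropy loss: the average negative log-likelihood of the true labels under a softmax classifier is exactly the cross-entropy loss minimized during training.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Average negative log-likelihood of the true class under the softmax model:
    # this is the cross-entropy loss minimized when training a classifier by
    # maximum likelihood.
    probs = softmax(logits)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels]).mean()

# Illustrative batch: 3 examples, 4 classes (made-up numbers).
logits = np.array([[2.0, 0.5, -1.0, 0.0],
                   [0.1, 3.0,  0.2, 0.0],
                   [1.0, 1.0,  1.0, 1.0]])
labels = np.array([0, 1, 3])
print(cross_entropy(logits, labels))
```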
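For Lecture 5's recursive state estimation, here is a minimal one-dimensional Kalman filter: the state is a scalar position, the controls are velocity commands, and the measurements observe the position directly. The noise variances `q` and `r` are illustrative assumptions.

```python
def kalman_1d(z_measurements, u_controls, q=0.1, r=0.5):
    """Minimal 1D Kalman filter: scalar position state, velocity controls,
    direct position measurements. q and r are illustrative process and
    measurement noise variances."""
    mu, sigma2 = 0.0, 1.0            # initial belief: mean and variance
    beliefs = []
    for u, z in zip(u_controls, z_measurements):
        # Predict: push the belief through the motion model x' = x + u.
        mu_bar = mu + u
        sigma2_bar = sigma2 + q
        # Update: fuse the predicted belief with the measurement z.
        k = sigma2_bar / (sigma2_bar + r)    # Kalman gain
        mu = mu_bar + k * (z - mu_bar)
        sigma2 = (1 - k) * sigma2_bar
        beliefs.append((mu, sigma2))
    return beliefs

# Three unit velocity commands and three noisy position readings (made-up numbers).
print(kalman_1d([1.1, 2.0, 2.9], [1.0, 1.0, 1.0]))
```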
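For the global planning algorithms of Lecture 8, here is a minimal A* search on a 4-connected occupancy grid with the Manhattan distance as an admissible heuristic; the grid, start, and goal are made up for illustration.

```python
import heapq

def astar(grid, start, goal):
    """Minimal A* on a 4-connected occupancy grid (0 = free, 1 = obstacle),
    using the Manhattan distance as the heuristic."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan-distance heuristic to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {}
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in best_g and best_g[node] <= g:
            continue                              # already expanded with a better cost
        best_g[node] = g
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nxt = (nr, nc)
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None                                   # no path exists

# Illustrative 3x3 grid with a wall across most of the middle row.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
```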
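For the Bellman optimality equation of Lecture 9, here is a minimal value iteration sketch on a toy, fully observed MDP (two states, two actions); the transition and reward arrays are illustrative assumptions.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Minimal value iteration using the Bellman optimality backup
    V(s) <- max_a [ R[a, s] + gamma * sum_s' P[a, s, s'] * V(s') ].
    P has shape (A, S, S) and R has shape (A, S)."""
    n_states = P.shape[1]
    V = np.zeros(n_states)
    while True:
        # Q[a, s]: expected return of taking action a in state s, then acting optimally.
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)   # optimal values and a greedy policy
        V = V_new

# Toy 2-state, 2-action MDP with made-up dynamics and rewards.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],     # transitions under action 0
              [[0.1, 0.9], [0.8, 0.2]]])    # transitions under action 1
R = np.array([[0.0, 1.0],                   # reward of action 0 in each state
              [0.5, 0.0]])                  # reward of action 1 in each state
V, policy = value_iteration(P, R)
print(V, policy)
```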