Assignment 3

In this assignment you will learn how Visual Inertial Odometry (VIO) works and apply it to the robot in our maze environment, which has been modified to include detectable objects and/or fiducial markers.

Points:

Camera Calibration: 25 points

VIO Principles: 25 points

Featurization of the maze: 10 points

VIO: 40 points


Visual Inertial Odometry

VIO is a technique used in robotics and computer vision to estimate the position and orientation of a camera or sensor in 3D space by combining visual information from images with inertial measurements from sensors like Inertial Measurement Units (IMUs). This approach is particularly useful in scenarios where GPS signals are weak or unavailable, such as indoors or in urban environments.
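To make the idea concrete, here is a minimal, illustrative sketch (not the assignment pipeline): the IMU is integrated to propagate the pose at high rate, and each absolute pose estimate from the camera pulls the drifting dead-reckoned estimate back toward it. The function names and the constant blend factor `alpha` are assumptions for illustration; a real VIO system would use a proper filter (e.g., an EKF) on SO(3) rather than a fixed linear blend.

```python
# Illustrative sketch of the VIO idea: IMU dead reckoning corrected by
# visual pose estimates. All names and the blend factor are assumptions.
import numpy as np

def propagate_imu(p, v, R, accel, gyro, dt, g=np.array([0.0, 0.0, -9.81])):
    """Dead-reckon position p, velocity v, rotation R from one IMU sample."""
    # Rotate the body-frame specific force into the world frame, add gravity.
    a_world = R @ accel + g
    p = p + v * dt + 0.5 * a_world * dt**2
    v = v + a_world * dt
    # First-order update of the rotation matrix from the gyro rate.
    wx, wy, wz = gyro * dt
    dR = np.array([[1.0, -wz, wy], [wz, 1.0, -wx], [-wy, wx, 1.0]])
    R = R @ dR
    return p, v, R

def fuse_visual(p, R, p_vis, R_vis, alpha=0.1):
    """Pull the drifting IMU estimate toward the camera's pose estimate."""
    p = (1 - alpha) * p + alpha * p_vis
    # Naive blend of rotation matrices; a real filter would work on SO(3).
    R = (1 - alpha) * R + alpha * R_vis
    # Re-orthonormalize via SVD so R stays a valid rotation matrix.
    U, _, Vt = np.linalg.svd(R)
    return p, U @ Vt
```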

Camera Calibration

To implement VIO, the robot needs a camera sensor, and that camera needs to be calibrated. Obviously the simulated camera does not need to be calibrated, but we still want you to understand the process, so we will use a real camera, such as your laptop's camera, and ask you to calibrate it.

The calibration process typically involves capturing multiple images of a known calibration pattern, such as a checkerboard, from different angles and distances. The images are then processed to extract feature points, and the camera parameters are estimated using optimization techniques. The result is an estimate of the camera's intrinsic parameters: the focal length, the optical center, and the lens distortion coefficients. These parameters can then be used to correct lens distortion and to accurately map 3D points in the world to 2D image coordinates.
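As a concrete reference, the sketch below shows the standard OpenCV calibration flow. The board dimensions (9x6 inner corners), the 25 mm square size, and the `calib_*.png` file pattern are assumptions; adapt them to your own printout and images.

```python
# A sketch of checkerboard calibration with OpenCV. Board size, square
# size, and file names are assumptions to adapt to your setup.
import glob
import cv2
import numpy as np

pattern = (9, 6)   # inner corners per row and column (assumption)
square = 0.025     # square size in meters (assumption)

# 3D coordinates of the corners in the board's own frame (z = 0 plane).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points = [], []
for fname in glob.glob("calib_*.png"):
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        # Refine corner locations to sub-pixel accuracy.
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 intrinsic matrix; dist holds the distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Intrinsics:\n", K)
print("Distortion coefficients:", dist.ravel())
```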

Study sections 13.1 and 13.2 of Peter Corke’s textbook “Robotics, Vision and Control” to understand the camera calibration process. Print out the checkerboard, attach it to a piece of cardboard, and take images of it from different angles and distances as demonstrated in Fig. 13.11. You can use the book’s GitHub repo and the underlying code of the chapter 13 notebook to do so. We want to see your own images rather than the images included with the book (which come from OpenCV).

In practice the intrinsic parameters will be programmed into ROS2 via the camera_info topic. Check whether these are set to non-ideal values; in simulation they are typically set to:

camera_matrix:
  rows: 3
  cols: 3
  data: [fx, 0, cx, 0, fy, cy, 0, 0, 1]
distortion_model: "plumb_bob"
distortion_coefficients:
  data: [0, 0, 0, 0, 0]  # no distortion in sim
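To inspect what your robot's camera actually publishes, a small rclpy node like the sketch below can echo the intrinsics. The topic name `/camera/camera_info` is an assumption; substitute the one used by your robot.

```python
# A minimal rclpy sketch that prints the intrinsics published on a
# camera_info topic. The topic name is an assumption.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import CameraInfo

class IntrinsicsEcho(Node):
    def __init__(self):
        super().__init__("intrinsics_echo")
        self.create_subscription(
            CameraInfo, "/camera/camera_info", self.callback, 10)

    def callback(self, msg):
        # msg.k is the row-major 3x3 camera matrix; msg.d the distortion
        # coefficients for msg.distortion_model.
        self.get_logger().info(f"K = {list(msg.k)}")
        self.get_logger().info(f"D = {list(msg.d)} ({msg.distortion_model})")

def main():
    rclpy.init()
    rclpy.spin(IntrinsicsEcho())

if __name__ == "__main__":
    main()
```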

Apply Camera Calibration

In the maze environment, place objects on tables (you can reuse the objects from the previous assignment) and use the camera to estimate the relative pose of the objects. You can use the book’s functions in section 13.2.1, which wrap OpenCV’s solvePnP function, to estimate the pose of an object given its 3D coordinates in the world and the corresponding 2D image coordinates.
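A sketch of the direct OpenCV route is shown below. The cube side length, the intrinsic matrix, and the pixel coordinates are placeholders standing in for your calibration results and your own corner detections.

```python
# A sketch of object pose estimation with cv2.solvePnP, assuming a cube of
# known side length and matched 2D image corners. All numbers are placeholders.
import cv2
import numpy as np

s = 0.05  # cube side length in meters (assumption)
# Four corners of the cube's top face in the object's own frame.
object_pts = np.array(
    [[0, 0, s], [s, 0, s], [s, s, s], [0, s, s]], dtype=np.float32)
# Matching pixel coordinates of those corners in the image (placeholders).
image_pts = np.array(
    [[310, 240], [400, 235], [405, 320], [315, 330]], dtype=np.float32)

K = np.array([[600.0, 0, 320],   # intrinsics from your calibration
              [0, 600.0, 240],
              [0, 0, 1.0]])
dist = np.zeros(5)               # zero distortion in the simulator

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)       # rotation of the object in the camera frame
print("object position in camera frame:", tvec.ravel())
```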

You will need the geometry of the objects you created, such as a cube, to achieve this task. Alternatively, feel free to place fiducial markers on the objects and use the ArUco library to estimate their poses.
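If you go the fiducial route, here is a sketch of marker-based pose estimation with OpenCV's ArUco module, using the ArucoDetector API of OpenCV 4.7 and later. The dictionary, marker size, intrinsics, and image file name are assumptions.

```python
# A sketch of marker pose estimation with OpenCV's ArUco module.
# Marker size, dictionary, K, dist, and the image file are assumptions.
import cv2
import numpy as np

marker_len = 0.04  # marker side length in meters (assumption)
K = np.array([[600.0, 0, 320], [0, 600.0, 240], [0, 0, 1.0]])
dist = np.zeros(5)

detector = cv2.aruco.ArucoDetector(
    cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50))
frame = cv2.imread("frame.png")  # an image from your camera (placeholder)
corners, ids, _rejected = detector.detectMarkers(frame)

if ids is not None:
    # Treat each marker as a small planar object and solve PnP per marker.
    # Corner order matches ArUco: top-left, top-right, bottom-right, bottom-left.
    obj = np.array([[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]],
                   dtype=np.float32) * marker_len / 2
    for c, marker_id in zip(corners, ids.ravel()):
        ok, rvec, tvec = cv2.solvePnP(obj, c.reshape(4, 2), K, dist)
        print(f"marker {marker_id}: position in camera frame = {tvec.ravel()}")
```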

Establish a correspondence between a world coordinate system and the fixed objects in the scene, then use the relative pose estimates to compute the pose of the camera in the world coordinate system. The camera pose is represented by a rotation matrix and a translation vector, i.e., the extrinsic parameters of the camera.
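Concretely, if T_WO denotes the known pose of a fixed object in the world frame (you placed it, so you know it) and solvePnP gives the object's pose in the camera frame T_CO, then the camera's world pose is T_WC = T_WO · T_CO⁻¹. A minimal sketch with placeholder values:

```python
# A sketch of recovering the camera's world pose from a known object pose.
# All numeric values are illustrative placeholders.
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t).ravel()
    return T

# Object pose in the world frame (placeholder: where you placed the object).
T_WO = to_homogeneous(np.eye(3), [1.0, 2.0, 0.5])

# Object pose in the camera frame, e.g. from cv2.solvePnP (placeholders).
R_CO = np.eye(3)
t_CO = [0.0, 0.0, 0.8]
T_CO = to_homogeneous(R_CO, t_CO)

# Camera pose in the world frame: T_WC = T_WO @ inv(T_CO).
T_WC = T_WO @ np.linalg.inv(T_CO)
print("camera position in world:", T_WC[:3, 3])
print("camera rotation in world:\n", T_WC[:3, :3])
```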

Source: Visual Inertial Odometry