Assignment 3

CS370: 100 points if Task 1 is completed fully. No development of Task 2 is required.

CS370-Honors and CS-GY-6613: 100 points, split equally between the two tasks. Task 1 may be completed without the tutorial explanation of Faster-RCNN; Task 2 must be completed fully.

Sports Analytics

Multi-Object Tracking (MOT) is a core visual ability that humans possess to perform kinetic tasks and coordinate other tasks. The AI community has recognized the importance of MOT through a series of competitions.

The assignment gives you the opportunity to apply probabilistic reasoning to sports analytics, a sizable market for AI. The object classes are person and ball, and you will demonstrate the ability to reason over time using Kalman filters.

Figure 1: Sports analytics is a growing field that uses data to inform decision making in sports. In this assignment, you will use object tracking and Kalman filters to track soccer players and the ball in a soccer game.

Task 1: Faster RCNN

Process the video using a framework and implement object detection based on Faster-RCNN. You don't need to implement the detector from scratch, but you need to incorporate the source code in your implementation in the form of a notebook that explains all the steps of the Faster-RCNN algorithm. Ensure that all explanations are included in markdown cells and that stages are clearly labeled with headlines. The explanation will be considered complete when visualizations of the output of each block are provided, as shown in the video(s) below.

To demonstrate the object detection stages/blocks, the video frames must be processed (you can show your work using one of the many frames).

Please note that both person and ball (as "sports ball") are classes in the COCO dataset, so you can use a backbone pretrained on COCO, such as the one shown here.
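Because a COCO-pretrained detector reports many classes, the raw output has to be reduced to the two classes of interest before tracking. The snippet below is a minimal sketch of that filtering step, not a prescribed implementation. It assumes the detector output has already been converted to plain Python lists, and that the class ids follow the COCO category numbering used by torchvision's pretrained detection models (1 for person, 37 for sports ball); verify the ids against the model you actually load. The function name `filter_detections` and the score threshold are illustrative choices.

```python
# COCO category ids (assumption: torchvision-style numbering,
# 1 = person, 37 = sports ball).
COCO_PERSON = 1
COCO_SPORTS_BALL = 37
KEEP = {COCO_PERSON: "person", COCO_SPORTS_BALL: "ball"}


def filter_detections(boxes, labels, scores, min_score=0.5):
    """Keep only person/ball detections whose score is at least min_score.

    boxes  : list of [x1, y1, x2, y2] in pixels
    labels : list of integer COCO category ids
    scores : list of floats in [0, 1]
    """
    kept = []
    for box, label, score in zip(boxes, labels, scores):
        if label in KEEP and score >= min_score:
            kept.append({"box": list(box), "cls": KEEP[label], "score": float(score)})
    return kept


# Toy detector output for one frame (made-up numbers, not real model output).
boxes = [[10, 20, 50, 120], [200, 210, 215, 225], [300, 40, 380, 90], [60, 30, 90, 110]]
labels = [1, 37, 3, 1]  # person, sports ball, car, person
scores = [0.98, 0.71, 0.88, 0.30]

detections = filter_detections(boxes, labels, scores)
for d in detections:
    print(d["cls"], d["score"], d["box"])
```

The ball is small and often blurred, so in practice a lower threshold for the ball class than for the person class may be worth trying.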

Figure 2: Faster RCNN - RPN explanation. For the Fast-RCNN backbone network component, which together with the RPN constitutes the whole detector, see this other video.
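When explaining the RPN stage in the notebook, it can help to show how the anchor boxes are laid out over the feature map. The following is a minimal sketch under stated assumptions: three scales and three aspect ratios per location (the nine-anchor configuration described in the Faster-RCNN paper), a feature stride of 16 pixels, and a hypothetical helper name `make_anchors`. Real frameworks generate anchors per feature-pyramid level and with their own conventions, so treat this only as an illustration of the idea.

```python
import math


def make_anchors(feat_h, feat_w, stride, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as [x1, y1, x2, y2], one set per feature-map cell.

    Each anchor has area scale**2 and height/width ratio `ratio`, and is
    centred on the image-space centre of its feature-map cell.
    """
    anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            cx = (j + 0.5) * stride
            cy = (i + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return anchors


anchors = make_anchors(feat_h=2, feat_w=3, stride=16)
print(len(anchors))  # 2 * 3 cells, 9 anchors per cell
print(anchors[4])    # scale 256, ratio 1.0, centred on the first cell
```

Anchors near the image border extend outside the frame (negative coordinates above); the RPN then scores each anchor as object or background and regresses offsets relative to it.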

Figure 3: Fast RCNN - the backbone network component of Faster-RCNN is explained clearly here.
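One block of Fast-RCNN that is worth visualizing is RoI pooling, which converts each proposal, whatever its size, into a fixed-size grid of features. The sketch below works on a single-channel feature map stored as a list of lists and uses integer, inclusive RoI coordinates in feature-map cells; both are simplifying assumptions, and `roi_max_pool` is a hypothetical helper name. Library detectors typically use a multi-channel tensor and may use RoIAlign instead of plain max pooling.

```python
def roi_max_pool(feature, roi, out_h=2, out_w=2):
    """Max-pool the region roi = (x1, y1, x2, y2) of a 2-D feature map.

    Coordinates are feature-map cell indices, inclusive on both ends.
    The region is split into out_h x out_w bins (floor start, ceil end),
    and the maximum of each bin is returned.
    """
    x1, y1, x2, y2 = roi
    roi_h = y2 - y1 + 1
    roi_w = x2 - x1 + 1
    pooled = []
    for i in range(out_h):
        r0 = y1 + (i * roi_h) // out_h                      # floor
        r1 = y1 + ((i + 1) * roi_h + out_h - 1) // out_h    # ceil
        row = []
        for j in range(out_w):
            c0 = x1 + (j * roi_w) // out_w
            c1 = x1 + ((j + 1) * roi_w + out_w - 1) // out_w
            row.append(max(feature[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]
pooled = roi_max_pool(feature, roi=(1, 0, 5, 2))
print(pooled)  # [[10, 12], [16, 18]]
```

Because the output grid has a fixed size, the same fully connected classification and box-regression layers can be applied to every proposal.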

Task 2: Deep-SORT

Read this and this paper to understand the Deep-SORT algorithm. You can also watch the video below that explains the implementation of Deep-SORT.

Figure 4

Draw the architecture of the tracking solution using a diagramming tool that is compatible with GitHub rendering, such as Excalidraw.
Write a summary of the key components of the architecture above, including the equations of the Kalman filter, and explain what the Hungarian algorithm does. For the latter, you may benefit from going through this implementation.
Implement Deep-SORT so that it works with the object detector from the earlier step. You are free to use the code from the video, but you also need to detect and track the soccer ball.
Submit your video with all the bounding boxes of the players and the soccer ball superimposed on the test video below.
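As a starting point for the Kalman filter summary, the predict and update equations can be checked on a toy problem before they are applied to bounding boxes. The sketch below tracks a single coordinate with a constant-velocity model; Deep-SORT applies the same predict/update cycle to a higher-dimensional bounding-box state. The class name `KalmanCV1D`, the noise values `q` and `r`, and the initial variances are illustrative assumptions, and the 2x2 covariance is written out as scalars to keep the example free of dependencies.

```python
class KalmanCV1D:
    """Constant-velocity Kalman filter for one coordinate (state = [position, velocity])."""

    def __init__(self, p0, v0=0.0, var_p=1.0, var_v=100.0, q=0.01, r=1.0, dt=1.0):
        self.p, self.v = float(p0), float(v0)
        # State covariance P stored as four scalars.
        self.P00, self.P01, self.P10, self.P11 = var_p, 0.0, 0.0, var_v
        self.q, self.r, self.dt = q, r, dt

    def predict(self):
        # x' = F x,  P' = F P F^T + Q,  with F = [[1, dt], [0, 1]] and Q = q * I
        dt = self.dt
        self.p += dt * self.v
        P00 = self.P00 + dt * (self.P10 + self.P01) + dt * dt * self.P11 + self.q
        P01 = self.P01 + dt * self.P11
        P10 = self.P10 + dt * self.P11
        P11 = self.P11 + self.q
        self.P00, self.P01, self.P10, self.P11 = P00, P01, P10, P11

    def update(self, z):
        # y = z - H x',  S = H P' H^T + R,  K = P' H^T / S,  with H = [1, 0]
        y = z - self.p
        S = self.P00 + self.r
        K0, K1 = self.P00 / S, self.P10 / S
        # x = x' + K y,  P = (I - K H) P'
        self.p += K0 * y
        self.v += K1 * y
        P00 = (1 - K0) * self.P00
        P01 = (1 - K0) * self.P01
        P10 = self.P10 - K1 * self.P00
        P11 = self.P11 - K1 * self.P01
        self.P00, self.P01, self.P10, self.P11 = P00, P01, P10, P11


# Toy track: an object moving 2 pixels per frame, observed without noise.
kf = KalmanCV1D(p0=0.0)
for k in range(1, 11):
    kf.predict()
    kf.update(2.0 * k)
print(round(kf.p, 2), round(kf.v, 2))
```

The filter starts with zero velocity and a large velocity variance, so the first few measurements pull the velocity estimate toward the true value of 2. In a tracker, `predict` runs every frame for every track, while `update` runs only for tracks that were matched to a detection.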

Figure 5: Test video for submission.

The test video results shown at the beginning of this page were generated using this implementation. You are free to consult this code as well, especially for the superposition part.

All YouTube videos can be downloaded using the pytube library.
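A download helper might look like the sketch below. It assumes the pytube package is installed and that the video offers a progressive MP4 stream (audio and video in one file); `download_video` is a hypothetical helper name, and the URL of the video to fetch is left for you to supply.

```python
def download_video(url, output_dir="."):
    """Download the highest-resolution progressive MP4 stream and return its file path."""
    # Imported lazily so this module still loads where pytube is not installed.
    from pytube import YouTube

    stream = (
        YouTube(url)
        .streams.filter(progressive=True, file_extension="mp4")
        .order_by("resolution")
        .desc()
        .first()
    )
    if stream is None:
        raise RuntimeError("no progressive MP4 stream found for " + url)
    return stream.download(output_path=output_dir)
```

The returned path can then be passed to the video reader of your choice to extract frames for the detector.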