Engineering AI Agents
  • BOOK
    • Foundations
    • Training Deep Networks
    • Perception
    • Kinematics
    • State Estimation
    • Large Language Models
    • Multimodal Reasoning
    • Task Planning
    • Global Planning
    • Local Planning
    • Markov Decision Processes
    • Reinforcement Learning
    • VLA Agents
  • COURSES
    • Introduction to AI
    • AI for Robotics
    • Deep Learning for Computer Vision
    • DATA MINING - BEING PORTED
  • VIDEOS
    • Statistical Learning Theory
    • AI for Robotics
    • Deep Learning for Computer Vision
    • DATA MINING - BEING PORTED
  • ABOUT ME

Welcome!

Learn the concepts and engineer AI agents with real-time perception and language-understanding abilities.

  • Use Jupyter notebooks to learn the concepts from scratch.
  • Simulate AI agents with egomotion using the Robot Operating System (ROS 2).
  • Build real-time 2D perception pipelines.
  • Use AIOps tools to engineer data pipelines that create large training datasets for LLMs & LVMs.

Start Here

  • Foundations
  • 2d-perception
  • LLMs
  • Logics
  • Task Planning
  • Kinematics
  • MDP
  • RL

The Learning Problem
Statistical Learning Theory

The supervised learning problem statement.



Introduction to Scene Understanding
In the previous chapters we have treated the perception subsystem mainly from first principles, starting with the principles that govern supervised learning and moving to the deep learning…
 
Mask R-CNN Semantic Segmentation
The semantic segmentation approach described in this section is based on the Mask R-CNN paper. Mask R-CNN is an extension of Faster R-CNN that adds a mask head to the detector. The mask…
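As a rough sketch of what Mask R-CNN inference involves (using torchvision's pre-trained COCO model, not this section's exact code):

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Load a Mask R-CNN pre-trained on COCO; eval() switches off the training heads.
model = maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# A dummy 3-channel image with values in [0, 1]; replace with a real image tensor.
image = torch.rand(3, 480, 640)

with torch.no_grad():
    # One dict per image: boxes/labels/scores from the Faster R-CNN detector,
    # plus per-instance masks from the added mask head.
    (output,) = model([image])

print(output["boxes"].shape, output["masks"].shape)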

Detectron2 Beginner’s Tutorial
Welcome to detectron2! This is the official colab tutorial of detectron2. Here, we will go through some basic usage of detectron2, including the following: * Run inference…

Mask R-CNN: A detailed guide with Detectron2
Welcome to the Mask R-CNN with Detectron2 tutorial!

MaskRCNN Inference
import torch
from torchvision import datasets, transforms, models, ops, io
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
from torchvision.models.detec…

Mask R-CNN Demo
A quick intro to using the pre-trained model to detect and segment objects.

Mask R-CNN - Inspect Training Data
Inspect and visualize data loading and pre-processing code.

Mask R-CNN - Inspect Trained Model
Code and visualizations to test, debug, and evaluate the Mask R-CNN model.

Mask R-CNN - Inspect Weights of a Trained Model
This notebook includes code and visualizations to test, debug, and evaluate the Mask R-CNN model.
 
UNet Semantic Segmentation

Finetuning Language Models for Text Classification - Patent Dataset
This notebook was submitted by NYU student Sky Achitoff
Pantelis Monogioudis


Natural Language Processing
“You shall know a word by the company it keeps” (J. R. Firth 1957: 11) - many modern discoveries are in fact rediscoveries from other works sometimes decades old. NLP is…
 
Language Models Workshop
The following notebook is a from-scratch attempt at character-level language modeling. It's instructive for you to go through it first and then go through the corresponding…
 
CNN Language Model
The following was developed by Harini Appansrinivasan, NYU as part of an assignment submission.


Language Models
These notes borrow heavily from the CS224N set of notes on Language Models.
 
LLM Inference
  1. NVIDIA’s Guide
  2. Hugging Face TGI

LSTM Language Model from scratch
This notebook was borrowed from Christina Kouridis’ github. The notation is different from the notation used in the LSTM section of the notes and will be changed in a next…


RNN Language Models
When we focus on making predictions based on a fixed window of context (i.e. the n previous words), in some cases, the window may not be sufficient to capture the…
 
Example of an RNN Language Model
Our aim is to predict the next character given a set of previous characters from our data string. For our RNN implementation, we would take a sequence of length 25…
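A minimal numpy sketch of the recurrence behind such a model, with a toy vocabulary and made-up sizes (not the notebook's code):

import numpy as np

vocab = sorted(set("hello world"))   # toy character vocabulary
V, H = len(vocab), 16                # vocabulary and hidden sizes (illustrative)
Wxh = np.random.randn(H, V) * 0.01   # input-to-hidden weights
Whh = np.random.randn(H, H) * 0.01   # hidden-to-hidden (recurrent) weights
Why = np.random.randn(V, H) * 0.01   # hidden-to-output weights
h = np.zeros((H, 1))                 # initial hidden state

def step(ch, h):
    x = np.zeros((V, 1))
    x[vocab.index(ch)] = 1             # one-hot encode the current character
    h = np.tanh(Wxh @ x + Whh @ h)     # recurrence mixes input and previous state
    y = Why @ h
    p = np.exp(y) / np.sum(np.exp(y))  # softmax: P(next character | history)
    return p, h

p, h = step("h", h)
print(vocab[int(np.argmax(p))])      # most likely next character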


Introduction to NLP Pipelines
In this chapter, we will introduce the topic of processing language in general with neural architectures. This includes natural language, code, etc.

Text Tokenization
In earlier chapters we have limited the discussion to tokenizers that either produce a list of words or a list of characters. It's very important, though, to understand the…
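A small sketch contrasting tokenization granularities on the same text (the Hugging Face lines are an assumed, optional setup, shown as comments):

text = "unbelievable tokenization"

word_tokens = text.split()   # word-level: short sequences, huge vocabulary
char_tokens = list(text)     # character-level: tiny vocabulary, long sequences

# Subword tokenizers (BPE, WordPiece) sit in between; with the transformers
# package installed, something like this shows the effect:
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# print(tok.tokenize(text))

print(word_tokens)
print(char_tokens)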


Word2Vec Embeddings
In the so-called classical NLP, words were treated as atomic symbols, e.g. hotel, conference, walk, and they were represented with one-hot encoded (sparse) vectors e.g.
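A quick sketch of why one-hot vectors carry no similarity information, with invented numbers for the dense embeddings:

import numpy as np

# One-hot vectors for a toy vocabulary: every pair is orthogonal, so
# "hotel" and "motel" look exactly as unrelated as "hotel" and "walk".
vocab = ["hotel", "motel", "walk"]
one_hot = np.eye(len(vocab))
print(one_hot[0] @ one_hot[1])   # 0.0, despite the similar meaning

# Learned dense embeddings place related words near each other instead;
# the vectors below are made up purely for illustration.
emb = {"hotel": np.array([0.9, 0.1]), "motel": np.array([0.8, 0.2])}
cos = emb["hotel"] @ emb["motel"] / (
    np.linalg.norm(emb["hotel"]) * np.linalg.norm(emb["motel"]))
print(cos)                       # close to 1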
 
Word2Vec from scratch
This self-contained implementation is instructive and you should go through it to understand the word2vec embedding.
 
Word2Vec Tensorflow Tutorial
word2vec is not a singular algorithm; rather, it is a family of model architectures and optimizations that can be used to learn word embeddings from large datasets.…


RNN-based Neural Machine Translation
These notes borrow heavily from the CS224N 2019 set of notes on NMT.

The BLEU Score
In 2002, IBM researchers developed the BiLingual Evaluation Understudy (BLEU) metric that remains to this day, with its many variants, one of the most quoted metrics for…
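A bare-bones sketch of the idea (unigram and bigram precision with a brevity penalty; real BLEU uses up to 4-grams and multiple references):

from collections import Counter
import math

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
        overlap = sum((cand & ref).values())     # clipped n-gram matches
        precisions.append(max(overlap, 1e-9) / max(sum(cand.values()), 1))
    # brevity penalty: punish candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(bleu("the cat is on the mat".split(), "the cat sat on the mat".split()))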

Attention in RNN-based NMT
When you hear the sentence “the soccer ball is on the field,” you don’t assign the same importance to all 7 words. You primarily take note of the words “ball,” “on,” and “field…
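A minimal numpy sketch of dot-product attention over the 7 encoder states of that sentence (sizes and values are illustrative):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

T, d = 7, 4                              # 7 source words, hidden size 4
encoder_states = np.random.randn(T, d)   # one vector per source word
decoder_state = np.random.randn(d)       # current decoder state (the query)

scores = encoder_states @ decoder_state  # one alignment score per source word
weights = softmax(scores)                # attention distribution over the source
context = weights @ encoder_states       # weighted sum fed to the decoder

print(weights.round(2), context.shape)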
 
Character-level recurrent sequence-to-sequence model
Author: fchollet
Date created: 2017/09/29
Last modified: 2020/04/26
Description: Character-level recurrent sequence-to-sequence model.


Automated Reasoning
In an earlier chapter, where we introduced a dynamical system governing the state evolution of the environment, we saw that a state is composed of variables and such fact…
 
Logical Reasoning
 
Logical Agents
In this chapter we see how agents equipped with the ability to represent internally the state of the environment and reason about the effectiveness of possible actions using…


Logical Inference
The wumpus world, despite its triviality, contains some deeper abstractions that are worth summarizing.


World Models
For each problem we can define a number of world models each representing every possible state (configuration) of the environment that the agent may be in.
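A tiny sketch of entailment by model enumeration: a KB entails a query iff the query holds in every world model that satisfies the KB (the symbols and sentences below are made-up stand-ins):

from itertools import product

symbols = ["P1", "P2"]
KB = lambda m: (m["P1"] or m["P2"]) and not m["P1"]   # KB: (P1 v P2) ^ ~P1
alpha = lambda m: m["P2"]                             # query: P2

def entails(KB, alpha, symbols):
    # enumerate every possible world model (truth assignment)
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if KB(model) and not alpha(model):
            return False   # found a model of KB in which alpha is false
    return True

print(entails(KB, alpha, symbols))   # True: the KB entails P2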

Task Planning
Task planning, at least in the approach presented here, combines logic and search.

Specifying the engine name
The unified_planning.plot package provides useful functions to visually plot many objects.

The Unified Planning Library
In this demo we will scratch the surface of the UP: we will set it up, manually create the blocksworld domain using the UP API, and create a problem using a bit of Python…
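A minimal sketch of that workflow on a simpler robot-movement domain, assuming the unified-planning package and at least one installed engine (e.g. pyperplan):

from unified_planning.shortcuts import *

Location = UserType("Location")
robot_at = Fluent("robot_at", BoolType(), position=Location)

move = InstantaneousAction("move", l_from=Location, l_to=Location)
l_from, l_to = move.parameter("l_from"), move.parameter("l_to")
move.add_precondition(robot_at(l_from))
move.add_effect(robot_at(l_from), False)
move.add_effect(robot_at(l_to), True)

problem = Problem("robot")
problem.add_fluent(robot_at, default_initial_value=False)
problem.add_action(move)
locations = [Object(f"l{i}", Location) for i in range(3)]
problem.add_objects(locations)
problem.set_initial_value(robot_at(locations[0]), True)
problem.add_goal(robot_at(locations[2]))

with OneshotPlanner(problem_kind=problem.kind) as planner:
    print(planner.solve(problem).plan)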

Planning Domain Definition Language (PDDL)
In the chapter on propositional logic we have seen the combinatorial explosion problem that results from the need to include in the reasoning / inference step all possible…
 
Logistics Planning in PDDL
We will use the Logistics domain to illustrate how to represent a planning task in PDDL.

Manufacturing Robot Planning in PDDL
This is a real case that we tackled for a manufacturing company. This company devises supply chains to make pieces of medical equipment. A supply chain consists of…


Bellman Expectation Backup
In this section we describe how to calculate the value functions by establishing a recursive relationship similar to the one we derived for the return. We replace the…

Bellman Optimality Backup
Now that we can calculate the value functions efficiently via the Bellman expectation recursions, we can solve the MDP, which requires maximizing either of the two…

Policy Iteration
In this section we start developing dynamic programming algorithms that solve a perfectly known MDP. In the Bellman expectation backup section we derived the equations…

Policy Iteration Gridworld
This notebook implements policy iteration for the classic 4x3 grid world example in Artificial Intelligence: A Modern Approach, Figure 17.2.
 
Policy Iteration
# Uncomment to run the code locally 
# !git clone https://github.com/dennybritz/reinforcement-learning.git reinforcement_learning
Cloning into…

Value Iteration
We have already seen in the Gridworld example in the policy iteration section that we may not need to reach the optimal state value function v∗(s) to obtain an…

Recycling Robot: Value Iteration to Estimate q∗
The exercise 3.15 solution provides the closed-form expressions for the Bellman optimality equations for Q*.
 
Value Iteration Gridworld
Two

Value vs Policy Iteration for a trivial two state MDP
| State | Action A  | Action B  |
|-------|-----------|-----------|
| s_1   | s_2, +2   | s_1, +0   |
| s_2   | s_1, +0   | s_2, +0   |
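A rough numpy sketch of value iteration on exactly this two-state MDP (the discount factor gamma is an assumed value):

import numpy as np

# P[s, a, s'] transition probabilities and R[s, a] rewards from the table above
P = np.array([[[0.0, 1.0], [1.0, 0.0]],    # s_1: A -> s_2, B -> s_1
              [[1.0, 0.0], [0.0, 1.0]]])   # s_2: A -> s_1, B -> s_2
R = np.array([[2.0, 0.0],
              [0.0, 0.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup: V(s) = max_a [R(s,a) + gamma * sum_s' P(s'|s,a) V(s')]
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print(V.round(2), Q.argmax(axis=1))   # optimal values and greedy policy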

Markov Decision Processes
Many of the algorithms presented here like policy and value iteration have been developed in older repos such as this and this. This site is being migrated to be compatible…

Introduction to MDP
We start by reviewing the agent-environment interface with this evolved notation and provide additional definitions that will help in grasping the concepts behind DRL. We…

 
Jack’s Car Rental
https://github.com/zy31415/jackscarrental

Cleaning Robot - Deterministic MDP

Cleaning Robot - Stochastic MDP
The following code shows the estimation of the q value function for a policy, the optimal q_star, and the optimal policy for the cleaning robot problem in the stochastic case.

Non-deterministic outcomes

Finding optimal policies in Gridworld

POMDP Example
source


Applying the Bellman Optimality Backup
Finite State Machine of a recycling robot and MDP dynamics LUT

Reinforcement Learning
We started looking at different agent behavior architectures, beginning with the planning agents, where the model of the environment is known and there is no interaction with it…

Generalized Policy Iteration
As we saw in the dynamic programming (DP) solution of the MDP problem, policy iteration is an algorithm that consists of two simultaneous, interacting processes: one making the…

ϵ-greedy Monte-Carlo (MC) Control
In this section we outline methods that can result in optimal policies when the MDP is unknown and we need to learn its underlying functions / models - also known as the mode…


SARSA Gridworld Example
SARSA Gridworld

The SARSA Algorithm
SARSA implements a Q(s,a) value-based GPI and follows naturally as an enhancement of the ϵ−greedy policy improvement step of MC control.
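A compact sketch of the on-policy update on a made-up 5-cell corridor (the environment details are invented for illustration):

import numpy as np

n_states, n_actions, alpha, gamma, eps = 5, 2, 0.1, 0.99, 0.1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def policy(s):
    # eps-greedy policy derived from the current Q estimates
    return int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())

for _ in range(500):                     # episodes
    s = 0
    a = policy(s)
    for _ in range(200):                 # step cap keeps episodes finite
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r, done = (1.0, True) if s2 == n_states - 1 else (0.0, False)
        a2 = policy(s2)
        # SARSA: bootstrap from the action actually taken next (on-policy)
        Q[s, a] += alpha * (r + gamma * Q[s2, a2] * (not done) - Q[s, a])
        if done:
            break
        s, a = s2, a2

print(Q.round(2))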
 
Policy Gradient - Pong Game
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
import pickle  # the original Python 2 code used cPickle
import gym

# hyperparameters
H = 200 #…

Policy Gradient Algorithms - REINFORCE
Given that RL can be posed as an MDP, in this section we continue with a policy-based algorithm that learns the policy directly by optimizing the objective function and can…
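A bare-bones REINFORCE sketch on a two-armed bandit (all numbers invented; the Pong notebook above is the full-scale version of the same idea):

import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                      # action preferences (policy parameters)
true_means = np.array([0.2, 0.8])        # hidden expected rewards, made up
alpha = 0.1

for _ in range(2000):
    pi = np.exp(theta) / np.exp(theta).sum()   # softmax policy
    a = rng.choice(2, p=pi)                    # sample an action from the policy
    G = rng.normal(true_means[a], 0.1)         # sampled return for that action
    grad_log_pi = -pi                          # d/dtheta log softmax ...
    grad_log_pi[a] += 1.0                      # ... equals e_a - pi
    theta += alpha * G * grad_log_pi           # REINFORCE: ascend G * grad log pi(a)

print(pi.round(3))   # probability mass shifts to the better arm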

Monte-Carlo Prediction
In this chapter we find optimal policy solutions when the MDP is unknown and we need to learn its underlying value functions - also known as the model free prediction…

Example of Q(s,a) Prediction
Suppose an agent is learning to play the toy environment shown above. This is essentially a corridor and the agent has to learn to navigate to the end of the corridor to…

MC vs. TD(0)
It is instructive to see the difference between MC and TD approaches in the following example.

Temporal Difference (TD) Prediction
If one had to identify one idea as central and novel to reinforcement learning, it would undoubtedly be temporal-difference (TD) learning. TD learning is a combination of…
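A minimal TD(0) prediction sketch on a made-up three-state chain, updating V(s) toward the bootstrapped target r + gamma V(s') after every step:

import numpy as np

V = np.zeros(3)                        # state 2 is terminal
alpha, gamma = 0.1, 1.0
rng = np.random.default_rng(0)

for _ in range(1000):                  # episodes
    s = 0
    while s != 2:
        s2 = s + 1                     # deterministic chain 0 -> 1 -> 2
        r = rng.normal(1.0, 0.5)       # noisy +1 reward per step, illustrative
        target = r + gamma * (V[s2] if s2 != 2 else 0.0)
        V[s] += alpha * (target - V[s])   # TD(0) update
        s = s2

print(V.round(2))   # approximately [2, 1, 0]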