Engineering AI Agents
  • BOOK
    • Foundations
    • Training Deep Networks
    • Perception
    • State Estimation
    • Large Language Models
    • Multimodal Reasoning
    • Planning
    • Markov Decision Processes
    • Reinforcement Learning
  • COURSES
    • Introduction to AI
    • AI for Robotics
    • Deep Learning for Computer Vision
    • DATA MINING - BEING PORTED
  • ABOUT ME

Welcome!

Learn the concepts and engineer AI agents with real-time perception and language understanding abilities.

  • Use Jupyter notebooks to learn the concepts from scratch.
  • Simulate AI agents with egomotion using the Robot Operating System (ROS2).
  • Build real-time perception pipelines.
  • Use AIOps tools to engineer data pipelines that create large training datasets for LLMs & LVMs.

Start Here

Topics

  • Foundations
  • Perception
  • LLMs
  • Logical Reasoning
  • Planning
  • Acting
  • Reinforcement Learning

ai agents

AI Agents
Agents

We will cover the agent-environment interface, rational and learning agent architectures, and a practical example of a robotic agent.

Pantelis Monogioudis
Oct 20, 2022

A systems approach to AI
As is evident from all existing approaches towards AI, it is a multidisciplinary science that aims to create agents that can think and act humanly or rationally. This course starts…

The four approaches towards AI
A 5-min behavioral intelligence test, where an interrogator chats with the player and at the end guesses whether the conversation is with a human or a programmed machine.…

nautical analogy

Rules, rule the world
Agents

We provide a historical perspective on AI development and the role of rules in mission-critical systems.

Pantelis Monogioudis
Oct 20, 2021

Data Science 360
What are the disciplines that we need to cross-fertilize to get a system that possesses intelligence?

ai architecture

The four approaches towards AI
Agents

Ultimately AI will be a cloud of reasoning systems.

Pantelis Monogioudis
Feb 21, 2022

The Learning Problem
Statistical Learning Theory

The supervised learning problem statement.


classification-detection

Introduction to Scene Understanding
In the previous chapters we treated the perception subsystem starting from the first principles that govern supervised learning through to the deep learning…
 
Mask R-CNN Semantic Segmentation
The semantic segmentation approach described in this section is based on the Mask R-CNN paper. Mask R-CNN is an extension of Faster R-CNN that adds a mask head to the detector. The mask…

Detectron2 Beginner’s Tutorial
Welcome to detectron2! This is the official colab tutorial of detectron2. Here, we will go through some basic usage of detectron2, including the following: * Run inference…

Mask R-CNN: A detailed guide with Detectron2
Welcome to the Mask R-CNN with Detectron2 tutorial!

MaskRCNN Inference
import torch
from torchvision import datasets, transforms, models, ops, io
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
from torchvision.models.detec…

Mask R-CNN Demo
A quick intro to using the pre-trained model to detect and segment objects.

Mask R-CNN - Inspect Training Data
Inspect and visualize data loading and pre-processing code.

Mask R-CNN - Inspect Trained Model
Code and visualizations to test, debug, and evaluate the Mask R-CNN model.

Mask R-CNN - Inspect Weights of a Trained Model
This notebook includes code and visualizations to test, debug, and evaluate the Mask R-CNN model.
 
UNet Semantic Segmentation

Finetuning Language Models for Text Classification - Patent Dataset
This notebook was submitted by NYU student Sky Achitoff
Pantelis Monogioudis

national-library-greece

Natural Language Processing
“You shall know a word by the company it keeps” (J. R. Firth 1957: 11) - many modern discoveries are in fact rediscoveries of ideas from other works, sometimes decades old. NLP is…
 
Language Models Workshop
The following notebook is a from-scratch attempt at character-level language modeling. It is instructive for you to go through it first and then go through the corresponding…
 
CNN Language Model
The following was developed by Harini Appansrinivasan, NYU as part of an assignment submission.

language-model-google-search

Language Models
These notes borrow heavily from the CS224N set of notes on Language Models.
 
LLM Inference

LSTM Language Model from scratch
This notebook was borrowed from Christina Kouridis’ github. The notation is different from the notation used in the LSTM section of the notes and will be changed in a next…

rnn-language-model

RNN Language Models
When we focus on making predictions based on a fixed window of context (i.e. the \(n\) previous words), in some cases, the window may not be sufficient to capture the…
 
Example of an RNN Language Model
Our aim is to predict the next character given a set of previous characters from our data string. For our RNN implementation, we would take a sequence of length 25…
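
A minimal sketch (with a made-up data string; names like char_to_ix are purely illustrative) of how such input/target pairs are built for a 25-character window:

data = "hello world, this is a tiny corpus for a character level model. " * 4
seq_length = 25
chars = sorted(set(data))
char_to_ix = {ch: i for i, ch in enumerate(chars)}

# Each training example pairs 25 input characters with the 25 characters
# that follow them, i.e. the targets are the inputs shifted by one.
p = 0
inputs  = [char_to_ix[ch] for ch in data[p:p + seq_length]]
targets = [char_to_ix[ch] for ch in data[p + 1:p + seq_length + 1]]
print(inputs[:5], targets[:5])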

pos-classification

Introduction to NLP Pipelines
In this chapter, we will introduce the topic of processing language in general with neural architectures. This includes natural language, code, etc.

Text Tokenization
In earlier chapters we have limited the discussion to tokenizers that either produce a list of words or a list of characters. It is very important though to understand the…

distributional-similarity

Word2Vec Embeddings
In so-called classical NLP, words were treated as atomic symbols, e.g. hotel, conference, walk, and they were represented with one-hot encoded (sparse) vectors, e.g.
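
A tiny numpy illustration of the contrast (a hypothetical five-word vocabulary and a 3-dimensional embedding, purely for illustration): one-hot vectors treat every pair of distinct words as equally unrelated, while dense embeddings can place related words near each other.

import numpy as np

vocab = ["the", "a", "hotel", "conference", "walk"]   # hypothetical vocabulary
one_hot = np.zeros(len(vocab))
one_hot[vocab.index("hotel")] = 1.0   # sparse, atomic symbol: [0 0 1 0 0]

# Dot products between distinct one-hot vectors are always 0,
# so they carry no notion of similarity.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 3))   # 3-dim, for illustration
dense = embedding_matrix[vocab.index("hotel")]        # learned, dense vector
print(one_hot, dense)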
 
Word2Vec from scratch
This self-contained implementation is instructive and you should go through it to understand the word2vec embedding.
 
Word2Vec Tensorflow Tutorial
word2vec is not a single algorithm; rather, it is a family of model architectures and optimizations that can be used to learn word embeddings from large datasets.…

rosetta-stone

RNN-based Neural Machine Translation
These notes borrow heavily from the CS229N 2019 set of notes on NMT.

The BLEU Score
In 2002, IBM researchers developed the BiLingual Evaluation Understudy (BLEU) metric that remains, with its many variants to this day, one of the most quoted metrics for…
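
A from-scratch sketch of the core computation (uniform n-gram weights, a single reference, and crude smoothing; an illustrative simplification rather than the exact IBM formulation):

import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
        overlap = sum((cand & ref).values())            # clipped n-gram counts
        precisions.append(max(overlap, 1e-9) / max(sum(cand.values()), 1))
    log_avg = sum(math.log(p) for p in precisions) / max_n
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(log_avg)

print(bleu("the cat sat on the mat".split(), "the cat is on the mat".split()))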

Attention in RNN-based NMT
When you hear the sentence “the soccer ball is on the field,” you don’t assign the same importance to all 7 words. You primarily take note of the words “ball,” “on,” and “field…
 
Character-level recurrent sequence-to-sequence model
Author: fchollet
Date created: 2017/09/29
Last modified: 2020/04/26
Description: Character-level recurrent sequence-to-sequence model.

The Annotated Transformer
Attention is All You Need
 
Understanding the Division by √d in the Attention Mechanism
In this notebook, we explore why the dot-product attention mechanism includes a scaling factor of \(1/\sqrt{d}\). We use an example with embedding dimension \(d = 4\), sequence length…
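
A quick numpy check of the motivation (assuming i.i.d. standard-normal query and key entries): the variance of a raw dot product grows linearly with \(d\), while dividing by \(\sqrt{d}\) keeps it near 1 and the softmax out of its saturated regime.

import numpy as np

rng = np.random.default_rng(0)
for d in (4, 64, 1024):
    q = rng.normal(size=(10_000, d))      # 10k random query vectors
    k = rng.normal(size=(10_000, d))      # 10k random key vectors
    raw = (q * k).sum(axis=1)             # raw dot products q . k
    scaled = raw / np.sqrt(d)             # the 1/sqrt(d) factor in attention
    print(f"d={d:4d}  var(raw)={raw.var():7.1f}  var(scaled)={scaled.var():.2f}")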

Multi-head self-attention
Earlier we have seen examples with the token bear being in multiple grammatical patterns that also influence its meaning. To capture such multiplicity we can use multiple…
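
A numpy sketch of the mechanism (dimensions, head count, and random weights are illustrative assumptions): the model dimension is split across heads, each head attends independently, and the outputs are concatenated and projected.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """X: (seq, d_model); all weight matrices: (d_model, d_model)."""
    seq, d_model = X.shape
    d_head = d_model // n_heads
    # Project, then split the model dimension into n_heads heads.
    Q = (X @ Wq).reshape(seq, n_heads, d_head).transpose(1, 0, 2)
    K = (X @ Wk).reshape(seq, n_heads, d_head).transpose(1, 0, 2)
    V = (X @ Wv).reshape(seq, n_heads, d_head).transpose(1, 0, 2)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    heads = softmax(scores) @ V                           # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq, d_model)
    return concat @ Wo                                    # output projection

rng = np.random.default_rng(0)
seq, d_model, n_heads = 5, 8, 2
X = rng.normal(size=(seq, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads).shape)   # (5, 8)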
 
Positional Embeddings
In the RNN architectures, the decoder state at time step \(t\) was a function of the decoder state at time step \(t-1\) and the input token at time step \(t\). In other…
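
A numpy sketch of the sinusoidal scheme from the Transformer paper, which injects the absolute position information that the attention layers otherwise lack:

import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """PE(pos, 2i) = sin(pos / 10000^(2i/d)), PE(pos, 2i+1) = cos(...)."""
    pos = np.arange(seq_len)[:, None]           # (seq, 1)
    i = np.arange(d_model // 2)[None, :]        # (1, d/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                # even dimensions
    pe[:, 1::2] = np.cos(angles)                # odd dimensions
    return pe

print(sinusoidal_positions(seq_len=6, d_model=8).shape)   # (6, 8)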

Scaling
import numpy as np
import matplotlib.pyplot as plt

# Creating an 8-element numpy vector with random gaussian values
# vector = np.random.randn(8)
vector = np.array([0.17148…

Single-head self-attention
In the simple attention mechanism, the attention weights are computed deterministically from the input context. We call the combination of context-free embedding (e.g.…
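
A numpy sketch of the single-head computation (random weights and small dimensions, purely illustrative): the attention weights are a softmax over scaled query-key dot products, and the output mixes the value vectors accordingly.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single head: X is (seq, d); the weight matrices are (d, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))   # attention weights
    return weights @ V                                  # contextual embeddings

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 8))                 # 4 tokens, d = 8 (illustrative)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) * 0.1 for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)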

Transformers and Self-Attention
For the explanation of decoder-based architectures such as those used by GPT, please see the repo https://github.com/pantelis/femtotransformers and the embedded comments…

    abandoned-backpack

    Automated Reasoning
    In an earlier chapter, where we introduced a dynamical system governing the state evolution of the environment, we saw that a state is composed of variables and such fact…
     
    Logical Agents
    In this chapter we see how agents equipped with the ability to represent internally the state of the environment and reason about the effectiveness of possible actions using…

    wumpus-entailment

    Logical Inference
    The wumpus world, despite its triviality, contains some deeper abstractions that are worth summarizing.

    wumpus-world

    World Models
    For each problem we can define a number of world models each representing every possible state (configuration) of the environment that the agent may be in.

    Automated Planning
    Planning combines two major areas of AI: logic and search.

    Specifying the engine name
    The unified_planning.plot package provides useful functions to visually plot many objects

    The Unified Planning Library
    In this demo we will scratch the surface of the UP: we will set it up, manually create the blocksworld domain using the UP API, and create a problem using a bit of Python…

    Planning Domain Definition Language (PDDL)
    In the chapter on propositional logic we saw the combinatorial explosion problem that results from the need to include in the reasoning / inference step all possible…
     
    Logistics Planning in PDDL
    We will use the Logistics domain to illustrate how to represent a planning task in PDDL.

    Manufacturing Robot Planning in PDDL
    This is a real case that we tackled for a manufacturing company. This company devises supply chains to make pieces of medical equipment. A supply chain consists of…

    The A* Algorithm
    Dijkstra’s algorithm is closely related to the Uniform Cost Search algorithm; in fact they are logically equivalent, as the algorithm uniformly explores all nodes that…
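
    A compact sketch of \(A^*\) (a hypothetical 5x5 grid with a Manhattan-distance heuristic; with \(h \equiv 0\) it degenerates to uniform cost search, i.e. Dijkstra):

    import heapq

    def astar(blocked, start, goal, size=5):
        """A* on a 4-connected size x size grid; blocked is a set of cells."""
        h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # admissible heuristic
        frontier = [(h(start), 0, start, [start])]                # (f, g, node, path)
        seen = set()
        while frontier:
            f, g, node, path = heapq.heappop(frontier)
            if node == goal:
                return path
            if node in seen:
                continue
            seen.add(node)
            r, c = node
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in blocked:
                    heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
        return None

    print(astar(blocked={(1, 1), (1, 2), (1, 3)}, start=(0, 0), goal=(4, 4)))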

    depth-first

    Forward Search Algorithms
    If you are missing some algorithmic background, fear not. There is an excellent free book to help you with the background behind this chapter. Read Chapters 3 and 4 for…

    Planning with Search
    In the PDDL section we saw that a sequence of actions that the agent needs to execute to reach the goal can be obtained using domain-independent planners. This section…

    \(A^*\) Interactive Demo
    This demo illustrates the various search algorithms we will cover here. You can use your mouse to introduce obstacles in the canvas and see how the various search…

    state-value-tree

    Bellman Expectation Backup
    In this section we describe how to calculate the value functions by establishing a recursive relationship similar to the one we derived for the return. We replace the…
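
    For reference, the Bellman expectation backup for the state-value function, in the notation used throughout these notes:

    \[ v_\pi(s) = \sum_{a} \pi(a|s) \sum_{s', r} p(s', r | s, a) \left[ r + \gamma v_\pi(s') \right] \]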

    Bellman Optimality Backup
    Now that we can calculate the value functions efficiently via the Bellman expectation recursions, we can solve the MDP, which requires maximizing either of the two…
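
    The corresponding Bellman optimality backups replace the expectation over the policy with a maximization over actions:

    \[ v_*(s) = \max_a \sum_{s', r} p(s', r | s, a) \left[ r + \gamma v_*(s') \right] \qquad q_*(s, a) = \sum_{s', r} p(s', r | s, a) \left[ r + \gamma \max_{a'} q_*(s', a') \right] \]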

    Policy Iteration
    In this section we start developing dynamic programming algorithms that solve a perfectly known MDP. In the Bellman expectation backup section we have derived the equations…
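
    A minimal tabular sketch of the two interacting steps (not from the notes; it assumes a transition model stored as P[s][a] = list of (prob, next_state, reward) tuples, a hypothetical format):

    import numpy as np

    def policy_iteration(P, n_states, n_actions, gamma=0.9, tol=1e-8):
        policy, V = np.zeros(n_states, dtype=int), np.zeros(n_states)
        while True:
            # 1. Policy evaluation: sweep the Bellman expectation backup.
            while True:
                delta = 0.0
                for s in range(n_states):
                    v = sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][policy[s]])
                    delta, V[s] = max(delta, abs(v - V[s])), v
                if delta < tol:
                    break
            # 2. Policy improvement: act greedily with respect to V.
            stable = True
            for s in range(n_states):
                q = [sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                     for a in range(n_actions)]
                best = int(np.argmax(q))
                if best != policy[s]:
                    policy[s], stable = best, False
            if stable:
                return policy, V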

    Policy Iteration Gridworld
    This notebook implements policy iteration for the classic 4x3 grid world example in Artificial Intelligence: A Modern Approach, Figure 17.2.
     
    Policy Iteration
    # Uncomment to run the code locally 
    # !git clone https://github.com/dennybritz/reinforcement-learning.git reinforcement_learning
    Cloning into…

    Value Iteration
    We have already seen, in the Gridworld example of the policy iteration section, that we may not need to reach the optimal state value function \(v_*(s)\) to obtain an…
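
    A minimal sketch (same hypothetical P[s][a] transition format as the policy iteration sketch above) that sweeps the Bellman optimality backup directly, skipping the full evaluation step:

    import numpy as np

    def value_iteration(P, n_states, n_actions, gamma=0.9, tol=1e-8):
        V = np.zeros(n_states)
        while True:
            delta = 0.0
            for s in range(n_states):
                best = max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                           for a in range(n_actions))
                delta, V[s] = max(delta, abs(best - V[s])), best
            if delta < tol:
                break
        # Extract the greedy policy from the converged value function.
        policy = np.array([int(np.argmax([sum(p * (r + gamma * V[s2])
                           for p, s2, r in P[s][a]) for a in range(n_actions)]))
                           for s in range(n_states)])
        return policy, V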
     
    Value Iteration Gridworld
    Two

    Value vs Policy Iteration for a trivial two state MDP
    State | Action A | Action B
    \(s_1\) | \(s_2\), +2 | \(s_1\), +0
    \(s_2\) | \(s_1\), +0 | \(s_2\), +0
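
    Running value iteration on this table (a quick sketch assuming a discount factor \(\gamma = 0.9\), which is not specified above) shows that choosing Action A in both states is optimal:

    # Deterministic transitions from the table: P[s][action] = (next_state, reward)
    P = {0: {"A": (1, 2.0), "B": (0, 0.0)},    # s1: A -> s2 (+2), B -> s1 (+0)
         1: {"A": (0, 0.0), "B": (1, 0.0)}}    # s2: A -> s1 (+0), B -> s2 (+0)
    gamma, V = 0.9, [0.0, 0.0]
    for _ in range(200):                       # synchronous value iteration sweeps
        V = [max(r + gamma * V[s2] for s2, r in P[s].values()) for s in (0, 1)]
    print([round(v, 3) for v in V])            # -> [10.526, 9.474]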

    Markov Decision Processes
    Many of the algorithms presented here like policy and value iteration have been developed in older repos such as this and this. This site is being migrated to be compatible…

    agent-env-interface

    Introduction to MDP
    We start by reviewing the agent-environment interface with this evolved notation and provide additional definitions that will help in grasping the concepts behind DRL. We…
     
    Jack’s Car Rental
    https://github.com/zy31415/jackscarrental

    Cleaning Robot - Deterministic MDP

    Cleaning Robot - Stochastic MDP
    The following code shows the estimation of the q value function for a policy, the optimal q_star, and the optimal policy for the cleaning robot problem in the stochastic case.

    Non-deterministic outcomes

    Finding optimal policies in Gridworld

    POMDP Example
    source

    recycling-robot-fsm

    Applying the Bellman Optimality Backup
    Finite state machine of a recycling robot and the MDP dynamics LUT

    marginal-value-multi-class

    Optimal Capacity Control
    In this section we outline a capacity control policy that is routinely used in various industries (airlines, car rentals, hospitality) to make reservations towards a…
     
    # aima_gridworld_env.py
    
    import gymnasium as gym
    from gymnasium import spaces
    from minigrid.core.grid import Grid
    from minigrid.minigrid_env import MiniGridEnv
    
    class AIMAGr…

    policy-evaluation-tree

    Policy Evaluation (Prediction)
    The policy \(\pi\) is evaluated when we have produced the state-value function \(v_\pi(s)\) for all states. In other words, when we know the expected discounted returns that…
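
    Iterative policy evaluation applies the Bellman expectation backup repeatedly, with iteration index \(k\):

    \[ v_{k+1}(s) = \sum_a \pi(a|s) \sum_{s', r} p(s', r | s, a) \left[ r + \gamma v_k(s') \right] \]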
     
    Policy Improvement (Control)
    In the policy improvement step we are given the value function and simply apply the greedy heuristic to it.
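
    Concretely, the greedy improvement step picks, in each state, the action that looks best one step ahead under the current value function:

    \[ \pi'(s) = \arg\max_a \sum_{s', r} p(s', r | s, a) \left[ r + \gamma v_\pi(s') \right] \]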

      Reinforcement Learning
      We started by looking at different agent behavior architectures, beginning with planning agents, where the model of the environment is known and there is no interaction with it…

      Generalized Policy Iteration
      As we saw in the dynamic programming (DP) solution to the MDP problem, policy iteration is an algorithm that consists of two simultaneous, interacting processes: one making the…

      \(\epsilon\)-greedy Monte-Carlo (MC) Control
      In this section we outline methods that can result in optimal policies when the MDP is unknown and we need to learn its underlying functions / models - also known as the mode…

      sarsa-gridworld

      SARSA Gridworld Example
      SARSA Gridworld

      The SARSA Algorithm
      SARSA implements a \(Q(s,a)\) value-based GPI and follows naturally as an enhancement of the \(\epsilon\)-greedy policy improvement step of MC control.
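
      The update rule behind the algorithm uses the quintuple \((S_t, A_t, R_{t+1}, S_{t+1}, A_{t+1})\) that gives SARSA its name:

      \[ Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right] \]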
       
      Policy Gradient - Pong Game
      """ Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
      import numpy as np
      import pickle  # cPickle in the original Python 2 script
      import gym
      
      # hyperparameters
      H = 200 #…

      Policy Gradient Algorithms - REINFORCE
      Given that RL can be posed as an MDP, in this section we continue with a policy-based algorithm that learns the policy directly by optimizing the objective function and can…

      Monte-Carlo Prediction
      In this chapter we find optimal policy solutions when the MDP is unknown and we need to learn its underlying value functions - also known as the model free prediction…

      Example of \(Q(s,a)\) Prediction
      Suppose an agent is learning to play the toy environment shown above. This is essentially a corridor and the agent has to learn to navigate to the end of the corridor to…

      MC vs. TD(0)
      It is instructive to see the difference between MC and TD approaches in the following example.

      Temporal Difference (TD) Prediction
      If one had to identify one idea as central and novel to reinforcement learning, it would undoubtedly be temporal-difference (TD) learning. TD learning is a combination of…
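
      The TD(0) update bootstraps: instead of waiting for the full Monte-Carlo return, it moves \(V(S_t)\) toward the one-step target \(R_{t+1} + \gamma V(S_{t+1})\):

      \[ V(S_t) \leftarrow V(S_t) + \alpha \left[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \right] \]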