Engineering AI Agents
LLaVA Paper
Visual Instruction Tuning - LLaVA