https://algorithmsbook.com/

MIT Press provides another excellent book under a Creative Commons license.

I plan to buy it, and I recommend you do too. The book provides a broad introduction to algorithms for decision making under uncertainty.

The book takes an agent-based approach.

An agent is an entity that acts based on observations of its environment. Agents may be physical entities, like humans or robots, or they may be nonphysical entities, such as decision support systems that are implemented entirely in software.

The interaction between the agent and the environment follows an observe-act loop:

• At each time step t, the agent receives an observation of the environment.
• Observations are often incomplete or noisy.
• Based on these observations, the agent then chooses an action a_t through some decision process.
• This action, such as sounding an alert, may have a nondeterministic effect on the environment.
• The book focuses on agents that interact intelligently to achieve their objectives over time.
• Given the past sequence of observations and knowledge about the environment, the agent must choose an action a_t that best achieves its objectives in the presence of various sources of uncertainty, including:
1. outcome uncertainty, where the effects of our actions are uncertain,
2. model uncertainty, where our model of the problem is uncertain,
3. state uncertainty, where the true state of the environment is uncertain, and
4. interaction uncertainty, where the behavior of the other agents interacting in the environment is uncertain.

The book is organized around these four sources of uncertainty.
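The observe-act loop can be sketched in a few lines of code. The book's own examples are in Julia (see the Julia appendix), but here is a minimal Python sketch; the functions `observe`, `policy`, and `transition` are hypothetical stand-ins for the stages of the loop, with Gaussian noise standing in for state uncertainty and outcome uncertainty.

```python
import random

def observe(state):
    # State uncertainty: the agent sees a noisy, incomplete view of the true state.
    return state + random.gauss(0.0, 0.1)

def policy(observation):
    # A trivial decision process: push the observed value toward zero.
    return -0.5 * observation

def transition(state, action):
    # Outcome uncertainty: the action's effect on the environment is nondeterministic.
    return state + action + random.gauss(0.0, 0.05)

# The observe-act loop: observe, decide, act, repeat.
state = 1.0
for t in range(10):
    o = observe(state)            # receive observation o_t
    a = policy(o)                 # choose action a_t
    state = transition(state, a)  # the environment responds
```

In a real problem the policy would be computed by one of the methods in the book (value iteration, policy search, and so on) rather than hand-coded.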

Making decisions in the presence of uncertainty is central to the field of artificial intelligence

Introduction

Decision Making

Applications

Methods

History

Societal Impact

Overview

PROBABILISTIC REASONING

Representation

Degrees of Belief and Probability

Probability Distributions

Joint Distributions

Conditional Distributions

Bayesian Networks

Conditional Independence

Summary

Exercises


Inference

Inference in Bayesian Networks

Inference in Naive Bayes Models

Sum-Product Variable Elimination

Belief Propagation

Computational Complexity

Direct Sampling

Likelihood Weighted Sampling

Gibbs Sampling

Inference in Gaussian Models

Summary

Exercises

Parameter Learning

Maximum Likelihood Parameter Learning

Bayesian Parameter Learning

Nonparametric Learning

Learning with Missing Data

Summary

Exercises

Structure Learning

Bayesian Network Scoring

Directed Graph Search

Markov Equivalence Classes

Partially Directed Graph Search

Summary

Exercises

Simple Decisions

Constraints on Rational Preferences

Utility Functions

Utility Elicitation

Maximum Expected Utility Principle

Decision Networks

Value of Information

Irrationality

Summary

Exercises

SEQUENTIAL PROBLEMS

Exact Solution Methods

Markov Decision Processes

Policy Evaluation

Value Function Policies

Policy Iteration

Value Iteration

Asynchronous Value Iteration

Linear Program Formulation

Summary

Exercises

Approximate Value Functions

Parametric Representations

Nearest Neighbor

Kernel Smoothing

Linear Interpolation

Simplex Interpolation

Linear Regression

Neural Network Regression

Summary

Exercises

Online Planning

Receding Horizon Planning

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Heuristic Search

Labeled Heuristic Search

Open-Loop Planning

Summary

Exercises

Policy Search

Approximate Policy Evaluation

Local Search

Genetic Algorithms

Cross Entropy Method

Evolution Strategies

Isotropic Evolutionary Strategies

Summary

Exercises

Policy Gradient Estimation

Finite Difference

Likelihood Ratio

Reward-to-Go

Baseline Subtraction

Summary

Exercises

Policy Gradient Optimization

Trust Region Update

Clamped Surrogate Objective

Summary

Exercises

Actor-Critic Methods

Actor-Critic

Actor-Critic with Monte Carlo Tree Search

Summary

Policy Validation

Performance Metric Evaluation

Rare Event Simulation

Robustness Analysis

Summary

Exercises

MODEL UNCERTAINTY

Exploration and Exploitation

Bandit Problems

Bayesian Model Estimation

Undirected Exploration Strategies

Directed Exploration Strategies

Optimal Exploration Strategies

Exploration with Multiple States

Summary

Exercises

Model-Based Methods

Maximum Likelihood Models

Update Schemes

Exploration

Bayesian Methods

Posterior Sampling

Summary

Exercises

Model-Free Methods

Incremental Estimation of the Mean

Q-Learning

Sarsa

Eligibility Traces

Reward Shaping

Action Value Function Approximation

Experience Replay

Summary

Exercises

Imitation Learning

Behavioral Cloning

Dataset Aggregation

Stochastic Mixing Iterative Learning

Maximum Margin Inverse Reinforcement Learning

Maximum Entropy Inverse Reinforcement Learning

Summary

Exercises

STATE UNCERTAINTY

Beliefs

Belief Initialization

Discrete State Filter

Linear Gaussian Filter

Extended Kalman Filter

Unscented Kalman Filter

Particle Filter

Particle Injection

Summary

Exercises

Exact Belief State Planning

Belief-State Markov Decision Processes

Conditional Plans

Alpha Vectors

Pruning

Value Iteration

Linear Policies

Summary

Exercises

Offline Belief State Planning

Fully Observable Value Approximation

Fast Informed Bound

Fast Lower Bounds

Point-Based Value Iteration

Randomized Point-Based Value Iteration

Sawtooth Upper Bound

Point Selection

Sawtooth Heuristic Search

Triangulated Value Functions

Summary

Exercises

Online Belief State Planning

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Determinized Sparse Tree Search

Gap Heuristic Search

Summary

Exercises

Controller Abstractions

Controllers

Policy Iteration

Nonlinear Programming

Summary

Exercises

MULTIAGENT SYSTEMS

Multiagent Reasoning

Simple Games

Response Models

Dominant Strategy Equilibrium

Nash Equilibrium

Correlated Equilibrium

Iterated Best Response

Hierarchical Softmax

Fictitious Play

Summary

Exercises

Sequential Problems

Markov Games

Response Models

Nash Equilibrium

Fictitious Play

Nash Q-Learning

Summary

Exercises

State Uncertainty

Partially Observable Markov Games

Policy Evaluation

Nash Equilibrium

Dynamic Programming

Summary

Exercises

Collaborative Agents

Decentralized Partially Observable Markov Decision Processes

Subclasses

Dynamic Programming

Iterated Best Response

Heuristic Search

Nonlinear Programming

Summary

Exercises

APPENDICES

Mathematical Concepts

Measure Spaces

Probability Spaces

Metric Spaces

Normed Vector Spaces

Positive Definiteness

Convexity

Information Content

Entropy

Cross Entropy

Relative Entropy

Taylor Expansion

Monte Carlo Estimation

Importance Sampling

Contraction Mappings

Graphs

Probability Distributions

Computational Complexity

Asymptotic Notation

Time Complexity Classes

Space Complexity Classes

Decidability

Neural Representations

Neural Networks

Feedforward Networks

Parameter Regularization

Convolutional Neural Networks

Recurrent Networks

Autoencoder Networks

Search Algorithms

Search Problems

Search Graphs

Forward Search

Branch and Bound

Dynamic Programming

Heuristic Search

Problems

Hex World

2048

Cart-Pole

Mountain Car

Simple Regulator

Aircraft Collision Avoidance

Crying Baby

Machine Replacement

Catch

Prisoner's Dilemma

Rock-Paper-Scissors

Traveler's Dilemma

Predator-Prey Hex World

Multi-Caregiver Crying Baby

Collaborative Predator-Prey Hex World

Julia

Types

Functions

Control Flow

Packages

Convenience Functions