🧠 Algorithm Families
Overview
Algorithm Taxonomy
Algorithm (base class)
├── SARSA (tabular-like Q-learning)
│   └── VanillaDQN → DQNBase → DQN → DoubleDQN
└── Reinforce (policy gradient)
    └── ActorCritic (adds value function, GAE/n-step)
        ├── PPO (adds clipped objective)
        └── SoftActorCritic (adds entropy regularization)

Implemented Algorithms
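The implemented algorithms follow the taxonomy above, and one way to read that tree is as plain class inheritance. The sketch below is illustrative only: the classes are empty stubs named after the tree's entries, not the library's actual code. It assumes that each arrow and each nested branch in the tree means "is subclassed by". The `lineage` helper is a hypothetical name introduced here.

```python
# Hypothetical stubs mirroring the taxonomy tree; bodies are intentionally empty.
class Algorithm: ...                    # base class

# Value-based branch (assumed: each arrow in the tree is a subclass step)
class SARSA(Algorithm): ...
class VanillaDQN(SARSA): ...
class DQNBase(VanillaDQN): ...
class DQN(DQNBase): ...
class DoubleDQN(DQN): ...

# Policy-gradient branch (assumed: nesting in the tree is subclassing)
class Reinforce(Algorithm): ...
class ActorCritic(Reinforce): ...       # adds value function, GAE/n-step
class PPO(ActorCritic): ...             # adds clipped objective
class SoftActorCritic(ActorCritic): ... # adds entropy regularization


def lineage(cls):
    """Return class names from `cls` up to the base class (hypothetical helper)."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(lineage(PPO))
print(lineage(DoubleDQN))
```

Under these assumptions, `lineage(PPO)` walks back through `ActorCritic` and `Reinforce` to `Algorithm`, which matches the "adds ..." annotations in the tree: each level extends its parent.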
Algorithm | Type | Action Space | Key Features
Algorithm Interface
Algorithm Spec
Key Parameters
Common Parameters
Parameter | Description | Typical Values
Policy Gradient Parameters (A2C, PPO)
Parameter | Description | Typical Values
PPO-Specific Parameters
Parameter | Description | Typical Values
DQN-Specific Parameters
Parameter | Description | Typical Values
Exploration Schedules
Example Specs
PPO for CartPole (Discrete)
DQN for LunarLander (Discrete)
SAC for MuJoCo (Continuous)
Adding a New Algorithm
Algorithm Performance Notes
Recommended by Environment
Environment Type | Best Algorithm | Notes
Known Limitations
Algorithm | Environment | Issue | Alternative
Lambda Tuning for Atari
Lambda | Best For | Examples
Learning Resources