
Lab Organization


Session, Trial, and Experiment

SLM Lab is organized in the following hierarchy (a spec sketch follows the list):

  • Session: The lowest level of SLM Lab: a single training run of one agent on one environment with one set of hyperparameters, all with a fixed random seed.

  • Trial: A Trial consists of multiple Sessions that vary only in their random seeds.

  • Experiment: Generates different sets of hyperparameters (according to a spec file) and runs a Trial for each one. It can be thought of as a study, e.g. “Which values of n in A2C's n-step returns yield the fastest, most stable solution, with the other variables held constant?”
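The number of Sessions per Trial and of Trials per Experiment are set in the spec file's meta section. Below is a minimal sketch assuming the standard spec layout; the spec name dqn_cartpole and the counts are illustrative:

```json
{
  "dqn_cartpole": {
    "agent": ["..."],
    "env": ["..."],
    "meta": {
      "max_session": 4,
      "max_trial": 16
    }
  }
}
```

With these values, each Trial would run 4 Sessions that differ only in random seed, and a search Experiment would typically be capped at 16 Trials.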

When using the lab command, the lab modes map onto these levels as follows (a command sketch follows the list):

  • enjoy: Session

  • dev, train: Trial

  • search: Experiment
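As a sketch, assuming the `python run_lab.py {spec file} {spec name} {lab mode}` format from the Lab Command page (spec names, timestamps, and paths below are illustrative):

```bash
# dev and train both run a Trial; dev is a short debugging variant
python run_lab.py slm_lab/spec/demo.json dqn_cartpole train

# search runs an Experiment over the spec's search space
python run_lab.py slm_lab/spec/demo.json dqn_cartpole search

# enjoy replays one trained Session from its saved output
python run_lab.py data/dqn_cartpole_2019_01_01_000000/dqn_cartpole_spec.json dqn_cartpole enjoy@dqn_cartpole_t0_s0
```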

SLM Lab automatically produces plots at the Session, Trial, and Experiment levels for any combination of environments and algorithms. During training it also logs metrics such as rewards, loss, exploration and entropy variables, model weights and biases, action distributions, frames per second, and wall-clock time; these metrics are also visualized in TensorBoard.

[Figure: the graphs for a Session, a Trial, and an Experiment.]

Hyperparameter search is implemented using Ray Tune, and the results are automatically analyzed and presented hierarchically in increasingly granular detail.
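For instance, a spec's search section declares which hyperparameters to vary and how to sample them. The sketch below is illustrative; the keys and ranges are not from any particular benchmark spec:

```json
"search": {
  "agent": [{
    "algorithm": {
      "gamma__uniform": [0.9, 0.9999],
      "training_frequency__grid_search": [1, 4, 8]
    }
  }]
}
```

The suffix after the double underscore (e.g. __grid_search, __uniform) names the sampling method Ray Tune uses to generate the hyperparameter sets.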

Reproducibility Design

SLM Lab is designed for reproducible deep RL. It achieves this by using two simple components:

  • a spec file, which defines the spec: all hyperparameters fully exposed and organized logically by component

  • a git SHA, saved together with the spec file, which pins the exact version of the code used to run it

In SLM Lab, every configurable hyperparameter for an algorithm is specified in a spec file. When the lab runs, it saves the spec file along with the git SHA as part of its output. Together, these contain all the information required to reproduce a Session, Trial, or Experiment.

Reproducing an entire RL experiment therefore requires only checking out the code at the saved git SHA and rerunning the saved spec file.
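Concretely, the workflow might look like the following sketch, where the SHA placeholder and output paths are illustrative:

```bash
# pin the code to the exact version recorded with the run
git checkout <git-sha-saved-with-the-spec>

# rerun the saved spec to reproduce the Trial
python run_lab.py data/dqn_cartpole_2019_01_01_000000/dqn_cartpole_spec.json dqn_cartpole train
```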

Next, we will go through a series of tutorials to learn the various use cases of SLM Lab.
