SLM Lab
v4.1.1

Discrete Environment Benchmark


Last updated 5 years ago


Discrete Environment Benchmark Result

| Env. \ Alg.    | DQN   | DDQN+PER | A2C (GAE) | A2C (n-step) | PPO   | SAC    |
|----------------|-------|----------|-----------|--------------|-------|--------|
| Breakout       | 80.88 | 182      | 377       | 398          | 443   | 3.51*  |
| Pong           | 18.48 | 20.5     | 19.31     | 19.56        | 20.58 | 19.87* |
| Qbert          | 5494  | 11426    | 12405     | 13590        | 13460 | 923*   |
| Seaquest       | 1185  | 4405     | 1070      | 1684         | 1715  | 171*   |
| LunarLander    | 192   | 233      | 25.21     | 68.23        | 214   | 276    |
| UnityHallway   | -0.32 | 0.27     | 0.08      | -0.96        | 0.73  | 0.01   |
| UnityPushBlock | 4.88  | 4.93     | 4.68      | 4.93         | 4.97  | -0.70  |

Episode scores at the end of training attained by SLM Lab implementations on discrete-action control problems. Each reported score is averaged over the last 100 checkpoints, then averaged over 4 Sessions. A Random baseline, with its score averaged over 100 episodes, is also included. Results marked with * were trained using the hybrid synchronous/asynchronous version of SAC to parallelize and speed up training. For SAC, Breakout, Pong, and Seaquest were trained for 2M frames instead of 10M frames.
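The two-stage averaging above can be sketched as follows. This is an illustrative computation, not SLM Lab's actual API: the score matrix, its shape, and the variable names are assumptions, with a small window standing in for the last 100 checkpoints.

```python
import numpy as np

# Hypothetical checkpoint score matrix: rows = 4 Sessions, cols = checkpoints.
# (Illustrative numbers only; SLM Lab stores these series in each session's data files.)
session_scores = np.array([
    [430.0, 445.0, 450.0],
    [440.0, 455.0, 460.0],
    [420.0, 430.0, 435.0],
    [445.0, 450.0, 455.0],
])

K = 2  # the benchmark averages the last 100 checkpoints; 2 here for brevity
per_session = session_scores[:, -K:].mean(axis=1)  # mean over last K checkpoints
reported = per_session.mean()                      # then mean over the 4 Sessions
print(reported)  # -> 447.5
```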

For the full Atari benchmark, see the Atari Environment Benchmark page.

Discrete Environment Benchmark Result Plots

Plot Legend

[legend image]

[Per-environment benchmark plots; the full plots and data are available via Upload PR #427 and the Dropbox data.]