Public Benchmark Data
SLM Lab provides a set of benchmark results that are periodically updated with new feature releases. All of the result data is uploaded via Pull Requests and made public on Google Drive.
Public benchmark data has been moved from Dropbox to Google Drive as of Aug 2022.
For reference, the image-based environment benchmarks are run on an AWS GPU box (`p2.16xlarge`), and the non-image-based environment benchmarks are run on an AWS CPU box (`m5a.24xlarge`).

The benchmark tables on this page show the `Trial`-level `final_return_ma` from SLM Lab. This is the final value of the 100-checkpoint moving average of the return (total rewards) from evaluation. Each `Trial` is run with 4 `Session`s using different random seeds, and their `final_return_ma` values are averaged at the `Trial` level.
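To make the aggregation concrete, here is a minimal sketch of how a `Trial`-level `final_return_ma` could be computed from per-`Session` evaluation returns. The helper name and the return values are made up for illustration; this is not SLM Lab's actual implementation.

```python
import numpy as np

def final_return_ma(eval_returns, window=100):
    """Final value of the `window`-checkpoint moving average of evaluation returns."""
    returns = np.asarray(eval_returns, dtype=float)
    # The final moving-average value equals the mean over the last `window` checkpoints
    return returns[-window:].mean()

# Made-up evaluation returns for 4 Sessions run with different random seeds
sessions = [np.random.default_rng(seed).normal(200.0, 10.0, size=300) for seed in range(4)]

# Trial-level result: average the per-Session final_return_ma values
trial_final_return_ma = np.mean([final_return_ma(s) for s in sessions])
print(f'Trial final_return_ma: {trial_final_return_ma:.2f}')
```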
The specs for these are contained in the `slm_lab/spec/benchmark` folder, descriptively named `{algorithm}_{environment}.json`. They can be exactly reproduced as described in Lab Organization.
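As a quick illustration of this layout, a small sketch could enumerate the benchmark specs from a local checkout. It assumes the working directory is an SLM Lab checkout and that each spec file maps spec names to their configurations; both are assumptions for illustration, not guarantees made on this page.

```python
import json
from pathlib import Path

# Assumes the working directory is a local SLM Lab checkout
benchmark_dir = Path('slm_lab/spec/benchmark')

# Files follow the {algorithm}_{environment}.json naming convention described above
for spec_path in sorted(benchmark_dir.rglob('*.json')):
    with spec_path.open() as f:
        spec = json.load(f)
    # Assumption: the top-level keys of each spec file are the runnable spec names
    print(f'{spec_path.relative_to(benchmark_dir)}: {list(spec)}')
```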
SLM Lab's benchmark includes environments from the following offerings:
Deep RL algorithms use a lot of abbreviations. Here's a list to help us navigate:
- A2C (GAE): Advantage Actor-Critic with GAE as advantage estimation
- A2C (n-step): Advantage Actor-Critic with n-step return as advantage estimation
- A3C: Asynchronous Advantage Actor-Critic
- Async: Asynchronous
- CER: Combined Experience Replay
- DDQN: Double Deep Q-Network
- DQN: Deep Q-Network
- GAE: Generalized Advantage Estimation
- PER: Prioritized Experience Replay
- PPO: Proximal Policy Optimization
- SAC: Soft Actor-Critic
- SIL: Self-Imitation Learning
Read on to see the benchmark result tables and plots.