👾 Atari Benchmark

A2C & PPO Atari Results (v5)

SLM Lab v5 validates A2C and PPO on ALE (Arcade Learning Environment) environments. The ALE provides 50+ classic Atari 2600 games as standardized RL benchmarks.

[Image gallery: MsPacman, Breakout, Qbert, BeamRider]

54 games were tested; all results are available on HuggingFace.


Methodology

Results show Trial-level performance:

  1. Trial = 4 Sessions with different random seeds

  2. Session = One complete training run

  3. Score = Final 100-checkpoint moving average (total_reward_ma)

The trial score is the mean across 4 sessions, providing statistically meaningful results.
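The aggregation described above can be sketched in a few lines. This is a minimal illustration, not SLM Lab's own code: the function names and the sample rewards are hypothetical, and it assumes each session yields a list of per-checkpoint total rewards.

```python
from statistics import mean


def moving_average(values, window=100):
    """Mean of the last `window` values (or of all values if fewer exist)."""
    return mean(values[-window:])


def trial_score(session_rewards, window=100):
    """Average the final moving-average score of every session in a trial."""
    session_scores = [moving_average(r, window) for r in session_rewards]
    return mean(session_scores)


# Hypothetical per-checkpoint rewards for 4 sessions (made-up numbers).
sessions = [
    [10.0, 20.0, 30.0],
    [12.0, 18.0, 33.0],
    [8.0, 22.0, 27.0],
    [14.0, 16.0, 30.0],
]
print(trial_score(sessions, window=2))
```

A window of 2 is used here only to keep the sample data short; the benchmark uses a window of 100 checkpoints.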

Configuration

Settings: max_frame 10e6 | num_envs 16 | max_session 4 | log_frequency 10000
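To make the settings concrete, the arithmetic below derives a few quantities from them. It is a sketch under two assumptions that the page does not state: that max_frame counts frames summed over all parallel environments in one session, and that log_frequency is measured in frames.

```python
# Values copied from the settings line above.
max_frame = int(10e6)    # frames per session (assumed to be summed over envs)
num_envs = 16            # parallel environments
max_session = 4          # sessions (random seeds) per trial
log_frequency = 10_000   # assumed to be in frames

frames_per_env = max_frame // num_envs        # frames each env contributes
num_log_points = max_frame // log_frequency   # checkpoints logged per session
frames_per_trial = max_frame * max_session    # frames across all 4 sessions

print(frames_per_env, num_log_points, frames_per_trial)
```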

Algorithm specs: all use the Nature CNN (conv channels [32, 64, 64]) followed by a 512-unit fully connected layer.
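As a rough check of the Nature CNN layout named above, the sketch below traces the feature-map size through the three conv layers. The page lists only the channel counts; the kernel sizes and strides (8/4, 4/2, 3/1) and the 84x84 input are the standard Nature DQN values and are assumed here, not taken from the spec file.

```python
def conv_out(size, kernel, stride):
    """Output width of an unpadded convolution."""
    return (size - kernel) // stride + 1


def nature_cnn_shapes(input_size=84):
    """Return the spatial size after each conv layer and the flattened size."""
    # (out_channels, kernel, stride); kernels and strides are assumed values.
    layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
    size = input_size
    sizes = []
    channels = 0
    for channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        sizes.append(size)
    flat_features = channels * size * size  # input to the 512-unit fc layer
    return sizes, flat_features


print(nature_cnn_shapes())
```

Under these assumptions the fully connected layer maps the flattened conv features to 512 units.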

Environment: Gymnasium ALE v5 with life_loss_info=true, sticky actions (repeat_action_probability=0.25)
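Sticky actions mean the emulator sometimes repeats the previously executed action instead of the agent's new one. The simulation below is a self-contained illustration of that rule, not the ALE implementation; the function name and the seeded generator are hypothetical.

```python
import random


def apply_sticky_actions(actions, repeat_action_probability=0.25, seed=0):
    """Return the actions actually executed under the sticky-action rule."""
    rng = random.Random(seed)
    executed = []
    previous = None
    for action in actions:
        # With the given probability, the previous action is repeated.
        if previous is not None and rng.random() < repeat_action_probability:
            action = previous
        executed.append(action)
        previous = action
    return executed


requested = [0, 1, 2, 3, 4, 5, 6, 7]
print(apply_sticky_actions(requested))
```

With a probability of 0.25 roughly one requested action in four is overridden, which prevents agents from relying on a fully deterministic emulator.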

PPO Lambda Variants

Different games benefit from different lambda values for GAE. All variants use the same spec file:

| SPEC_NAME | Lambda | Best for |
| --- | --- | --- |
| ppo_atari | 0.95 | Strategic games (default) |
| ppo_atari_lam85 | 0.85 | Mixed games |
| ppo_atari_lam70 | 0.70 | Action games |
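To show where lambda enters, here is a minimal sketch of Generalized Advantage Estimation over a single rollout. It is an illustration, not SLM Lab's implementation: the function name is hypothetical, gamma=0.99 is an assumed default, and episode terminations inside the rollout are not handled.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one rollout without episode boundaries."""
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


rewards = [1.0, 1.0, 1.0]
values = [0.0, 0.0, 0.0]
for lam in (0.95, 0.85, 0.70):
    print(lam, gae_advantages(rewards, values, last_value=0.0, lam=lam))
```

A lower lambda shortens the effective horizon of the advantage estimate (less variance, more bias), which is one plausible reason why fast-paced games may favor 0.70 while slower games favor 0.95.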

Lambda Comparison Table

Scores for all three lambda variants, where tested. Higher is better; - = not tested.

| Game | ppo_atari (0.95) | ppo_atari_lam85 (0.85) | ppo_atari_lam70 (0.70) |
| --- | --- | --- | --- |
| ALE/AirRaid-v5 | 8245 | - | - |
| ALE/Alien-v5 | 1453 | 1353 | 1274 |
| ALE/Amidar-v5 | 574 | 580 | - |
| ALE/Assault-v5 | 4059 | 4293 | 3314 |
| ALE/Asterix-v5 | 2967 | 3482 | - |
| ALE/Asteroids-v5 | 1497 | 1554 | - |
| ALE/Atlantis-v5 | 792886 | 754k | 710k |
| ALE/BankHeist-v5 | 1045 | 1045 | - |
| ALE/BattleZone-v5 | 21270 | 26383 | 13857 |
| ALE/BeamRider-v5 | 2765 | - | - |
| ALE/Berzerk-v5 | 1072 | - | - |
| ALE/Bowling-v5 | 46.45 | - | - |
| ALE/Boxing-v5 | 91.17 | - | - |
| ALE/Breakout-v5 | 191 | 292 | 327 |
| ALE/Carnival-v5 | 3071 | 3013 | 3967 |
| ALE/Centipede-v5 | 3917 | - | 4915 |
| ALE/ChopperCommand-v5 | 5355 | - | - |
| ALE/CrazyClimber-v5 | 107183 | 107370 | - |
| ALE/Defender-v5 | 37162 | - | 51439 |
| ALE/DemonAttack-v5 | 7755 | - | 16558 |
| ALE/DoubleDunk-v5 | -2.38 | - | - |
| ALE/ElevatorAction-v5 | 5446 | 363 | 3933 |
| ALE/Enduro-v5 | 414 | 898 | 872 |
| ALE/FishingDerby-v5 | 22.80 | 27.10 | - |
| ALE/Freeway-v5 | 31.30 | - | - |
| ALE/Frostbite-v5 | 301 | 275 | 267 |
| ALE/Gopher-v5 | 4172 | - | 6508 |
| ALE/Gravitar-v5 | 599 | 253 | 145 |
| ALE/Hero-v5 | 21052 | 28238 | - |
| ALE/IceHockey-v5 | -3.93 | -5.58 | -7.36 |
| ALE/Jamesbond-v5 | 662 | - | - |
| ALE/JourneyEscape-v5 | -1582 | -1252 | -1547 |
| ALE/Kangaroo-v5 | 2623 | 9912 | - |
| ALE/Krull-v5 | 7841 | - | - |
| ALE/KungFuMaster-v5 | 18973 | 28334 | 29068 |
| ALE/MsPacman-v5 | 2308 | 2372 | 2297 |
| ALE/NameThisGame-v5 | 5993 | - | - |
| ALE/Phoenix-v5 | 7940 | - | 15659 |
| ALE/Pong-v5 | 15.01 | 16.91 | 12.85 |
| ALE/Pooyan-v5 | 4704 | - | 5716 |
| ALE/Qbert-v5 | 15094 | - | - |
| ALE/Riverraid-v5 | 7319 | 9428 | - |
| ALE/RoadRunner-v5 | 24204 | 37015 | - |
| ALE/Robotank-v5 | 20.07 | 8.24 | 2.59 |
| ALE/Seaquest-v5 | 1796 | - | - |
| ALE/Skiing-v5 | -19340 | -22980 | -29975 |
| ALE/Solaris-v5 | 2094 | - | - |
| ALE/SpaceInvaders-v5 | 726 | - | - |
| ALE/StarGunner-v5 | 31862 | - | 47495 |
| ALE/Surround-v5 | -2.52 | - | -6.79 |
| ALE/Tennis-v5 | -7.66 | -4.41 | - |
| ALE/TimePilot-v5 | 4668 | - | - |
| ALE/Tutankham-v5 | 203 | 217 | - |
| ALE/UpNDown-v5 | 182472 | - | - |
| ALE/VideoPinball-v5 | 31385 | - | 56746 |
| ALE/WizardOfWor-v5 | 5814 | 5466 | 4740 |
| ALE/YarsRevenge-v5 | 17120 | - | - |
| ALE/Zaxxon-v5 | 10756 | - | - |
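Picking the best variant per game from the comparison above is mechanical. The sketch below handles the "-" (not tested) marker and the "k" suffix used in the Atlantis row; the helper names are hypothetical, and the sample rows are copied from the comparison.

```python
def parse_score(text):
    """Convert a table cell to a float, or None when the cell is '-'."""
    text = text.strip()
    if text == "-":
        return None
    if text.endswith("k"):
        return float(text[:-1]) * 1000
    return float(text)


def best_variant(row):
    """Return the spec name with the highest score among tested variants."""
    scores = {name: parse_score(cell) for name, cell in row.items()}
    tested = {name: s for name, s in scores.items() if s is not None}
    return max(tested, key=tested.get)


rows = {
    "ALE/Breakout-v5": {"ppo_atari": "191", "ppo_atari_lam85": "292", "ppo_atari_lam70": "327"},
    "ALE/Atlantis-v5": {"ppo_atari": "792886", "ppo_atari_lam85": "754k", "ppo_atari_lam70": "710k"},
    "ALE/Skiing-v5": {"ppo_atari": "-19340", "ppo_atari_lam85": "-22980", "ppo_atari_lam70": "-29975"},
    "ALE/Centipede-v5": {"ppo_atari": "3917", "ppo_atari_lam85": "-", "ppo_atari_lam70": "4915"},
}
for game, row in rows.items():
    print(game, best_variant(row))
```

Ties resolve to the first variant listed, so a row such as BankHeist would report ppo_atari.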

Running Benchmarks

All games use the same spec file with variable substitution for the environment.
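The idea of one spec shared across games can be illustrated with plain template substitution. This is a generic sketch, not SLM Lab's spec loader: the ${env} placeholder syntax, the key names, and the render_spec helper are all assumptions made for illustration.

```python
import json
from string import Template

# Hypothetical spec fragment; only the environment name varies per game.
SPEC_TEMPLATE = '{"env": {"name": "${env}", "num_envs": 16, "max_frame": 10000000}}'


def render_spec(env_name):
    """Fill the environment placeholder and parse the resulting JSON."""
    return json.loads(Template(SPEC_TEMPLATE).substitute(env=env_name))


for env_name in ("ALE/Pong-v5", "ALE/Breakout-v5"):
    print(render_spec(env_name))
```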

Remote (recommended) - cloud GPU via dstack, auto-syncs to HuggingFace:

Remote setup: run cp .env.example .env, then set HF_TOKEN. See Remote Training for the dstack config.

Local - runs on your machine (requires GPU, ~2-3 hours per game):


Download and Replay


Results

| ENV | Score | SPEC_NAME | HF Repo |
| --- | --- | --- | --- |

Skipped (hard exploration): Adventure, MontezumaRevenge, Pitfall, PrivateEye, Venture

Training Curves

Multi-trial comparison plots showing A2C vs PPO mean returns (moving average) vs training frames. Shaded regions show standard deviation across 4 sessions.
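The shaded band in such plots can be derived as mean plus or minus one standard deviation at each checkpoint. The sketch below assumes the four session curves are aligned on the same checkpoints and uses the population standard deviation; whether the plots use the population or sample form is not stated on this page, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def curve_band(session_curves):
    """Return (lower, mean, upper) per checkpoint across aligned sessions."""
    band = []
    for values in zip(*session_curves):
        center = mean(values)
        spread = pstdev(values)
        band.append((center - spread, center, center + spread))
    return band


# Hypothetical moving-average returns for 4 sessions at 2 checkpoints.
curves = [
    [1.0, 2.0],
    [3.0, 2.0],
    [1.0, 2.0],
    [3.0, 2.0],
]
print(curve_band(curves))
```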

[Training curve plots for: AirRaid, Alien, Amidar, Assault, Asterix, Asteroids, Atlantis, BankHeist, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, Carnival, Centipede, ChopperCommand, CrazyClimber, Defender, DemonAttack, DoubleDunk, ElevatorAction, Enduro, FishingDerby, Freeway, Frostbite, Gopher, Gravitar, Hero, IceHockey, Jamesbond, JourneyEscape, Kangaroo, Krull, KungFuMaster, MsPacman, NameThisGame, Phoenix, Pong, Pooyan, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Skiing, Solaris, SpaceInvaders, StarGunner, Surround, Tennis, TimePilot, Tutankham, UpNDown, VideoPinball, WizardOfWor, YarsRevenge, Zaxxon]


Historical Results (v4)

OpenAI Gym Atari Results (v4)

| Env. \ Alg. | DQN | DDQN+PER | A2C (GAE) | A2C (n-step) | PPO |
| --- | --- | --- | --- | --- | --- |
| Adventure | -0.94 | -0.92 | -0.77 | -0.85 | -0.3 |
| AirRaid | 1876 | 3974 | 4202 | 3557 | 4028 |
| Alien | 822 | 1574 | 1519 | 1627 | 1413 |
| Amidar | 90.95 | 431 | 577 | 418 | 795 |
| Assault | 1392 | 2567 | 3366 | 3312 | 3619 |
| Asterix | 1253 | 6866 | 5559 | 5223 | 6132 |
| Asteroids | 439 | 426 | 2951 | 2147 | 2186 |
| Atlantis | 68679 | 644810 | 2747371 | 2259733 | 2148077 |
| BankHeist | 131 | 623 | 855 | 1170 | 1183 |
| BattleZone | 6564 | 6395 | 4336 | 4533 | 13649 |
| BeamRider | 2799 | 5870 | 2659 | 4139 | 4299 |
| Berzerk | 319 | 401 | 1073 | 763 | 860 |
| Bowling | 30.29 | 39.5 | 24.51 | 23.75 | 31.64 |
| Boxing | 72.11 | 90.98 | 1.57 | 1.26 | 96.53 |
| Breakout | 80.88 | 182 | 377 | 398 | 443 |
| Carnival | 4280 | 4773 | 2473 | 1827 | 4566 |
| Centipede | 1899 | 2153 | 3909 | 4202 | 5003 |
| ChopperCommand | 1083 | 4020 | 3043 | 1280 | 3357 |
| CrazyClimber | 46984 | 88814 | 106256 | 109998 | 116820 |
| Defender | 281999 | 313018 | 665609 | 657823 | 534639 |
| DemonAttack | 1705 | 19856 | 23779 | 19615 | 121172 |
| DoubleDunk | -21.44 | -22.38 | -5.15 | -13.3 | -6.01 |
| ElevatorAction | 32.62 | 17.91 | 9966 | 8818 | 6471 |
| Enduro | 437 | 959 | 787 | 0.0 | 1926 |
| FishingDerby | -88.14 | -1.7 | 16.54 | 1.65 | 36.03 |
| Freeway | 24.46 | 30.49 | 30.97 | 0.0 | 32.11 |
| Frostbite | 98.8 | 2497 | 277 | 261 | 1062 |
| Gopher | 1095 | 7562 | 929 | 1545 | 2933 |
| Gravitar | 87.34 | 258 | 313 | 433 | 223 |
| Hero | 1051 | 12579 | 16502 | 19322 | 17412 |
| IceHockey | -14.96 | -14.24 | -5.79 | -6.06 | -6.43 |
| Jamesbond | 44.87 | 702 | 521 | 453 | 561 |
| JourneyEscape | -4818 | -2003 | -921 | -2032 | -1094 |
| Kangaroo | 1965 | 8897 | 67.62 | 554 | 4989 |
| Krull | 5522 | 6650 | 7785 | 6642 | 8477 |
| KungFuMaster | 2288 | 16547 | 31199 | 25554 | 34523 |
| MontezumaRevenge | 0.0 | 0.02 | 0.08 | 0.19 | 1.08 |
| MsPacman | 1175 | 2215 | 1965 | 2158 | 2350 |
| NameThisGame | 3915 | 4474 | 5178 | 5795 | 6386 |
| Phoenix | 2909 | 8179 | 16345 | 13586 | 30504 |
| Pitfall | -68.83 | -73.65 | -101 | -31.13 | -35.93 |
| Pong | 18.48 | 20.5 | 19.31 | 19.56 | 20.58 |
| Pooyan | 1958 | 2741 | 2862 | 2531 | 6799 |
| PrivateEye | 784 | 303 | 93.22 | 78.07 | 50.12 |
| Qbert | 5494 | 11426 | 12405 | 13590 | 13460 |
| Riverraid | 953 | 10492 | 8308 | 7565 | 9636 |
| RoadRunner | 15237 | 29047 | 30152 | 31030 | 32956 |
| Robotank | 3.43 | 9.05 | 2.98 | 2.27 | 2.27 |
| Seaquest | 1185 | 4405 | 1070 | 1684 | 1715 |
| Skiing | -14094 | -12883 | -19481 | -14234 | -24713 |
| Solaris | 612 | 1396 | 2115 | 2236 | 1892 |
| SpaceInvaders | 451 | 670 | 733 | 750 | 797 |
| StarGunner | 3565 | 38238 | 44816 | 48410 | 60579 |
| Tennis | -23.78 | -10.33 | -22.42 | -19.06 | -11.52 |
| TimePilot | 2819 | 1884 | 3331 | 3440 | 4398 |
| Tutankham | 35.03 | 159 | 161 | 175 | 211 |
| UpNDown | 2043 | 11632 | 89769 | 18878 | 262208 |
| Venture | 4.56 | 9.61 | 0.0 | 0.0 | 11.84 |
| VideoPinball | 8056 | 79730 | 35371 | 40423 | 58096 |
| WizardOfWor | 869 | 328 | 1516 | 1247 | 4283 |
| YarsRevenge | 5816 | 15698 | 27097 | 11742 | 10114 |
| Zaxxon | 442 | 54.28 | 64.72 | 24.7 | 641 |

The table above presents results for 62 Atari games. All agents were trained for 10M frames (40M including skipped frames). Reported results are the episode score at the end of training, averaged over the previous 100 evaluation checkpoints with each checkpoint averaged over 4 Sessions.
