Foundations of Deep Reinforcement Learning

Errata

Appendix B Example Environments

Page 347, B.1.1 CartPole-v0

There is an error in the termination condition. The pole is considered to have fallen over when it is more than 15 degrees from vertical, not 12 degrees.

Page 347, B.1.3 LunarLander-v2

There is an error in the objective definition. "Land the lander without crashing fuel" should read "Land the lander without crashing and conserve fuel."

Page 348, B.1.3 LunarLander-v2

The reward definition is incomplete. The reward for the LunarLander-v2 environment is defined as follows: -100 for crashing, +100 for landing, -0.3 per time step when firing the main engine, +100 to +140 for landing between the flags (landing pad) with zero speed, and +10 for each leg ground contact.
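As a rough illustration of how the components listed above add up over an episode, the following sketch tallies them for a hand-written episode summary. It is not the environment's actual reward code: the `EpisodeSummary` class, its field names, and the `approximate_return` function are hypothetical, and the pad-landing bonus is passed in as a value the caller chooses in the +100 to +140 range.

```python
from dataclasses import dataclass

# Constants copied from the corrected reward description above.
CRASH_REWARD = -100.0
LANDING_REWARD = 100.0
MAIN_ENGINE_REWARD_PER_STEP = -0.3
LEG_CONTACT_REWARD = 10.0


@dataclass
class EpisodeSummary:
    """Hypothetical per-episode tally; not part of the Gym API."""
    crashed: bool
    landed: bool
    main_engine_steps: int           # time steps with the main engine firing
    leg_contacts: int                # number of leg ground contacts
    pad_landing_bonus: float = 0.0   # assumed +100 to +140 for a zero-speed pad landing


def approximate_return(ep: EpisodeSummary) -> float:
    """Sum the reward components listed in the erratum for one episode."""
    total = 0.0
    if ep.crashed:
        total += CRASH_REWARD
    if ep.landed:
        total += LANDING_REWARD
    total += MAIN_ENGINE_REWARD_PER_STEP * ep.main_engine_steps
    total += LEG_CONTACT_REWARD * ep.leg_contacts
    total += ep.pad_landing_bonus
    return total
```

For example, an episode that lands on the pad with a bonus of 120, fires the main engine for 100 steps, and touches down on both legs would total roughly 100 - 30 + 20 + 120 under this sketch.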


Last updated 4 years ago
