Sessions of different random seeds to average the results. Wait for it to run to completion, which should take about 10-20 minutes. Meanwhile, check the metrics logged in the terminal. In particular, total_reward and its moving average total_reward_ma (with a window of 100 episodes) should climb gradually.
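To make the moving-average metric concrete, here is a minimal sketch of a trailing mean over the most recent episodes. It is an illustration only, not SLM Lab's own implementation: the function name `moving_average`, the sample rewards, and the choice to average over a partial window before 100 episodes have accumulated are all assumptions.

```python
from collections import deque


def moving_average(values, window=100):
    """Return the trailing mean of `values` over at most `window` items.

    Assumption: before `window` items are available, the mean is taken
    over however many items have been seen so far.
    """
    recent = deque(maxlen=window)
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            # The oldest value is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages


# Hypothetical per-episode total rewards, with a small window for readability.
total_rewards = [10.0, 20.0, 30.0, 40.0]
total_reward_ma = moving_average(total_rewards, window=2)
print(total_reward_ma)  # [10.0, 15.0, 25.0, 35.0]
```

Because each point averages many episodes, the smoothed curve is a steadier indicator of learning progress than the raw per-episode reward, which can fluctuate widely between episodes.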
data/reinforce_cartpole_2020_04_13_232521/. Among other things, SLM Lab also automatically saves the final and the best model files in the model folder. The model files can be used for easy playback in enjoy mode.