Resume a training run with train@{predir}, as shown in Lab Command, where {predir} is the data directory of a previous training run, e.g. data/reinforce_cartpole_2020_04_13_232521. The shorthand train@latest automatically picks up the latest training run of the specified spec and resumes it.

The train@{predir} mode resumes training in a past-future-consistent manner, as explained below. It relies on the following load methods:

- algorithm.load(): loads the algorithm and its model weights. This method already exists for enjoy mode and is now also used for train@{predir} mode.
- body.train_df: this object tracks the training metrics data, so it needs to be loaded.
- env.clock: this tracks the time within the session.

Because the session is timed by env.clock, the above are all we need to restore for resuming training. Once the network and training metrics are restored and the clock is set correctly, everything runs from the designated point in time.

To run enjoy mode, use enjoy@{session_spec_file} as
shown in Lab Command. The spec file for the session is saved automatically when the session runs. The lab automatically finds and loads the model weights, which are saved in the model folder, e.g. data/reinforce_cartpole_2020_04_13_232521/model/.

For example, if session_spec_file is data/reinforce_cartpole_2020_04_13_232521/reinforce_cartpole_t0_s0_spec.json, enjoy mode is run with enjoy@data/reinforce_cartpole_2020_04_13_232521/reinforce_cartpole_t0_s0_spec.json.

Enjoy mode loads the model files with _ckpt-best in their names. These files are saved from the best evaluation checkpoints (determined by total_reward_ma). In this example, the session should report total_reward_ma starting at 200 (the maximum score) immediately.
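
As a rough illustration of how a best checkpoint could be selected by total_reward_ma, the sketch below keeps a moving average of evaluation rewards and records each evaluation that sets a new best. It is not the lab's implementation: the function name track_best_checkpoints, the window of 3, the averaging over a partially filled window, and the sample rewards are all assumptions made for this example.

```python
from collections import deque


def track_best_checkpoints(total_rewards, window=3):
    """Return (index, moving_average) for each evaluation that sets a new best.

    `window` is a hypothetical moving-average length, not the lab's setting.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` rewards
    best_ma = float("-inf")
    best_checkpoints = []
    for idx, reward in enumerate(total_rewards):
        recent.append(reward)
        # Assumption: average over however many rewards are available so far.
        total_reward_ma = sum(recent) / len(recent)
        if total_reward_ma > best_ma:
            best_ma = total_reward_ma
            # A real run would save the "_ckpt-best" model files at this point.
            best_checkpoints.append((idx, total_reward_ma))
    return best_checkpoints


# Made-up evaluation rewards for a CartPole-like task with a maximum of 200.
rewards = [20.0, 50.0, 200.0, 200.0, 200.0, 120.0]
best = track_best_checkpoints(rewards, window=3)
print(best[-1])  # (4, 200.0)
```

Under these assumptions, the last best checkpoint is the one whose moving average has reached the maximum score, which is consistent with an enjoy session that starts at a total_reward_ma of 200.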