Meta Spec: High Level Specifications
In this tutorial we look at how to adjust the checkpointing frequency.
The meta spec specifies higher-level configuration that doesn't fit within a specific trial or session, such as how experiments and trials should behave. It is specified using the "meta" key in a spec file with the following format:
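Since spec files are JSON, the overall shape of a meta block can be sketched as a Python dict and serialized. This is a minimal sketch, not the definitive schema: the keys "distributed", "eval_frequency", "log_frequency", "rigorous_eval", "num_cpus" and "num_gpus" are named on this page, while the spec name "example_spec", the keys "max_session" and "max_trial", the "resources" nesting, and all values are assumptions made for illustration.

```python
import json

# Hypothetical spec: only the "meta" block is sketched here.
spec = {
    "example_spec": {  # hypothetical spec name
        "meta": {
            "distributed": False,     # Hogwild is enabled through this key; False keeps it off
            "log_frequency": 1000,    # training checkpointing (illustrative value)
            "eval_frequency": 10000,  # evaluation checkpointing (illustrative value)
            "rigorous_eval": 0,       # assumed: 0 means no separate evaluation environments
            "max_session": 4,         # assumed key name for the max sessions
            "max_trial": 1,           # assumed key name for the max trials
            "resources": {            # assumed nesting for the resource settings
                "num_cpus": 4,
                "num_gpus": 0,
            },
        }
    }
}

# Serialize to the JSON text that would go in a spec file.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Check the key names and nesting against an existing spec file in the repository before relying on them.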
We have already encountered some of the meta spec hyperparameters. Hogwild can be enabled using "distributed". We can also adjust the checkpoint mode and frequency. The maximum number of sessions and trials (when running an experiment) is specified here as well. Finally, although an experiment automatically assigns resources to each trial, advanced users can also configure the resource allocation for Ray Tune through "num_cpus" and "num_gpus".
Atari games usually have multiple lives per episode. During training, we split each episode up and treat each life as its own episode to encourage the agent to appreciate all its lives. This split also implies that the total rewards tracked during training need to be summed over the lives to yield the true episodic rewards for evaluation.
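The summation over lives can be illustrated with a small sketch. It is not SLM Lab's implementation: the function name, the per-life reward list, and the game-over flags are hypothetical inputs chosen to show the arithmetic.

```python
def true_episodic_rewards(life_rewards, game_over_flags):
    """Sum per-life rewards into true episodic rewards.

    life_rewards: total reward collected in each life (each training "episode").
    game_over_flags: True where that life was the last one of a real episode.
    """
    episodes = []
    running_total = 0.0
    for reward, game_over in zip(life_rewards, game_over_flags):
        running_total += reward
        if game_over:
            # All lives of this real episode are used up, so record the sum.
            episodes.append(running_total)
            running_total = 0.0
    return episodes


# Two real episodes with three lives each (made-up numbers).
life_rewards = [10.0, 5.0, 0.0, 20.0, 1.0, 4.0]
game_over_flags = [False, False, True, False, False, True]
print(true_episodic_rewards(life_rewards, game_over_flags))
```

Rewards from lives of an episode that has not finished yet are left out of the result, which is why evaluation needs several lives before it can report a true episodic reward.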
One method for evaluation is to spawn a new set of vector environments, say using "rigorous_eval": 8, and run them in evaluation mode so that they do not split up an episode. However, this can be quite slow and expensive to run.
Note that "eval_frequency" (evaluation checkpointing) and "log_frequency" (training checkpointing) are independent. Since evaluation needs multiple lives to tally the episodic rewards, "eval_frequency" is usually set higher than "log_frequency".
Thankfully, SLM Lab has a preprocessor which tracks the true episodic rewards for Atari environments at . This means we can use the true episodic rewards from the training environment and run a trial much faster. To do so, disable the rigorous eval, and set the evaluation frequency to a desired value, as shown below.
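A minimal sketch of that change, written as a Python helper that edits a meta dict. The helper name is hypothetical, the frequency values are illustrative, and treating "rigorous_eval": 0 as "disabled" is an assumption to check against an existing Atari spec file.

```python
import copy


def use_training_rewards_for_eval(meta, eval_frequency):
    """Return a copy of a meta spec dict with rigorous eval turned off."""
    new_meta = copy.deepcopy(meta)  # leave the caller's dict untouched
    new_meta["rigorous_eval"] = 0   # assumption: 0 disables the separate eval environments
    new_meta["eval_frequency"] = eval_frequency
    return new_meta


# Illustrative starting point that still uses rigorous evaluation.
meta = {"rigorous_eval": 8, "eval_frequency": 50000, "log_frequency": 1000}
fast_meta = use_training_rewards_for_eval(meta, eval_frequency=10000)
print(fast_meta)
```

The "log_frequency" value is carried over unchanged, consistent with the two frequencies being independent.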