update and sample methods. The underlying data format is numpy, which can be put into PyTorch tensors efficiently via torch.from_numpy, because the resulting tensor shares memory with the array instead of copying it.

There are two types of memory in RL:

Each experience in memory is stored as a tuple <s, a, s', r>: the state, action taken, next state, and reward received.

The memory signals when to train through memory.to_train. That is, the memory class sets self.to_train = True when it is time to train.
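
The pieces described above (update, sample, numpy storage, and the to_train flag) can be sketched together as follows. This is only an illustrative sketch, not the library's actual memory class: the name SimpleMemory, the helper to_tensors, and the parameters max_size, state_dim, and training_frequency are all assumptions made for this example, as is the rule of setting to_train every training_frequency experiences.

```python
import numpy as np


class SimpleMemory:
    """Hypothetical minimal memory; names and trigger rule are assumptions."""

    def __init__(self, max_size, state_dim, training_frequency):
        self.max_size = max_size
        self.training_frequency = training_frequency
        # Preallocated numpy storage for each field of <s, a, s', r>
        self.states = np.zeros((max_size, state_dim), dtype=np.float32)
        self.actions = np.zeros(max_size, dtype=np.int64)
        self.next_states = np.zeros((max_size, state_dim), dtype=np.float32)
        self.rewards = np.zeros(max_size, dtype=np.float32)
        self.head = -1   # index of the most recently written experience
        self.size = 0    # number of experiences currently stored
        self.seen = 0    # total number of experiences ever added
        self.to_train = False

    def update(self, state, action, next_state, reward):
        """Store one experience, overwriting the oldest one when full."""
        self.head = (self.head + 1) % self.max_size
        self.states[self.head] = state
        self.actions[self.head] = action
        self.next_states[self.head] = next_state
        self.rewards[self.head] = reward
        self.size = min(self.size + 1, self.max_size)
        self.seen += 1
        # Assumed rule: signal training every `training_frequency` experiences
        if self.seen % self.training_frequency == 0:
            self.to_train = True

    def sample(self, batch_size, rng=None):
        """Return a uniformly sampled batch as a dict of numpy arrays."""
        if rng is None:
            rng = np.random.default_rng()
        idxs = rng.integers(0, self.size, size=batch_size)
        return {
            "states": self.states[idxs],
            "actions": self.actions[idxs],
            "next_states": self.next_states[idxs],
            "rewards": self.rewards[idxs],
        }


def to_tensors(batch):
    """Wrap each numpy array as a PyTorch tensor that shares its memory."""
    import torch  # imported lazily so the memory itself only needs numpy
    return {key: torch.from_numpy(value) for key, value in batch.items()}


# Usage: add experiences, then train only when the memory says so.
memory = SimpleMemory(max_size=100, state_dim=4, training_frequency=4)
for t in range(4):
    memory.update(np.ones(4) * t, t % 2, np.ones(4) * (t + 1), 1.0)

if memory.to_train:
    batch = memory.sample(batch_size=2)
    # tensors = to_tensors(batch)  # uncomment when PyTorch is installed
    memory.to_train = False  # reset the flag once training has run
```

In this sketch the training loop resets to_train itself after consuming a batch; whether the reset belongs to the memory or to the caller is a design choice that the text above does not settle.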