RNN

Recurrent Neural Network

Code: slm_lab/agent/net/recurrent.py

These networks take a sequence of states as input and produce one or more outputs. They consist of zero or more state-processing layers (organized as an MLP), followed by a recurrent layer. All of the states are passed through the MLP (if there is one), and the transformed states are then passed in sequence to the recurrent layer. RNNs are structured to retain information about a sequence of inputs. This makes them well suited to environments in which deciding how to act in a state benefits from knowing which states came before it.
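
As a rough sketch of this data flow (not SLM Lab's actual RecurrentNet; the class name and dimensions below are made up for illustration), the architecture can be expressed in PyTorch as an MLP applied per time step followed by a GRU whose final hidden state feeds the output layer:

# Illustrative only: a per-step MLP followed by a GRU, mirroring the description above.
import torch
import torch.nn as nn

class TinyRecurrentNet(nn.Module):
    def __init__(self, state_dim, fc_hid_layers, rnn_hidden_size, out_dim):
        super().__init__()
        # state-processing MLP, applied to every state in the input sequence
        layers, in_dim = [], state_dim
        for hid in fc_hid_layers:
            layers += [nn.Linear(in_dim, hid), nn.ReLU()]
            in_dim = hid
        self.fc = nn.Sequential(*layers)
        # recurrent layer consumes the transformed states in sequence
        self.rnn = nn.GRU(in_dim, rnn_hidden_size, batch_first=True)
        self.out = nn.Linear(rnn_hidden_size, out_dim)

    def forward(self, states):          # states: (batch, seq_len, state_dim)
        x = self.fc(states)             # transform each state with the MLP
        _, h_n = self.rnn(x)            # h_n: (num_layers, batch, rnn_hidden_size)
        return self.out(h_n[-1])        # output computed from the last hidden state

net = TinyRecurrentNet(state_dim=8, fc_hid_layers=[256, 128], rnn_hidden_size=64, out_dim=4)
print(net(torch.randn(2, 4, 8)).shape)  # batch of 2, sequence of 4 states -> torch.Size([2, 4])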

Source Documentation

Refer to the class documentation and example net spec from the source: slm_lab/agent/net/recurrent.py#L10-L71

Example Net Spec

This specification instantiates a RecurrentNet with two components. First, a state-processing MLP with 2 hidden layers of 256 and 128 nodes respectively and rectified linear (ReLU) activations. This is followed by one GRU recurrent layer with a hidden state of 64 units. The optimizer is Adam with a learning rate of 0.01. The number of sequential states used as input to the network is 4. The rest of the spec is annotated below.

{
    ...
    "agent": [{
      "net": {
        "type": "RecurrentNet",
        "shared": false,  // whether to shared networks for Actor-Critic
        "cell_type": "GRU",
        "fc_hid_layers": [256, 128],
        "hid_layers_activation": "relu",
        "out_layer_activation": null,
        "rnn_hidden_size": 64,
        "rnn_num_layers": 1,
        "bidirectional": false,  // whether to use bidirectional layer
        "seq_len": 4,
        "init_fn": "xavier_uniform_",  // weight initialization
        "clip_grad_val": 1.0,  // clip gradient by norm
        "loss_spec": {  // default loss function used for regression
          "name": "MSELoss"
        },
        "optim_spec": {  // the optimizer and its arguments
          "name": "Adam",
          "lr": 0.01
        },
        ...
      }
    }],
    ...
}
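
To make the mapping concrete, here is a minimal, hypothetical sketch (not SLM Lab's actual spec loader) of how fields such as cell_type, loss_spec, optim_spec, and clip_grad_val could be resolved to PyTorch objects by name; the input size of 128 is an arbitrary example value:

import torch
import torch.nn as nn
import torch.optim as optim

net_spec = {
    "cell_type": "GRU",
    "rnn_hidden_size": 64,
    "rnn_num_layers": 1,
    "bidirectional": False,
    "seq_len": 4,
    "clip_grad_val": 1.0,
    "loss_spec": {"name": "MSELoss"},
    "optim_spec": {"name": "Adam", "lr": 0.01},
}

# cell_type names the recurrent module class, e.g. nn.GRU or nn.LSTM
rnn_cls = getattr(nn, net_spec["cell_type"])
rnn = rnn_cls(input_size=128, hidden_size=net_spec["rnn_hidden_size"],
              num_layers=net_spec["rnn_num_layers"],
              bidirectional=net_spec["bidirectional"], batch_first=True)

loss_fn = getattr(nn, net_spec["loss_spec"]["name"])()  # MSELoss
optim_kwargs = {k: v for k, v in net_spec["optim_spec"].items() if k != "name"}
optimizer = getattr(optim, net_spec["optim_spec"]["name"])(rnn.parameters(), **optim_kwargs)

# one illustrative update over a dummy batch of seq_len states
x = torch.randn(2, net_spec["seq_len"], 128)            # (batch, seq_len, features)
out, h_n = rnn(x)
loss = loss_fn(h_n[-1], torch.zeros_like(h_n[-1]))
loss.backward()
nn.utils.clip_grad_norm_(rnn.parameters(), net_spec["clip_grad_val"])  # clip gradient by norm
optimizer.step()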

For more concrete examples of net specs specific to algorithms, refer to the existing spec files.
