# Net

## Overview

Net classes implement neural network architectures used as function approximators in RL algorithms. SLM Lab provides flexible, swappable networks that work with any algorithm.

**Code:** [slm\_lab/agent/net](https://github.com/kengz/SLM-Lab/tree/master/slm_lab/agent/net)

## Network Types

| Type                                                                | Input            | Use Case                       | Example Environments              |
| ------------------------------------------------------------------- | ---------------- | ------------------------------ | --------------------------------- |
| [**MLPNet**](/slm-lab/development/neural-networks/mlp.md)           | Vectors          | Low-dimensional states         | CartPole, LunarLander, MuJoCo     |
| [**ConvNet**](/slm-lab/development/neural-networks/cnn.md)          | Images           | Pixel observations             | Atari games                       |
| [**RecurrentNet**](/slm-lab/development/neural-networks/rnn.md)     | Sequences        | Partial observability          | POMDPs                            |
| [**TorchArcNet**](/slm-lab/development/neural-networks/torcharc.md) | Any              | Declarative YAML architectures | All (v5.1+)                       |
| **HydraMLPNet**                                                     | Multiple vectors | Multi-head architectures       | Multi-task learning               |
| **DuelingMLPNet**                                                   | Vectors          | Q-learning                     | LunarLander (value decomposition) |
| **DuelingConvNet**                                                  | Images           | Q-learning                     | Atari (value decomposition)       |

## TorchArc Architecture (v5.1+)

[**TorchArcNet**](/slm-lab/development/neural-networks/torcharc.md) builds networks from declarative YAML specs via [torcharc](https://github.com/kengz/torcharc). Instead of the implicit `hid_layers: [256, 256]` shorthand, you define the exact PyTorch modules and how data flows between them:

```yaml
net:
  type: TorchArcNet
  arc:
    modules:
      body:
        Sequential:
          - LazyLinear: {out_features: 256}
          - ReLU:
          - LazyLinear: {out_features: 256}
          - ReLU:
    graph:
      input: x
      modules:
        body: [x]
      output: body
```
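To make the mapping concrete, the sketch below mirrors the `body` entry above as the nested Python structure a YAML parser would produce (a key with no value, such as `ReLU:`, loads as `None`) and lists the layers in order. The `describe_layers` helper is hypothetical and only for illustration; it is not part of SLM Lab or torcharc.

```python
# The `body` module from the YAML above, as a YAML parser would load it.
body_spec = {
    "Sequential": [
        {"LazyLinear": {"out_features": 256}},
        {"ReLU": None},  # `- ReLU:` has no arguments, so its value is None
        {"LazyLinear": {"out_features": 256}},
        {"ReLU": None},
    ]
}


def describe_layers(spec):
    """Hypothetical helper: render each Sequential entry as a constructor-like string."""
    lines = []
    for entry in spec["Sequential"]:
        (name, kwargs), = entry.items()  # each entry has exactly one key
        args = ", ".join(f"{k}={v}" for k, v in (kwargs or {}).items())
        lines.append(f"{name}({args})")
    return lines


layers = describe_layers(body_spec)
print(layers)
```

Each string names a `torch.nn` module of the same name. `LazyLinear` infers its input size on the first forward pass, which is why only `out_features` appears in the spec.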

YAML anchors (`&` / `*`) eliminate copy-paste across environments, so each spec shows only its overrides. All `benchmark_arc/` specs use TorchArc. See the [TorchArc page](/slm-lab/development/neural-networks/torcharc.md) for the full guide.

## Quick Selection Guide

```
Is your observation an image?
├── Yes → ConvNet
└── No → Is there partial observability?
    ├── Yes → RecurrentNet
    └── No → MLPNet
```

For Q-learning algorithms (DQN family), consider the Dueling variants (`DuelingMLPNet`, `DuelingConvNet`), which decompose the Q-value into a state value and per-action advantages.
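As a rough illustration of what a dueling head computes, the standard dueling aggregation combines a scalar state value with per-action advantages and subtracts the mean advantage. The sketch below uses plain Python lists; `dueling_q` is a hypothetical helper name, and it assumes SLM Lab's Dueling nets follow this standard mean-subtracted form.

```python
def dueling_q(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using mean-subtracted advantages."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


q_values = dueling_q(state_value=1.0, advantages=[0.5, -0.5, 0.0])
print(q_values)  # [1.5, 0.5, 1.0]
```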

## Network Spec

Configure networks in the agent spec. TorchArcNet (recommended) uses YAML architecture definitions:

```yaml
agent:
  net:
    type: TorchArcNet
    arc:
      modules:
        body:
          Sequential:
            - LazyLinear: {out_features: 256}
            - ReLU:
            - LazyLinear: {out_features: 256}
            - ReLU:
      graph:
        input: x
        modules:
          body: [x]
        output: body
    hid_layers_activation: relu
    clip_grad_val: 0.5
    optim_spec:
      name: Adam
      lr: 3.0e-4
    gpu: auto
```

<details>

<summary>Legacy MLPNet spec (JSON format)</summary>

```javascript
{
  "agent": {
    "net": {
      "type": "MLPNet",
      "hid_layers": [256, 256],
      "hid_layers_activation": "relu",
      "optim_spec": {
        "name": "Adam",
        "lr": 3e-4
      },
      "clip_grad_val": 0.5,
      "gpu": "auto"
    }
  }
}
```

</details>

## Common Parameters

### Architecture

| Parameter               | Description           | Typical Values                                             |
| ----------------------- | --------------------- | ---------------------------------------------------------- |
| `type`                  | Network class         | `"TorchArcNet"`, `"MLPNet"`, `"ConvNet"`, `"RecurrentNet"` |
| `hid_layers`            | Hidden layer sizes    | `[64, 64]` (simple), `[256, 256]` (complex)                |
| `hid_layers_activation` | Activation function   | `"relu"`, `"tanh"`, `"leaky_relu"`                         |
| `out_layer_activation`  | Output activation     | `null` (none), `"tanh"`                                    |
| `init_fn`               | Weight initialization | `"orthogonal_"`, `"xavier_uniform_"`                       |

### Actor-Critic Networks

| Parameter           | Description                        | Typical Values                   |
| ------------------- | ---------------------------------- | -------------------------------- |
| `shared`            | Share weights between actor/critic | `true` (Atari), `false` (MuJoCo) |
| `use_same_optim`    | Use same optimizer for both        | `true`, `false`                  |
| `actor_optim_spec`  | Actor optimizer                    | `{"name": "Adam", "lr": 3e-4}`   |
| `critic_optim_spec` | Critic optimizer                   | `{"name": "Adam", "lr": 3e-4}`   |

### Training

| Parameter           | Description            | Typical Values                                    |
| ------------------- | ---------------------- | ------------------------------------------------- |
| `clip_grad_val`     | Gradient clipping norm | 0.5-10.0                                          |
| `loss_spec`         | Loss function          | `{"name": "MSELoss"}`, `{"name": "SmoothL1Loss"}` |
| `lr_scheduler_spec` | Learning rate schedule | See below                                         |
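
Going by the table's description of `clip_grad_val` as a norm bound, clipping rescales all gradients together when their combined L2 norm exceeds the limit. A minimal pure-Python sketch of that idea follows; `clip_by_global_norm` is a hypothetical name, and real training code would use the framework's own clipping utility instead.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale gradients so their combined L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)  # already within the bound, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads]


clipped = clip_by_global_norm([3.0, 4.0], max_norm=0.5)  # original norm is 5.0
print(clipped, math.sqrt(sum(g * g for g in clipped)))
```

The direction of the gradient is preserved; only its length is capped.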

### ConvNet Parameters

| Parameter    | Description                                 | Default                                    |
| ------------ | ------------------------------------------- | ------------------------------------------ |
| `normalize`  | Normalize pixel input by dividing by 255    | `false`                                    |
| `batch_norm` | Apply batch normalization after conv layers | `true` (ConvNet), `false` (DuelingConvNet) |

### Device

| Parameter | Description | Values                                                |
| --------- | ----------- | ----------------------------------------------------- |
| `gpu`     | GPU usage   | `"auto"` (detect), `true` (force), `false` (CPU only) |

## Learning Rate Schedules

Decay the learning rate during training:

```javascript
{
  "lr_scheduler_spec": {
    "name": "LinearToZero",  // Linear decay to 0
    "frame": 1000000         // Total frames for decay
  }
}
```

Available schedules:

* `"LinearToZero"` - Linear decay from initial LR to 0
* `"StepLR"` - Step decay at fixed intervals
* `"ExponentialLR"` - Exponential decay
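
The `LinearToZero` description above can be sketched as a simple function of the current frame. This illustrates the shape of the schedule only, assuming the learning rate is interpolated linearly and held at 0 once `frame` is reached; `linear_to_zero` is a hypothetical name, not SLM Lab's implementation.

```python
def linear_to_zero(initial_lr, current_frame, total_frames):
    """Linearly decay initial_lr to 0 over total_frames, then stay at 0."""
    remaining = max(0.0, 1.0 - current_frame / total_frames)
    return initial_lr * remaining


for frame in (0, 500_000, 1_000_000):
    print(frame, linear_to_zero(3e-4, frame, total_frames=1_000_000))
```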

## Example Specs

### MLP for CartPole/MuJoCo

```javascript
{
  "net": {
    "type": "MLPNet",
    "shared": false,
    "hid_layers": [64, 64],
    "hid_layers_activation": "tanh",
    "init_fn": "orthogonal_",
    "clip_grad_val": 0.5,
    "loss_spec": {"name": "MSELoss"},
    "actor_optim_spec": {"name": "Adam", "lr": 3e-4},
    "critic_optim_spec": {"name": "Adam", "lr": 3e-4},
    "gpu": "auto"
  }
}
```

### ConvNet for Atari (Nature CNN)

```javascript
{
  "net": {
    "type": "ConvNet",
    "shared": true,
    "conv_hid_layers": [
      [32, 8, 4, 0, 1],  // [out_channels, kernel, stride, padding, dilation]
      [64, 4, 2, 0, 1],
      [64, 3, 1, 0, 1]
    ],
    "fc_hid_layers": [512],
    "hid_layers_activation": "relu",
    "init_fn": "orthogonal_",
    "normalize": true,
    "clip_grad_val": 0.5,
    "use_same_optim": true,
    "optim_spec": {"name": "AdamW", "lr": 2.5e-4},
    "lr_scheduler_spec": {"name": "LinearToZero", "frame": 10e6},
    "gpu": "auto"
  }
}
```
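
To check how large the flattened input to `fc_hid_layers` is, it helps to trace the feature-map size through the three conv layers. The sketch below applies the standard convolution output-size formula to each `[out_channels, kernel, stride, padding, dilation]` entry, assuming an 84x84 input frame (a common Atari preprocessing choice that is not stated in the spec above). `conv_out_size` is a hypothetical helper.

```python
def conv_out_size(size, kernel, stride, padding, dilation):
    """Output length along one spatial dimension for a 2D convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


conv_hid_layers = [
    [32, 8, 4, 0, 1],
    [64, 4, 2, 0, 1],
    [64, 3, 1, 0, 1],
]

size = 84  # assumed square input frame (84x84)
sizes = []
for out_channels, kernel, stride, padding, dilation in conv_hid_layers:
    size = conv_out_size(size, kernel, stride, padding, dilation)
    sizes.append(size)

# Flattened feature count: channels of the last layer times its spatial area.
flat_features = conv_hid_layers[-1][0] * size * size
print(sizes, flat_features)  # [20, 9, 7] 3136
```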

### RecurrentNet for POMDPs

```javascript
{
  "net": {
    "type": "RecurrentNet",
    "cell_type": "GRU",
    "fc_hid_layers": [128],
    "rnn_hidden_size": 64,
    "rnn_num_layers": 1,
    "seq_len": 8,
    "hid_layers_activation": "relu",
    "optim_spec": {"name": "Adam", "lr": 1e-3},
    "gpu": "auto"
  }
}
```

### DQN Target Network

For DQN algorithms, a separate target network is created automatically; `update_type` and its related fields control how it is synchronized with the online network:

```javascript
{
  "net": {
    "type": "MLPNet",
    "hid_layers": [256, 128],
    "update_type": "replace",     // How to update target
    "update_frequency": 100,      // Steps between updates
    // Or use soft updates:
    // "update_type": "polyak",
    // "polyak_coef": 0.995
  }
}
```
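
The two update styles differ in how much of the online network's weights reach the target at each update. The sketch below shows both on plain lists of floats. It assumes `polyak_coef` is the fraction of the old target weights that is kept (a common convention; check the SLM Lab source for the exact form), and the function names are hypothetical.

```python
def replace_update(online):
    """Hard update: the target becomes a copy of the online weights."""
    return list(online)


def polyak_update(online, target, polyak_coef):
    """Soft update: keep polyak_coef of the target, blend in the rest from online."""
    return [polyak_coef * t + (1.0 - polyak_coef) * o for o, t in zip(online, target)]


online_weights = [1.0, 2.0]
target_weights = [0.0, 0.0]

hard = replace_update(online_weights)
soft = polyak_update(online_weights, target_weights, polyak_coef=0.995)
print(hard, soft)
```

Under this assumption, a coefficient close to 1 moves the target only slightly per update, which trades slower tracking for more stable bootstrapped targets.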

## Network Architecture Tips

### CartPole / Simple Control

```javascript
{"hid_layers": [64, 64], "hid_layers_activation": "tanh"}
```

* Small networks work well
* `tanh` is a common hidden-layer activation here; it keeps activations bounded in (-1, 1)
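
To give a sense of how small this network is, the sketch below counts the weights and biases of a fully connected network with `hid_layers: [64, 64]`. It assumes CartPole's 4-dimensional observation and 2 discrete actions with a single output head; `mlp_param_count` is a hypothetical helper.

```python
def mlp_param_count(in_dim, hid_layers, out_dim):
    """Count weights and biases of a fully connected network."""
    dims = [in_dim, *hid_layers, out_dim]
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))


cartpole_params = mlp_param_count(in_dim=4, hid_layers=[64, 64], out_dim=2)
print(cartpole_params)  # 4610
```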

### MuJoCo / Continuous Control

```javascript
{"hid_layers": [256, 256], "hid_layers_activation": "tanh", "init_fn": "orthogonal_"}
```

* Larger networks for complex dynamics
* Orthogonal initialization helps keep gradient magnitudes stable early in training

### Atari / Image-Based

```javascript
{
  "conv_hid_layers": [[32, 8, 4, 0, 1], [64, 4, 2, 0, 1], [64, 3, 1, 0, 1]],
  "fc_hid_layers": [512]
}
```

* Nature CNN architecture is standard
* `shared: true` for actor-critic

### Box2D (LunarLander, BipedalWalker)

```javascript
{"hid_layers": [256, 128], "hid_layers_activation": "relu"}
```

* Medium-sized networks
* `relu` works well

## GPU Usage

SLM Lab handles GPU placement automatically:

```javascript
{"gpu": "auto"}  // Use GPU if available, else CPU
{"gpu": true}    // Force GPU (fails if unavailable)
{"gpu": false}   // Force CPU
{"gpu": 0}       // Specific GPU device
```
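
The four settings can be read as a small decision rule. The sketch below is a hypothetical illustration of that rule in plain Python, returning a PyTorch-style device string. It is not SLM Lab's actual device-selection code, and `cuda_available` is passed in explicitly so the example runs without a GPU library.

```python
def resolve_device(gpu, cuda_available):
    """Map a `gpu` spec value to a device string such as 'cpu' or 'cuda:0'."""
    if gpu == "auto":
        return "cuda:0" if cuda_available else "cpu"
    if gpu is False:
        return "cpu"
    if not cuda_available:
        raise RuntimeError("GPU requested but none is available")
    if gpu is True:
        return "cuda:0"
    return f"cuda:{int(gpu)}"  # an integer selects a specific device


print(resolve_device("auto", cuda_available=False))  # cpu
print(resolve_device(0, cuda_available=True))        # cuda:0
```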

For multi-GPU setups, see [GPU Usage: PPO on Pong](/slm-lab/using-slm-lab/gpu-usage-ppo-on-pong.md).

