๐Ÿ“TorchArc

Declarative YAML Network Architectures

Code: slm_lab/agent/net/torcharc_net.py · Library: torcharc

TorchArc builds neural networks from declarative YAML specs via the torcharc library. Instead of hardcoded network classes like MLPNet or ConvNet, you define the exact PyTorch modules and dataflow graph in YAML. All benchmark_arc/ specs use TorchArc (v5.1+).

YAML Structure

A TorchArc spec has two parts: modules (what layers to build) and graph (how data flows through them).

net:
  type: TorchArcNet
  shared: false
  arc:
    modules:
      body:
        Sequential:
          - LazyLinear: {out_features: 256}
          - ReLU:
          - LazyLinear: {out_features: 256}
          - ReLU:
    graph:
      input: x
      modules:
        body: [x]
      output: body
  hid_layers_activation: relu
  clip_grad_val: 0.5
  optim_spec:
    name: Adam
    lr: 3.0e-4
  gpu: auto
  • modules: named groups of PyTorch modules. Each key (e.g. body) becomes a callable sub-network. Use any torch.nn module by name.

  • graph: defines the input name, which modules receive which inputs, and which module produces the final output.

  • LazyLinear / LazyConv2d: input dimensions are inferred automatically, so you only specify output sizes.

YAML Anchors for Compact Specs

Real benchmark specs use YAML anchors (&name / *name) to define a base config once and reuse it across environments. This is the key to keeping multi-environment specs DRY.

Define once, reuse everywhere
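As a sketch (the anchor name, module sizes, and environment names here are illustrative, not copied from a real benchmark spec), an anchor defines the base net once and aliases reuse it:

```yaml
# Define the net config once under an anchor (&mlp_net) ...
mlp_net: &mlp_net
  type: TorchArcNet
  shared: false
  arc:
    modules:
      body:
        Sequential:
          - LazyLinear: {out_features: 256}
          - ReLU:
    graph:
      input: x
      modules:
        body: [x]
      output: body

# ... then reuse it anywhere with an alias (*mlp_net)
cartpole:
  net: *mlp_net
pendulum:
  net: *mlp_net
```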

Merge and override

Use <<: *anchor to inherit all fields, then override only what differs. In the PPO Atari spec, three lambda variants are each defined by changing a single line.
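The shape of that spec, as an illustrative sketch (the variant names and values below are hypothetical, not the actual benchmark file). Note that the YAML merge key (<<) merges keys at the level of the mapping it appears in, so the anchor is placed on the mapping whose fields get overridden:

```yaml
# Shared algorithm block, anchored once
ppo_atari_base: &ppo_atari_base
  name: PPO
  gamma: 0.99
  lam: 0.95
  # ... remaining shared algorithm config

# Each variant inherits everything and overrides one field
ppo_atari_lam_090:
  algorithm:
    <<: *ppo_atari_base
    lam: 0.90   # the single changed line

ppo_atari_lam_070:
  algorithm:
    <<: *ppo_atari_base
    lam: 0.70   # the single changed line
```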

Why this matters: Three complete Atari specs differ by a single line each. Without anchors, each would repeat ~40 lines of identical config. When scanning the file, the differences jump out immediately.

The same pattern works for per-environment overrides in MuJoCo, where only the tuned hyperparameters are visible:
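A hedged sketch of that layout (environment names and learning rates are hypothetical):

```yaml
# Shared MuJoCo base net, anchored once
mujoco_base: &mujoco_base
  type: TorchArcNet
  clip_grad_val: 0.5
  optim_spec: {name: Adam, lr: 3.0e-4}

halfcheetah:
  net:
    <<: *mujoco_base
    optim_spec: {name: Adam, lr: 1.0e-4}  # tuned lr is the only override

hopper:
  net: *mujoco_base  # no overrides needed
```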

Before vs. after

Without anchors: repetitive, hard to diff

With anchors: only the overrides are visible
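A minimal before/after sketch (env names and values are illustrative; the two versions are separate YAML documents since they reuse the same keys):

```yaml
# Without anchors (abridged): every environment repeats the block
env_a:
  net: {type: TorchArcNet, clip_grad_val: 0.5, gpu: auto}
env_b:
  net: {type: TorchArcNet, clip_grad_val: 0.5, gpu: auto}
---
# With anchors: the block appears once; only differences are written
base_net: &base_net {type: TorchArcNet, clip_grad_val: 0.5, gpu: auto}
env_a:
  net: *base_net
env_b:
  net:
    <<: *base_net
    clip_grad_val: 1.0  # the only difference stands out
```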

Atari (Conv) Architecture

TorchArc handles convolutional networks the same way: list the modules explicitly.
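A sketch of that conv spec. The layer shapes follow the Nature CNN (32/64/64 conv filters, 512-unit FC head), but the graph input name and surrounding fields are illustrative, not the exact benchmark file:

```yaml
net:
  type: TorchArcNet
  arc:
    modules:
      body:
        Sequential:
          - LazyConv2d: {out_channels: 32, kernel_size: 8, stride: 4}
          - ReLU:
          - LazyConv2d: {out_channels: 64, kernel_size: 4, stride: 2}
          - ReLU:
          - LazyConv2d: {out_channels: 64, kernel_size: 3, stride: 1}
          - ReLU:
          - Flatten:
          - LazyLinear: {out_features: 512}
          - ReLU:
    graph:
      input: image
      modules:
        body: [image]
      output: body
```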

This is the Nature CNN architecture. With TorchArc, the conv layers, flatten, and FC head are all visible in one place.

Adding Normalization Layers

Insert any torch.nn module into the Sequential list. For example, adding BatchNorm:
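For instance, a sketch using torch.nn's LazyBatchNorm1d, which infers the feature count the same way LazyLinear does (assuming torcharc resolves it by name like the other modules):

```yaml
body:
  Sequential:
    - LazyLinear: {out_features: 256}
    - LazyBatchNorm1d:   # normalization slotted between linear and activation
    - ReLU:
    - LazyLinear: {out_features: 256}
    - LazyBatchNorm1d:
    - ReLU:
```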

Or LayerNorm:
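A sketch of the same insertion with LayerNorm. Unlike the Lazy modules, LayerNorm has no lazy variant, so normalized_shape must be given explicitly and match the preceding layer's out_features:

```yaml
body:
  Sequential:
    - LazyLinear: {out_features: 256}
    - LayerNorm: {normalized_shape: 256}  # must match out_features above
    - ReLU:
    - LazyLinear: {out_features: 256}
    - LayerNorm: {normalized_shape: 256}
    - ReLU:
```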

With TorchArc, adding normalization is a one-line insertion per layer; no code changes needed.

Old vs. New

| | MLPNet (old) | TorchArcNet (new) |
|---|---|---|
| Definition | hid_layers: [256, 256] | Explicit YAML modules + graph |
| Reuse | Copy-paste across specs | YAML anchors (& / *) |
| Flexibility | Fixed patterns (MLP, Conv, RNN) | Any torch.nn module |
| Transparency | Layers are implicit | Every layer is visible |
| Normalization | batch_norm: true flag | Insert BatchNorm1d, LayerNorm, etc. directly |
