📊 Graphs and Data

SLM Lab automatically generates graphs showing how your agent learns over time. These help you:

  • Track progress: See if rewards are increasing

  • Compare runs: Check if different random seeds give consistent results

  • Tune hyperparameters: Compare different settings to find what works best

Graph Types

SLM Lab produces graphs at each level of the hierarchy:

| Level | What It Shows | Use Case |
| --- | --- | --- |
| Session | Single training run | Debug individual runs |
| Trial | Average of 4 sessions ± std | Report reliable results |
| Experiment | Multiple trials overlaid | Compare hyperparameters |

Visual Comparison

[Figure: side-by-side graph grid. Left column: raw graphs; right column: moving-average graphs.]

Row 1 - Session graphs: Single training run. Noisy but shows real-time progress.

Row 2 - Trial graphs: Average across 4 sessions with error bands (±1 std). This is the standard format for reporting RL results: it shows both performance and consistency.

Row 3 - Experiment graphs: Multiple trials overlaid for comparison. Each line is a different hyperparameter configuration.

Note: The moving average (MA) graphs smooth the data using a 100-checkpoint window. They are easier to read and are typically used for publications.
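As an illustration, a trailing moving average like the one used for the MA graphs can be sketched in a few lines. The function name and window handling are illustrative; SLM Lab's own implementation may differ:

```python
from collections import deque

def moving_average(values, window=100):
    """Trailing moving average over the last `window` checkpoints.
    Early points average over however many values exist so far."""
    out, buf, total = [], deque(), 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()  # drop the value that left the window
        out.append(total / len(buf))
    return out
```

Applied to a noisy return curve, this produces the smoother MA curve shown in the right-hand column of the graphs.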

Reading the Graphs

Axes

  • X-axis (frames): Total environment steps. With num_envs=4, 1M frames = 250K steps per env.

  • Y-axis (returns): Episode reward. Higher is better.
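The frames-to-steps conversion above is simple arithmetic; a hypothetical helper (not part of SLM Lab) makes it explicit:

```python
def steps_per_env(total_frames, num_envs):
    """Convert total environment frames (the x-axis) to per-env steps.
    Each vectorized step advances num_envs environments at once."""
    return total_frames // num_envs

# With num_envs=4, 1M frames corresponds to 250K steps per environment.
```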

Error Bands

Trial graphs show shaded regions representing ±1 standard deviation across sessions. Narrower bands mean a more consistent algorithm.
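A minimal sketch of how such a band can be computed from aligned per-session return curves (illustrative names; this uses the sample standard deviation, which may differ from the lab's exact estimator):

```python
from statistics import mean, stdev

def trial_band(session_returns):
    """session_returns: one return curve per session, aligned by checkpoint.
    Returns (mean, lower, upper) curves for the shaded ±1 std band.
    Needs at least 2 sessions for stdev to be defined."""
    mids, lows, highs = [], [], []
    for returns_at_ckpt in zip(*session_returns):
        m = mean(returns_at_ckpt)
        s = stdev(returns_at_ckpt)  # spread across sessions at this checkpoint
        mids.append(m)
        lows.append(m - s)
        highs.append(m + s)
    return mids, lows, highs
```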

What to Look For

| Pattern | Meaning |
| --- | --- |
| Steady upward trend | Algorithm is learning |
| Plateau | May need more training, or the environment's reward ceiling has been reached |
| High variance | Consider more sessions or longer training |
| Collapse (sudden drop) | Possible policy collapse; use the "best" checkpoint |

Experiment Analysis

Multi-Trial Graph

When running experiments (hyperparameter search), the multi-trial graph overlays all trials:

Each color represents a different hyperparameter configuration. This quickly shows which settings work best.

Experiment Variable Graph

This graph plots final performance vs. hyperparameter values:

  • X-axis: Hyperparameter value (e.g., lambda)

  • Y-axis: Performance metric (e.g., strength)

  • Color: Overall trial quality (darker = better)

This reveals relationships between hyperparameters and performance, which is useful for understanding algorithm sensitivity.
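For instance, picking the best-performing hyperparameter value out of a set of trial results is what this graph lets you do visually; programmatically it is a one-liner (hypothetical helper, not part of SLM Lab):

```python
def best_setting(results):
    """results: list of (hyperparam_value, performance) pairs, one per trial.
    Returns the pair with the highest performance metric."""
    return max(results, key=lambda r: r[1])
```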

Experiment DataFrame

The experiment data is also saved as experiment_df.csv:

Key features:

  • Sorted best-first: Top row is the best configuration

  • All hyperparameters: Shows what values were tried

  • All metrics: Strength, efficiency, stability, consistency
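Because the CSV is sorted best-first, reading the top configuration needs only the standard library. A sketch (the column names in the comment are placeholders; use the actual headers from your experiment_df.csv):

```python
import csv

def load_experiment_df(path="experiment_df.csv"):
    """Read the experiment summary CSV as a list of dicts.
    Rows are sorted best-first, so index 0 is the top configuration."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

# top = load_experiment_df()[0]
# top then maps each header (hyperparameters and metrics) to its value.
```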

Graph File Locations

After a run, find graphs in data/{spec_name}_{timestamp}/:

Note: Trial-level graphs are in the root folder; session-level graphs are in graph/.
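A sketch for collecting both sets of graphs programmatically, assuming only the layout described above (PNG trial graphs in the run root, session graphs under graph/); the helper name is illustrative, not SLM Lab API:

```python
from pathlib import Path

def find_graphs(data_dir):
    """Return (trial_graphs, session_graphs) for a run directory,
    based on the root-vs-graph/ layout described above."""
    root = Path(data_dir)
    trial_graphs = sorted(root.glob("*.png"))            # root folder
    session_graphs = sorted((root / "graph").glob("*.png"))  # graph/ subfolder
    return trial_graphs, session_graphs
```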

Interactive Graphs

SLM Lab generates both PNG (static) and HTML (interactive) graphs using Plotly. The HTML versions support:

  • Zoom: Click and drag to zoom into regions

  • Hover: See exact values at any point

  • Pan: Shift+drag to move around

  • Reset: Double-click to reset view

Open the HTML files in any browser for interactive exploration.

Advanced: Additional Graphs

The graph/ folder also contains training diagnostic graphs:

| Graph | What It Shows |
| --- | --- |
| `*_session_graph_*_loss_vs_frame.png` | Training loss over time |
| `*_session_graph_*_entropy_vs_frame.png` | Policy entropy (exploration) |
| `*_session_graph_*_explore_var_vs_frame.png` | Exploration parameter (if applicable) |

These are useful for debugging training issues.
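For reference, the policy entropy plotted in the entropy graph is the Shannon entropy of the agent's action distribution; a minimal stdlib version for a discrete policy:

```python
from math import log

def policy_entropy(probs):
    """Shannon entropy of a discrete action distribution, in nats.
    High entropy = more exploration; the entropy-vs-frame curve typically
    decays as the policy becomes more deterministic."""
    return -sum(p * log(p) for p in probs if p > 0)
```

A uniform policy over 4 actions has the maximum entropy log(4) ≈ 1.386 nats; a deterministic policy has entropy 0.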

Regenerating Graphs

To regenerate graphs with updated styling:

This recomputes all derived data and graphs without re-running training.
