Permission denied when running bin/setup

This means you don't have sufficient privilege on your machine. Run it with sudo:

sudo ./bin/setup
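
Before reaching for sudo, it may be worth checking whether the script is simply missing its execute bit, since that produces the same error. The following is a minimal sketch, assuming a POSIX shell. The helper name ensure_executable is hypothetical and not part of SLM Lab:

```shell
# Hypothetical helper (not part of SLM Lab): add the execute bit to a
# script if it is missing, and report an error if the file does not exist.
ensure_executable() {
  script="$1"
  if [ ! -e "$script" ]; then
    echo "not found: $script" >&2
    return 1
  fi
  if [ ! -x "$script" ]; then
    chmod +x "$script"
  fi
}

# Usage, from the repository root:
#   ensure_executable bin/setup && ./bin/setup
```

If the error persists after this, the script most likely needs elevated privileges, and the sudo command above applies.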

conda activate lab fails

Conda may complain that your shell is not configured for conda activate and that an old entry should be removed from your PATH:

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'. If your shell is Bash or a Bourne variant, enable conda for the current user with

$ echo ". /home/ubuntu/miniconda3/etc/profile.d/" >> ~/.bashrc

or, for all users, enable conda with

$ sudo ln -s /home/ubuntu/miniconda3/etc/profile.d/ /etc/profile.d/

The options above will permanently enable the 'conda' command, but they do NOT put conda's base (root) environment on PATH. To do so, run

$ conda activate

in your terminal, or to put the base environment on PATH permanently, run

$ echo "conda activate" >> ~/.bashrc

Previous to conda 4.4, the recommended way to activate conda was to modify PATH in your ~/.bashrc file. You should manually remove the line that looks like

export PATH="/home/ubuntu/miniconda3/bin:$PATH"

^^^ The above line should NO LONGER be in your ~/.bashrc file! ^^^

To fix it, do the first thing it recommends and refresh your terminal session:

echo ". /home/ubuntu/miniconda3/etc/profile.d/" >> ~/.bashrc
source ~/.bashrc
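
The message also asks you to remove the legacy PATH export from ~/.bashrc. The following is a minimal sketch for locating such a line, assuming the Miniconda install lives in a directory named miniconda3 as in the message above. The helper name find_legacy_conda_path is hypothetical:

```shell
# Hypothetical helper: print, with line numbers, any line in a shell rc file
# that still prepends miniconda3/bin to PATH (the pre-4.4 activation style).
# Exits non-zero when no such line is found.
find_legacy_conda_path() {
  rc_file="${1:-$HOME/.bashrc}"
  grep -n 'export PATH=.*miniconda3/bin' "$rc_file"
}

# Usage:
#   find_legacy_conda_path ~/.bashrc
```

Any line it reports can then be deleted by hand in an editor before you source ~/.bashrc again.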

Google Colab / Jupyter setup

For users of Google Colab or Jupyter, simply use the Conda environment lab set up by the SLM Lab installation as the kernel. SLM Lab setup installs Conda into the home directory ~/miniconda3. Note that in each notebook cell a bash command runs in an entirely new session, so we have to expose the lab Conda environment directly and run the Python command. Furthermore, notebooks have no GUI, so the lab has to be run headless. The following is an example for running the quickstart:

# since each shell is a new bash session, this sources the Conda environment directly
export PATH=~/miniconda3/envs/lab/bin:$PATH
# and we run it in headless mode (Colab has no GUI)
# NOTE since each cell evaluates as a session,
# the logs will only be printed in the cell output when the command is finished,
# i.e. logs don't stream in here, so wait a few minutes to see the output
xvfb-run -a python slm_lab/spec/demo.json dqn_cartpole dev

An example Colab notebook is available here. Credit to @piosif97 for the initiative and discussion that led to it.

GLIBCXX_3.4.21 version errors due to gcc, g++, libstdc++

You may encounter libgcc errors like:

ImportError: /home/deploy/miniconda3/envs/lab/lib/python3.6/site-packages/torch/../../.././ version `GLIBCXX_3.4.21' not found (required by /home/deploy/miniconda3/envs/lab/lib/python3.6/site-packages/ray/pyarrow_files/pyarrow/

Try installing libgcc in Conda:

conda install libgcc

NVIDIA GPU driver problem

If you receive errors similar to the following when trying to use GPU:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver

Reinstall your NVIDIA GPU driver by following these instructions.

Building and setting up a Linux GPU server

If you build your own desktop and want a quick and smooth setup for an Ubuntu GPU server, refer to this gist.

Breakage from SLM-Lab update

Make sure you also install the packages after updating the repo. Run:

git pull

Search is running slow

In certain setups, the search mode's parallel processing may run slower because of a race condition in PyTorch's greedy CPU utilization. The symptom is a logged fps (frames per second) that is much lower in search than when simply training a trial, e.g. fps 10 in search vs 200 in training.

This issue is documented here:

To fix it, prepend OMP_NUM_THREADS=1 to the run command. For example:

OMP_NUM_THREADS=1 python slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_cartpole search
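
Prefixing the variable applies it to that single command only. The following is a minimal sketch of the difference between the prefix form and exporting the variable for the whole session. It relies only on standard shell behavior and uses sh -c as a stand-in for the lab command:

```shell
# Prefix form: OMP_NUM_THREADS is set only in the environment of this one command.
OMP_NUM_THREADS=1 sh -c 'echo "threads for this command: $OMP_NUM_THREADS"'

# Export form: every later command started from this shell inherits the value.
export OMP_NUM_THREADS=1
sh -c 'echo "threads for the session: $OMP_NUM_THREADS"'
```

The export form is convenient when running several searches in a row, but remember to unset it before ordinary training runs if you want PyTorch to use multiple threads again.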

JSON parsing issue in spec

Newer dependencies of SLM Lab may cause issues when parsing JSON spec files. SLM Lab uses a looser JSON syntax that allows a trailing comma after the last element of a list or object. If you encounter a JSON parsing issue, edit the spec file to remove these trailing commas.
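
To find out whether a spec file still contains such commas, one option is to run it through a strict JSON parser. The following is a minimal sketch, assuming python3 is on your PATH. It uses Python's standard json.tool module, and the helper name is_strict_json is hypothetical:

```shell
# Hypothetical helper: succeed only if the file parses as strict JSON.
# Python's built-in json module rejects trailing commas, and json.tool
# prints the parser's error message (with line and column) to stderr.
is_strict_json() {
  python3 -m json.tool "$1" > /dev/null
}

# Usage:
#   is_strict_json slm_lab/spec/demo.json && echo "spec is strict JSON"
```

When the check fails, the reported line and column point to the first place the strict parser gave up, which is usually right after an extra comma.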

Vizdoom installation fails or not found

Manually install it:

conda activate lab
sudo pip install vizdoom

How to kill stuck processes?

You can see the running processes using tools like glances. Use the following commands to kill processes by their names. You may need to use sudo.

pkill -f run_lab
pkill -f slm-env
pkill -f ipykernel
pkill -f ray
pkill -f orca
pkill -f Xvfb
ps aux | grep -i Unity | awk '{print $2}' | xargs sudo kill -9
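
Because pkill -f matches against the full command line, a short pattern such as ray can also match unrelated processes. The following is a minimal sketch for previewing what would be killed, assuming the procps version of pgrep found on most Linux distributions. The function name list_lab_processes is hypothetical:

```shell
# Hypothetical helper: print the PID and full command line of every process
# that the pkill patterns above would match, without killing anything.
list_lab_processes() {
  for name in run_lab slm-env ipykernel ray orca Xvfb; do
    pgrep -af "$name" || true   # pgrep exits non-zero when nothing matches
  done
}

list_lab_processes
```

Review the list first, then run only the pkill lines whose matches you actually want to terminate.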

No GUI or images saved on a headless remote server

When running SLM Lab on a remote server, you may get NoSuchDisplayException: Cannot connect to "None", or your graphs may not be generated. This is because servers are typically headless, i.e. without a display, and the error occurs when you try to render with no display available.

First, try setting the environment variable RENDER=false before the lab command, for example:

RENDER=false python slm_lab/spec/demo.json dqn_cartpole train

Despite its simplicity, this option comes with the caveat that plots from Plotly cannot be generated. The safer option is to install Xvfb and prepend your command with xvfb-run -a. For example:

xvfb-run -a python slm_lab/spec/demo.json dqn_cartpole train
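
If the same command has to work both on a desktop and on a headless server, the prefix can be chosen at run time. The following is a minimal sketch, assuming that an empty or unset DISPLAY variable means no display is available. The helper name display_prefix is hypothetical:

```shell
# Hypothetical helper: print "xvfb-run -a" when no X display is configured,
# and print nothing when DISPLAY is set.
display_prefix() {
  if [ -z "${DISPLAY:-}" ]; then
    echo "xvfb-run -a"
  fi
}

# Usage: put $(display_prefix), unquoted, in front of the lab command so that
# it expands to either "xvfb-run -a" or nothing.
```

The substitution must stay unquoted so that an empty result disappears from the command line instead of becoming an empty argument.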

How to forward GUI from a remote server?

If you are running via ssh and want GUI forwarding from a server, do:

How to sync data from a remote server?

SLM Lab produces a lot of data, which is zipped to make transferring and syncing convenient. We use Dropbox to upload these zip files. Follow these instructions to install the Dropbox CLI.

What is SLM?

SLM stands for Strange Loop Machine, in homage to Hofstadter’s iconic book Gödel, Escher, Bach: An Eternal Golden Braid. This lab was created as part of a long-term project to try out AI ideas heavily influenced by it.

Reporting Issues

Can't find the issue you encountered? Report new issues on GitHub; it helps all of us.
